Author Archives: david

We are borg

So those of you who follow me on Twitter / are friends on Facebook already know this by now, but I figured I’d make an announcement here for the remaining 10 of you who read this via RSS (i.e. the correct way). I’d also like to talk about some of my reasoning, and some of the implications.

In June I will be joining Google (specifically the Zurich branch) to work on Knowledge Graph.

This move has not been universally popular. There are some things that Google does that have failed to endear themselves to a number of people I know (some of these I agree with. e.g. I’m definitely not a fan of the real names policy).

But… you know, they also make really good software. I don’t really acknowledge the concept of “more good than harm”, but Google do a lot of good, and I can’t help but see improving the quality of access to information for billions of people as both unambiguously good and more useful than any software I’ve worked on to date. So I’m pretty excited about that.

There is however one thing that I am legitimately quite concerned about in joining Google though: My primary experience of people joining Google is when blogs I read get a blog post saying “I’m joining Google, but don’t worry: I won’t fall into a black hole like everyone else who joins Google. I’ll definitely keep blogging” and then maybe they write one or two blog posts shortly after that and the next one after that is the one several years later where they announce that they’re leaving Google to move onto other things.

Well, I’m joining Google, but don’t worry: I won’t fall into a black hole like everyone else who joins Google. I’ll definitely keep blogging.

A colleague (I forget which one) said the other day that he wasn’t worried because he was pretty sure no power on earth could stop me from blogging. I’m not quite so confident. There have been some pretty long periods (I think the longest was 6 months?) in the past where I’ve not blogged at all, and it wouldn’t be surprising if I had another one.

I’d quite like that not to happen, but I’m not under any impression that I’m in some way special. Lots of other people who wanted to keep blogging also stopped.

One way in which I’m a bit special is that most of those blogs were purely technical, and I know that part of what stops Googlers from blogging is that it’s difficult to blog about technical things when you’re immersed in the Google ecosystem and can’t share the details without extensive clearing from the legal department. I on the other hand blog about plenty of other things – maths, feminism, fiction, voting, etc. As far as I know it should still be fine to keep blogging about all of those.

But I don’t really feel confident that that’s enough. I still haven’t entirely convinced myself that beeminder is useful (I’ve been using it to keep me reading books, but I’m not sure how much that’s helping vs just intention), but I figure I might as well give it a try. Starting beginning of May I’m going to set up a beeminder requiring me to write at least a blog post every two weeks (my normal blogging rate is more like one a week, but I figure I should give myself some slack. If I end up vastly exceeding this I may raise the rate. If this turns out to be intractable due to reasons, I may lower the rate to one a month, but I don’t think I’ll have to do that. Worst case scenario you’ll get a whole bunch more book reviews, half-baked fiction and a few “So, Switzerland. What’s up with that?” posts.

This entry was posted in life on by .

Locally compact Hausdorff spaces and the Riesz Representation Theorem

So I’m trying to get my head around the proof of the Riesz Representation Theorem (I’ve mostly got it, I’m just trying to. As part of doing this I was trying to figure out the role of the assumption that the space was a locally compact Hausdorff space: The proof generally seems to follow through with just normality (and maybe paracompactness?).

(Note: Asking questions like this is the mathematics equivalent of my asking small questions approach to learning)

Eventually I realised where it was hiding. It’s not actually in the existence part of the proof, it’s in the uniqueness: If the space is not locally compact then you can’t cover enough points with functions of compact support and thus there will be a large chunk of the space that your functions just ignore and you can use whatever measure you like there.

More detailed proof: Let \(x\) be a point with no compact neighbourhood. Then every function \(f\) of compact support has \(f(x) = 0\) as otherwise the support of \(f\) would be a compact neighbourhood of \(x\). Therefore the measure which assigns a mass of 1 to \(x\) is indistinguishable from the \(0\) measure by integrating against functions of compact support. QED

This lead me to think about the structure of locally compact subsets of topological spaces. In particular I noticed the following:

Theorem: Let \(X\) be a regular topological space. Then there is a maximal open sets \(A \subseteq X\) such that \(A\) is locally compact in the subset topology.

Proof:

Let \(A\) be the set of points with a compact neighbourhood (that is there is open \(U \ni x\) with \(\overline{U}\) compact).

Then certainly every locally compact open subset of \(X\) is contained in \(A\): Let \(B\) be such a subset and let \(x \in B\). Then there exists \(x \in U \subseteq B\) with \(\overline{U} \subseteq B\) compact (because the closure is compact it doesn’t matter whether we mean closure in \(B\) or in \(X\)). Thus by definition of \(A\), \(x \in A\).

So we need only show that \(A\) is locally compact.

Suppose \(x\) in \(A\). Then because \(X\) is regular, we have open sets \(T, V\) with \(x \in T\), \(A^c \subseteq V\) and \(T \cap V = \emptyset\).

Now. Suppose \(x \in U\) with \(\overline{U}\) compact (such exists by definition of \(A\)). Then \(x \in U \cap T\). But \(\overline{U \cap T} \subseteq \overline{T}\) so is a closed subset of a compact space and thus compact. Further, \(\overline{U \cap T} \subseteq V^c \subseteq A\). Hence \(x\) has an open neighbourhood whose compact closure is contained in \(A\). Thus \(A\) is compact with the subset topology.

QED

So essentially \(A\) is the set of points you can distinguish with functions of compact support, right?

Well. Almost.

It turns out to be relatively easy to find an example where there is a function of compact support whose support is not contained in \(A\). In order to do this we just need to construct an example where \(\overline{A}\) is compact.

To do this we’ll glue together my two favourite examples of a locally compact space and a non locally compact space. Let \(X = \mathbb{N} \cup l^\infty\). In order to distinguish the zeros, let \(\tau\) be the 0 of \(l^\infty\).

We will give this the topology generated by the following basic open sets:

  1. \(\{n\}\) for \(n \in \mathbb{N}\)
  2. \(B(x, \epsilon)\) for \(x \in l^\infty\) with \(x \neq \tau\) and \(\epsilon > 0\)
  3. \([n, \infty) \cup B(\tau, \epsilon)\) for \(n \in \mathbb{N}\) and \epsilon > 0\)

where \(B(x, \epsilon)\) is the \(l^\infty\) ball.

So essentially we’re gluing together these two spaces by treating the \(0\) of \(l^\infty\) as the “point at infinity” in the one point compactification of \(\mathbb{N}\).

Then in this case \(A = \mathbb{N}\): \(\mathbb{N}\) is a locally compact open subset of \(X\) and any point \(x \not\in \mathbb{N}\) has no compact neighbourhoods (because no open subset of \(l^\infty\) has compact closure). But \(\overline{A} = \mathbb{N} \cup \{\tau\}\) which is homeomorphic to the one point compactification of \(\mathbb{N}\) and thus compact.

This then leads us to our definition of a function whose compact support is not contained in \(A\): Let \(f(n) = \frac{1}{n}\) for \(n \in \mathbb{N}\) and \(f(x) = 0\) for \(x \in l^\infty\). Then \(f\)’s support is \(\overline{A}\) which is compact, and so \(f\) has compact support.

(Note that we could have arranged for \(\overline{A}\) to be an arbitrary compactification of \(A\) using a similar construction: Take the compactification and glue a distinct copy of \(l^\infty\) to each point at infinity)

In general the set of functions of compact support on \(X\) are a subset of the set of functions which vanish at infinity on \(A\) (that is, for \(\epsilon > 0\), \(\{x : |f(x)| \geq \epsilon\}\) is compact.

Proof: Let \(\epsilon > 0\). Then \(\{x \in X : |f(x)| \geq \epsilon\} \subseteq \mathrm{supp}(f)\) so is a closed subset of a compact space and thus compact. We thus only need to show that it is a subset of \(A\) to prove the result.

But it is a subset of \(\{x \in X : |f(x)| > \frac{1}{2}\epsilon \}\) which is an open set whose closure is contained in  \(\{x \in X : |f(x)| \geq \frac{1}{2}\epsilon \}\), which is compact. Every open set with compact closure is a subset of \(A\), so  \(\{x \in X : |f(x)| \geq \epsilon\} \subseteq A\) as desired. Thus \(f|_A\) vanishes at infinity.

QED

Is the converse true? It turns out not. The following is true however:

Given \(f : A \to \mathbb{R}\) vanishing at infinity we can extend it to a continuous function \(f: X \to \mathbb{R}\) with support contained in \(\overline{A}\).

The obvious (and only possible) definition is to extend it with \(f(x) = 0\) for \(x \not\in A\). Does this work?

For \(y \not\in A\) the set \(U = \{x : |f(x)| \geq 0\}^c\) is an open set containing \(x\) such that for \(u \in U\), \(f(u) \in B(0, \epsilon)\), so \(f\) is continuous at \(x\) as desired.

QED

 

The problem is that in general there’s no reason to expect \(\overline{A}\) to be compact. Consider for example pasting \(\mathbb{N}\) and \(l^\infty\) together and not joining them together, just treating this as a disjoint union. Then \(A = \overline{A} = \mathbb{N}\) and the extension of the function does not have compact support.

So in general we have \(C_c(A) \subseteq C_c(X) \subseteq C_0(A)\), and it’s possible to have either or both of these inclusions be equalities (to get both you just choose \(X\) to be any locally compact space so that \(A = X\)). I’m not sure it’s possible to say more about it than that.

This entry was posted in Numbers are hard on by .

Paying attention to sigma-algebras

So as part of my new resolution to start reading the books on my shelves, I recently read through Probability with Martingales.

I’d be lying if I said I fully understood all the material: It’s quite dense, and my ability to read mathematics has atrophied a lot (I’m now doing a reread of Rudin to refresh my memory). But there’s one very basic point that stuck out as genuinely interesting to me.

When introducing measure theory, it’s common to treat sigma-algebras as this annoying detail you have to suffer through in order to get to the good stuff. They’re that family of sets that it’s really annoying that it isn’t the whole power set. And we would have gotten away with it, if it weren’t for that pesky axiom of choice.

In Probability with Martingales this is not the treatment they are given. The sigma algebras are a first class part of the theory: You’re not just interested in the largest sigma algebra you can get, you care quite a lot about the structure of different families of sigma algebras. In particular you are very interested in sub sigma algebras.

Why?

Well. If I may briefly read too much into the fact that elements of a sigma algebra are called measurable sets… what are we measuring them with?

It turns out that there’s a pretty natural interpretation of sub-sigma algebras in terms of measurable functions: If you have a sigma-algebra \(\mathcal{G}\) on \(X\) and a family of measurable functions \(\{f_\alpha : X \to Y_\alpha : \alpha \in A \}\) then you can look at the the smallest sigma-algebra \(\sigma(f_\alpha) \subseteq \mathcal{G}\) for which all these functions are still measurable. This is essentially the measurable sets which we can observe by only asking questions about these functions.

It turns out that every sub sigma algebra can be realised this way, but the proof is disappointing: Given \(\mathcal{F} \subseteq \mathcal{G}\) you just consider the identify function \(\iota: (X, \mathcal{F}) \to (X, \mathcal{G})\) and \(\mathcal{G}\) is the sigma-algebra generated by this function.

One interesting special case of this is sequential random processes. Suppose we have a set of random variables \(X_1, \ldots, X_n, \ldots\) (not necessarily independent, identically distributed, or even taking values in the same set). Our underlying space then captures an entire infinite chain of random variables stretching into the future. But we are finite beings and can only actually look at what has  happened so far. This then gives us a nested sequence of sigma algebras \(\mathcal{F_1} \subseteq \ldots \subseteq \mathcal{F_n} \subseteq \ldots \) where \(\mathcal{F_n} = \sigma(X_1, \ldots, X_n)\) is the collection of things we  we can measure at time n.

One of the reasons this is interesting is that a lot of things we would naturally pose in terms of random variables can instead be posed in terms of sigma-algebras. This tends to very naturally erase any difference between single random variables and families of random variables. e.g. you can talk about independence of sigma algebras (\(\mathcal{G}\) and \(\mathcal{H}\) are independent iff for \(\mu(G \cap H) = \mu(G) \mu(H)\) for \(G \in \mathcal{G}, H \in \mathcal{H}\)) and two families of random variables are independent if and only if the generated sigma algebras are independent.

A more abstract reason it’s interesting is that it’s quite nice to see the sigma-algebras play a front and center role as opposed to this annoyance we want to forget about. I think it makes the theory richer and more coherent to do it this way.

This entry was posted in Numbers are hard on by .

I think I might have a book problem

I was talking to my colleague Daoud about the books I’m reading in my current attempts to understand statistics in a more coherent fashion (my  problem isn’t that I don’t understand a reasonable amount of statistics, it’s that I don’t have an overall framework in which I understand statistics, so my knowledge is very patchy) and happened to look up at my bookshelves.

Where I spotted an entire book on statistical inference that I have literally no recollection of buying.

It looks pretty good, too. I haven’t actually read much of it just now (I probably have in the past), but it seems decent based on a skim through.

But it drove home a thing I’m realising recently: how little of the knowledge that is contained on my bookshelves I actually know.

Looking around the shelves there are a lot of books there I haven’t really read. They fall into a bunch of categories:

For some of them, this is legit – there are a lot which I’ve partially read before abandoning that path of my life (I have a lot of books on functional analysis, set theory, etc which I’m still interested in in the abstract and can’t bring myself to get rid of but honestly I will never study again).

Some of them I’ve picked up, read as much of them as I could and “Oh god this is too hard I need something simpler” and either bought a simpler book to supplement them or abandoned the subject.

Some of them are that simpler book, and I’ve instead ended up acquiring the information a different way and they are now too basic for me.

Some of them I’ve bought only to discover that they’re terrible and not really worth reading.

Some of them are reference books where the concept of having “read” them doesn’t really apply.

A lot of them though? I think what has happened is a chain of thought that goes “I wish to learn about X” *buys book about X* “Yay. I have learned about X” (In one case this is literally true. I have an unread book about Xlib up there. Fortunately I also no longer care to learn about Xlib).

There’s an idea that’s common in some fitness communities: You shouldn’t pre-announce your fitness goals, because it gives you much the same psychological rewards as actually achieving those goals, and thus makes you less likely to put in the effort to really achieve them. I don’t know if this is true – I don’t really know enough about the psychology of reward mechanisms to say (Ooh. I should buy a book on that. Wait. No. Bad David) – but it has plausibility, and I think something like that might be happening here: Surrounding myself with books about a subject in some way satisfies my desire to learn about that subject even though no learning actually takes place.

A little of this is probably healthy and normal. But there’s a lot of this going on with my shelves. I suspect that there’s a year of reading (spare time reading rather than solid) even if I only count the books that I actually care about still learning.

So I’m going to do two things to try and change this.

OK, three things, because I’m aware that I’m pre-announcing my goals right after I said that pre-announcing your goals is a great way to not achieve them. But the third thing is “achieve my goals despite having pre-announced them through a careful application of regularly reminding myself I haven’t achieved this goal”.

The first thing: Always have a physical non-fiction book on me when leaving the house. I can’t guarantee that this will cause me to read it, but I can guarantee that time when I don’t have a book with me are times that I’m not going to be reading a book.

The second thing: I was saying the other day that I didn’t really have any mid-range goals suitable for using beeminder. All the habits I’m trying to form seem to be either not worth the additional monetary stress or ones I’m already able to achieve on my own. Well, now I’m wrong, so I no longer have the excuse to not try it, so I’m trying it. I’ve committed to reading a non-fiction book every four weeks (I’ve counted my recent read of Mathematical methods in the theory of queuing to get me started). A book every four weeks should be easy. I normally read 3-4 books per week. Granted those are fiction, where my reading rate is absurd compared to my reading rate for non-fiction, but even so. If anything I’m hoping that it will be a pessimistically low rate and I’ll be able to raise it later,

Both of these will bias me towards shorter books (the former because longer books are heavy. The latter because I can’t necessarily finish a longer book in a month). If this proves to be a problem I’ll redefine my beeminder goal in terms of a “typical” book size and count book carrying as an exercise goal. Really though, I don’t have a problem with being biased towards shorter books for now: I like short books, and I have plenty of interesting ones to keep me going for now.

I’ve no idea if these will be sufficient to sort out this problem, but hopefully they’ll be a good start.

This entry was posted in Uncategorized on by .

Write libraries, not services

I was at Scale Summit yesterday and jonty successfully trolled me into giving a lightning talk. For once I managed to do this without the crippling nervousness I usually get from public speaking (turns out the secret is being slightly drunk), so I actually managed to be reasonably coherent and people seemed to enjoy it. Woo.

Anyway, this post is a tidied up version of the talk I gave:

A thing that people seemed to want to talk about at Scale Summit was “micro-services” (I have no idea what the hell they are in comparison to just normal services, but whatever) and writing service oriented architectures.

I’m here to make a request: can you not?

Services can absolutely be good things. There are a lot of cases where they are exactly what you want, but I’d argue that you shouldn’t write a service to do anything that your system is not already doing.

Instead what happens is that people starting out a project go “We’re going to need to scale. I’ve heard services are the way to scale. Lets write services!” and you end up with a service oriented architecture very early on and everyone suffers for it.

I’ve seen this at my last two companies. The first time the service oriented architecture was so bad that fortunately we had no choice but to fix it, but at my current place we still seem to be hell bent on pretending it’s a good idea so I’ve not much luck in doing so.

The problem is that when you start out on a project (whether a new product or a brand new company) you have no idea what you’re doing and you need to figure that out. This isn’t a problem, it’s entirely normal. You do some product development, write some code, and over time you start to figure it out.

Thing is, you need to change things a lot while you’re doing this, and services get in the way of that. A service boundary is one which is difficult to refactor across, and as a result if you discover you’ve cut your code-base across the wrong axis and that cut is a service you’re probably stuck with it. At the very least, fixing it becomes much harder than you want it to be and you’ll end up putting up with it for much longer than is optimal.

Similarly, it’s very hard to simultaneously make changes to both ends of a service because they’re deployed separately. The result is that you end up building up layers of backwards compatibility into the service, even when you’re in 100% control of every line of code involved in interacting with it, and the library (or libraries) you wrap around the service to talk to it tends to accumulate a lot of cruft to deal with various versions or the not-quite-right adaptations you’ve ended up making over time to cope with this inflexibility. This often makes the libraries for calling the services really awkward and unpleasant to work with.

All this is of course not even counting the extra code you had to write and extra operations overhead of having all these services.

Fortunately, it turns out that there’s an easy solution to all these problems: Don’t do it.

Your language has a perfectly good mechanism for calling code you’ve written: A function. It has a perfectly good way of grouping functions together: A module (or, if you insist, an object). You can just use those. Take the code you were going to write as a service and just call it directly. It’ll be fine. Trust me. Once the interface you’re calling has stabilised and you decide that you really want a service after all, you can just take that code and turn it into one. In the meantime, you can develop happily against it, change it if you need to, and generally have a much less stressful life than if you’d tried to build services from the get go.

This entry was posted in How do I computer? on by .