Paying attention to sigma-algebras

So as part of my new resolution to start reading the books on my shelves, I recently read through Probability with Martingales.

I’d be lying if I said I fully understood all the material: It’s quite dense, and my ability to read mathematics has atrophied a lot (I’m now doing a reread of Rudin to refresh my memory). But there’s one very basic point that stuck out as genuinely interesting to me.

When introducing measure theory, it’s common to treat sigma-algebras as this annoying detail you have to suffer through in order to get to the good stuff. They’re that family of sets which, annoyingly, isn’t the whole power set. And we would have gotten away with it, if it weren’t for that pesky axiom of choice.

In Probability with Martingales this is not the treatment they are given. The sigma algebras are a first class part of the theory: You’re not just interested in the largest sigma algebra you can get, you care quite a lot about the structure of different families of sigma algebras. In particular you are very interested in sub sigma algebras.

Why?

Well. If I may briefly read too much into the fact that elements of a sigma algebra are called measurable sets… what are we measuring them with?

It turns out that there’s a pretty natural interpretation of sub-sigma algebras in terms of measurable functions: If you have a sigma-algebra \(\mathcal{G}\) on \(X\) and a family of measurable functions \(\{f_\alpha : X \to Y_\alpha : \alpha \in A \}\) then you can look at the smallest sigma-algebra \(\sigma(f_\alpha) \subseteq \mathcal{G}\) for which all these functions are still measurable. These are essentially the measurable sets which we can observe by only asking questions about these functions.
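On a finite space this can be computed directly. The following is a minimal sketch, assuming a finite \(X\) and that every subset of each function's values is measurable; the name `generated_sigma_algebra` and the coin-toss example are made up for illustration.

```python
from itertools import combinations

def generated_sigma_algebra(space, *functions):
    """Return sigma(f_1, ..., f_k) on a finite space as a set of frozensets."""
    # Points on which all the functions agree cannot be told apart, so they
    # land in the same block (atom) of the generated sigma-algebra.
    blocks = {}
    for x in space:
        key = tuple(f(x) for f in functions)
        blocks.setdefault(key, set()).add(x)
    atoms = [frozenset(b) for b in blocks.values()]
    # On a finite space the sigma-algebra is every union of atoms
    # (the empty union gives the empty set).
    events = set()
    for r in range(len(atoms) + 1):
        for chosen in combinations(atoms, r):
            events.add(frozenset().union(*chosen))
    return events

# Two coin tosses; each function only "sees" one of them.
space = ["HH", "HT", "TH", "TT"]
first = lambda w: w[0]
second = lambda w: w[1]

only_first = generated_sigma_algebra(space, first)
both = generated_sigma_algebra(space, first, second)
```

Here `only_first` contains just the four events you can decide by looking at the first toss, while `both` is the whole power set of `space`.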

It turns out that every sub sigma algebra can be realised this way, but the proof is disappointing: Given \(\mathcal{F} \subseteq \mathcal{G}\) you just consider the identity function \(\iota: (X, \mathcal{G}) \to (X, \mathcal{F})\), which is measurable precisely because \(\mathcal{F} \subseteq \mathcal{G}\), and \(\mathcal{F}\) is the sigma-algebra generated by this function.

One interesting special case of this is sequential random processes. Suppose we have a sequence of random variables \(X_1, \ldots, X_n, \ldots\) (not necessarily independent, identically distributed, or even taking values in the same set). Our underlying space then captures an entire infinite chain of random variables stretching into the future. But we are finite beings and can only actually look at what has happened so far. This then gives us a nested sequence of sigma algebras \(\mathcal{F}_1 \subseteq \ldots \subseteq \mathcal{F}_n \subseteq \ldots \) where \(\mathcal{F}_n = \sigma(X_1, \ldots, X_n)\) is the collection of things we can measure at time \(n\).
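A toy version of this nesting, as a sketch only: it assumes the process is cut off after three coin tosses so that everything is finite, and it represents each sigma algebra by its atoms (the smallest non-empty events in it). On a finite space one sigma algebra contains another exactly when its atoms refine the other's. The function names are hypothetical.

```python
from itertools import product

def atoms(space, n):
    """Atoms of sigma(X_1, ..., X_n): outcomes grouped by their first n tosses."""
    groups = {}
    for omega in space:
        groups.setdefault(omega[:n], set()).add(omega)
    return [frozenset(g) for g in groups.values()]

def refines(finer, coarser):
    """True if every atom of `finer` sits inside some atom of `coarser`."""
    return all(any(a <= b for b in coarser) for a in finer)

# All outcomes of three coin tosses, e.g. "HTH".
space = ["".join(t) for t in product("HT", repeat=3)]

# filtration[0] describes what is knowable at time 1, and so on.
filtration = [atoms(space, n) for n in range(1, 4)]
nested = all(refines(filtration[i + 1], filtration[i]) for i in range(2))
```

Each extra observation splits every atom in two, so the partitions have 2, 4 and 8 blocks, and `nested` comes out true.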

One of the reasons this is interesting is that a lot of things we would naturally pose in terms of random variables can instead be posed in terms of sigma-algebras. This tends to very naturally erase any difference between single random variables and families of random variables. e.g. you can talk about independence of sigma algebras (\(\mathcal{G}\) and \(\mathcal{H}\) are independent iff \(\mu(G \cap H) = \mu(G) \mu(H)\) for all \(G \in \mathcal{G}, H \in \mathcal{H}\)) and two families of random variables are independent if and only if the generated sigma algebras are independent.
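To make the independence condition concrete, here is a small sketch under some loud assumptions: a finite space with the uniform measure, and sigma algebras generated by a single function each. In that setting it is enough to check the product rule on atoms, since every event is a disjoint union of atoms. The helper names are invented for the example.

```python
from fractions import Fraction

def atoms(space, f):
    """Atoms of sigma(f): outcomes grouped by the value f takes on them."""
    groups = {}
    for omega in space:
        groups.setdefault(f(omega), set()).add(omega)
    return [frozenset(g) for g in groups.values()]

def independent(space, f, g):
    """Check mu(A & B) == mu(A) * mu(B) on atoms, with mu uniform on space."""
    n = len(space)
    prob = lambda event: Fraction(len(event), n)  # exact arithmetic
    return all(
        prob(a & b) == prob(a) * prob(b)
        for a in atoms(space, f)
        for b in atoms(space, g)
    )

space = ["HH", "HT", "TH", "TT"]  # two fair coin tosses
first = lambda w: w[0]
second = lambda w: w[1]
heads = lambda w: w.count("H")

tosses_independent = independent(space, first, second)
first_vs_heads = independent(space, first, heads)
```

The two tosses generate independent sigma algebras, but the first toss and the total number of heads do not: knowing the first toss was heads changes the odds of seeing two heads.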

A more abstract reason it’s interesting is that it’s quite nice to see the sigma-algebras play a front and center role as opposed to this annoyance we want to forget about. I think it makes the theory richer and more coherent to do it this way.

This entry was posted in Numbers are hard.

I think I might have a book problem

I was talking to my colleague Daoud about the books I’m reading in my current attempts to understand statistics in a more coherent fashion (my problem isn’t that I don’t understand a reasonable amount of statistics, it’s that I don’t have an overall framework in which I understand statistics, so my knowledge is very patchy) and happened to look up at my bookshelves.

Where I spotted an entire book on statistical inference that I have literally no recollection of buying.

It looks pretty good, too. I haven’t actually read much of it just now (I probably have in the past), but it seems decent based on a skim through.

But it drove home a thing I’m realising recently: how little of the knowledge that is contained on my bookshelves I actually know.

Looking around the shelves there are a lot of books there I haven’t really read. They fall into a bunch of categories:

For some of them, this is legit – there are a lot which I’ve partially read before abandoning that path of my life (I have a lot of books on functional analysis, set theory, etc which I’m still interested in in the abstract and can’t bring myself to get rid of but honestly I will never study again).

Some of them I’ve picked up, read as much of them as I could, gone “Oh god this is too hard I need something simpler” and either bought a simpler book to supplement them or abandoned the subject.

Some of them are that simpler book, and I’ve instead ended up acquiring the information a different way and they are now too basic for me.

Some of them I’ve bought only to discover that they’re terrible and not really worth reading.

Some of them are reference books where the concept of having “read” them doesn’t really apply.

A lot of them though? I think what has happened is a chain of thought that goes “I wish to learn about X” *buys book about X* “Yay. I have learned about X” (In one case this is literally true. I have an unread book about Xlib up there. Fortunately I also no longer care to learn about Xlib).

There’s an idea that’s common in some fitness communities: You shouldn’t pre-announce your fitness goals, because it gives you much the same psychological rewards as actually achieving those goals, and thus makes you less likely to put in the effort to really achieve them. I don’t know if this is true – I don’t really know enough about the psychology of reward mechanisms to say (Ooh. I should buy a book on that. Wait. No. Bad David) – but it has plausibility, and I think something like that might be happening here: Surrounding myself with books about a subject in some way satisfies my desire to learn about that subject even though no learning actually takes place.

A little of this is probably healthy and normal. But there’s a lot of this going on with my shelves. I suspect that there’s a year of reading (spare time reading rather than solid) even if I only count the books that I actually care about still learning.

So I’m going to do two things to try and change this.

OK, three things, because I’m aware that I’m pre-announcing my goals right after I said that pre-announcing your goals is a great way to not achieve them. But the third thing is “achieve my goals despite having pre-announced them through a careful application of regularly reminding myself I haven’t achieved this goal”.

The first thing: Always have a physical non-fiction book on me when leaving the house. I can’t guarantee that this will cause me to read it, but I can guarantee that times when I don’t have a book with me are times that I’m not going to be reading a book.

The second thing: I was saying the other day that I didn’t really have any mid-range goals suitable for using beeminder. All the habits I’m trying to form seem to be either not worth the additional monetary stress or ones I’m already able to achieve on my own. Well, now I’m wrong, so I no longer have the excuse to not try it, so I’m trying it. I’ve committed to reading a non-fiction book every four weeks (I’ve counted my recent read of Mathematical methods in the theory of queuing to get me started). A book every four weeks should be easy. I normally read 3-4 books per week. Granted those are fiction, where my reading rate is absurd compared to my reading rate for non-fiction, but even so. If anything I’m hoping that it will be a pessimistically low rate and I’ll be able to raise it later.

Both of these will bias me towards shorter books (the former because longer books are heavy. The latter because I can’t necessarily finish a longer book in a month). If this proves to be a problem I’ll redefine my beeminder goal in terms of a “typical” book size and count book carrying as an exercise goal. Really though, I don’t have a problem with being biased towards shorter books for now: I like short books, and I have plenty of interesting ones to keep me going for now.

I’ve no idea if these will be sufficient to sort out this problem, but hopefully they’ll be a good start.

This entry was posted in Uncategorized.

Write libraries, not services

This post is a tidied up version of a lightning talk I gave at Scale Summit yesterday after jonty successfully trolled me into getting up on stage.


A thing that people seemed to want to talk about at Scale Summit was “micro-services” (I have no idea what the hell they are in comparison to just normal services, but whatever) and writing service oriented architectures.

I’m here to make a request: can you not?

Services can absolutely be good things. There are a lot of cases where they are exactly what you want, but I’d argue that you shouldn’t write a service to do anything that your system is not already doing.

Instead what happens is that people starting out a project go “We’re going to need to scale. I’ve heard services are the way to scale. Let’s write services!” and you end up with a service oriented architecture very early on and everyone suffers for it.

I’ve seen this at my last two companies. The first time the service oriented architecture was so bad that fortunately we had no choice but to fix it, but at my current place we still seem to be hell bent on pretending it’s a good idea so I’ve not had much luck in doing so.

The problem is that when you start out on a project (whether a new product or a brand new company) you have no idea what you’re doing and you need to figure that out. This isn’t a problem, it’s entirely normal. You do some product development, write some code, and over time you start to figure it out.

Thing is, you need to change things a lot while you’re doing this, and services get in the way of that. A service boundary is one which is difficult to refactor across, and as a result if you discover you’ve cut your code-base across the wrong axis and that cut is a service you’re probably stuck with it. At the very least, fixing it becomes much harder than you want it to be and you’ll end up putting up with it for much longer than is optimal.

Similarly, it’s very hard to simultaneously make changes to both ends of a service because they’re deployed separately. The result is that you end up building up layers of backwards compatibility into the service, even when you’re in 100% control of every line of code involved in interacting with it, and the library (or libraries) you wrap around the service to talk to it tends to accumulate a lot of cruft to deal with various versions or the not-quite-right adaptations you’ve ended up making over time to cope with this inflexibility. This often makes the libraries for calling the services really awkward and unpleasant to work with.

All this is of course not even counting the extra code you had to write and extra operations overhead of having all these services.

Fortunately, it turns out that there’s an easy solution to all these problems: Don’t do it.

Your language has a perfectly good mechanism for calling code you’ve written: A function. It has a perfectly good way of grouping functions together: A module (or, if you insist, an object). You can just use those. Take the code you were going to write as a service and just call it directly. It’ll be fine. Trust me. Once the interface you’re calling has stabilised and you decide that you really want a service after all, you can just take that code and turn it into one. In the meantime, you can develop happily against it, change it if you need to, and generally have a much less stressful life than if you’d tried to build services from the get go.
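As a sketch of what that looks like in practice (every name here is hypothetical and the pricing logic is just a stand-in): the real work lives in a plain function, callers use it directly, and if a service is ever wanted it becomes a thin adapter that translates requests into the same function call.

```python
import json

# The "library": an ordinary function in an ordinary module.
def price_with_tax(net_pence: int, tax_rate_percent: int) -> int:
    """Return the gross price in pence, rounding the tax down."""
    return net_pence + net_pence * tax_rate_percent // 100

# Today: callers just call it. Changing the signature is a normal refactor.
total = price_with_tax(1000, 20)

# Later, once the interface has settled: a thin adapter over the same code.
# A real service would sit behind an HTTP framework; this only shows the
# translation layer between a request body and the function call.
def handle_request(body: str) -> str:
    args = json.loads(body)
    result = price_with_tax(args["net_pence"], args["tax_rate_percent"])
    return json.dumps({"gross_pence": result})

response = handle_request('{"net_pence": 1000, "tax_rate_percent": 20}')
```

The point of the shape is that nothing in `price_with_tax` knows or cares whether it is being called in-process or from behind a network boundary, so the decision to add that boundary can wait.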

This entry was posted in How do I computer?

Chickpea and vegetable stew

This was mostly a “what can I make without going all the way outside to the store?” meal, but it ended up pretty tasty.

  • 3 medium onions
  • 4 medium potatoes
  • about 8 tiny sweet red and orange peppers
  • 2 cans chopped tomatoes
  • Lots of chickpeas (I’d guess 2-3 cans worth? I made them from dry the other day)
  • 3 large spoonfuls of dark tahini
  • about a tbsp cumin
  • half a tsp chipotle powder
  • a tbsp dried rosemary
  • plenty of olive oil
  • salt
  • balsamic vinegar
  • knorr vegetable stockpot

Cooking

It’s pressure cooker time!

Finely dice the onions, slice the peppers, and chop the potatoes into < 1cm cubes. Put in the pressure cooker with olive oil, cumin, salt and some balsamic vinegar. Cook for 10 minutes after it’s come up to pressure.

Take it off the pressure. Add the tomatoes, tahini, chipotle, chickpeas, the stock pot and a little bit of water. Put back on the pressure and cook for another 10 minutes.

The result is… quite brown. A little orange maybe, but mostly brown (it’s the tahini that does that). It is however rather tasty. The flavours infuse really nicely into the potatoes cooked this way, and the combination of different flavours works quite well together.

This entry was posted in Food.

Sketch design for a sort of housing co-operative

Prior warning: This is just a thing that I was thinking about this morning. I’ve very little knowledge of the prior art around housing cooperatives, and gather broadly similar things to the specific details I’m sketching out exist, but I’ve not done the research. It’s entirely possible that the way I’ve structured this is a terrible idea for entirely obvious reasons that I’ve simply missed.

Anyway, I was thinking this morning about the fact that in many cities, including London, mortgage payments are often cheaper than the equivalent rent. Part of this is because in many ways you’re getting more for your rent than just the home (a certain guarantee of your landlord being responsible for repairs, although some landlords are pretty terrible at it, and a much greater flexibility), and part of this is just the usual “takes money to make money” thing, where the cost of getting the mortgage in the first place is the barrier to entry that prevents most people from benefiting from this.

I was thinking about how to design a system where you could enable people to enter into a form of home ownership without that overhead of the deposit, and was thinking about housing co-operatives, and I hit on a design for an interesting financial arrangement.

Essentially the basic premise is this: An organisation which acts as the landlord and does maintenance, charges rent, etc. as normal. However, the organisation is entirely owned by its renters. All rent you pay also buys you shares in the organisation, at a share price calculated at the beginning of each month as the average of all rents on properties held by the organisation (essentially you’re giving each renter one share per month, reweighted so that people who pay more get more shares) (note: This requires shares to be infinitely divisible. It might make more sense to instead e.g. multiply this by 1000, round down, and carry over any unallocated shares into the next month in a take a penny/leave a penny style scheme).
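A rough sketch of the monthly allocation, to make the arithmetic concrete. The function name, the renters and the rent figures are all invented, and it only computes how many whole shares go unallocated after scaling by 1000 and rounding down; how the leftovers get carried into the next month is left open.

```python
import math
from fractions import Fraction

def allocate_shares(rents, scale=1000):
    """rents maps renter -> monthly rent. Returns (shares, unallocated)."""
    # Share price for the month: the average of all rents.
    share_price = Fraction(sum(rents.values()), len(rents))
    # Each renter buys rent / share_price shares, scaled up and rounded down.
    shares = {
        name: math.floor(Fraction(rent) / share_price * scale)
        for name, rent in rents.items()
    }
    # Before rounding, exactly `scale` shares per renter are issued in total.
    unallocated = scale * len(rents) - sum(shares.values())
    return shares, unallocated

rents = {"alice": 900, "bob": 1200, "carol": 1000}  # made-up figures
shares, unallocated = allocate_shares(rents)
```

With these numbers the average rent is 3100/3, so the three renters get 870, 1161 and 967 shares, and 2 of the 3000 shares issued that month are left over to carry forward.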

These shares have the normal set of purposes: They give you voting rights (possibly according to a quadratic voting scheme?), and they give you dividends. The organisation reserves a certain amount of its profits to be ploughed back into assets, allowing adding more properties to its holdings, but the rest is paid out to its shareholders on a monthly basis.

This is definitely not the same as owning property, but it has similar benefits – in particular, what you are overpaying relative to the cost of the property will mostly come back to you as dividends over the long-term (this isn’t true initially – it starts out as essentially no different from normal renting but, over time, you end up with your rent effectively going to zero and eventually negative). 

One caveat: In order to prevent this from essentially turning into an investment vehicle, there are a few rules about the ownership of shares.

  1. All shares must be held by individuals, not corporations, trusts, etc.
  2. Shares held by someone who has never rented a home from the organisation are deemed inactive. They don’t go away, but they do not receive dividends or voting rights.
  3. Shares held by someone who has not rented in the last year similarly become inactive.
  4. Individuals who have not been resident in their rented home for more than half of the last 6 months temporarily lose their voting rights (those voting rights are restored once this condition is met again).
  5. Shares are fully transferable (I expect a common pattern will be for people to sell their shares when they wish to move into accommodation not owned by the organisation). All of the above time constraints however are attached to the individual, not the shares, so the counters restart from scratch.

I haven’t thought about this in depth, but I suspect that this would be a very useful sort of organisation to have exist. Of course, it requires a certain amount of seed money, and a lot of motivation, to get started, so it will probably never happen.

This entry was posted in Uncategorized.