# I just want to brag a bit about the Hypothesis test suite

There’s not much content to this post other than “look at how awesome my test suite is”. But it’s pretty awesome. It:

• Has 100% branch coverage
• Fails the build if it ever doesn’t have 100% branch coverage
• Has about a 3:2 ratio of lines of test code to library code
• Also runs the tests against a version installed from a package. This both tests the installer and also tests without coverage in case coverage changes the semantics (which it probably won’t now but definitely will in future)
• Runs across the three major Python versions the library supports (2.7, 3.3, 3.4)
• Has statistical tests about the properties of the distributions it produces
• Runs a linter over the code to make sure that it’s pyflakes and pep8 compliant
• Runs a linter over the README to make sure that it parses correctly as RST
• Has a self test where it uses Hypothesis to test Hypothesis. This found real bugs in the library, despite the aforementioned coverage and code ratio.
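For anyone who hasn’t seen this style of testing, here’s a rough sketch of the idea behind that last point. To be clear, this is not Hypothesis’s API: it’s a hand-rolled stand-in using only the standard library, and `merge_sorted`, `find_counterexample` and friends are names I’ve made up for illustration. The point is just that you state a property and throw lots of generated inputs at it, rather than writing examples by hand.

```python
import random


def merge_sorted(a, b):
    """Merge two sorted lists into one sorted list (the code under test)."""
    out, i, j = [], 0, 0
    while i < len(a) and j < len(b):
        if a[i] <= b[j]:
            out.append(a[i])
            i += 1
        else:
            out.append(b[j])
            j += 1
    return out + a[i:] + b[j:]


def find_counterexample(prop, generate, runs=500, seed=0):
    """Return the first generated example that falsifies prop, or None."""
    rng = random.Random(seed)  # fixed seed so failures are reproducible
    for _ in range(runs):
        example = generate(rng)
        if not prop(*example):
            return example
    return None


def sorted_list_pair(rng):
    """Generate a pair of small sorted integer lists."""
    def make():
        return sorted(rng.randint(-5, 5) for _ in range(rng.randint(0, 6)))
    return make(), make()


def merge_matches_sorted(a, b):
    """Property: merging agrees with sorting the concatenation."""
    return merge_sorted(a, b) == sorted(a + b)


if __name__ == "__main__":
    print(find_counterexample(merge_matches_sorted, sorted_list_pair))
```

A real library does a great deal more than this (shrinking failing examples, smarter generation), which is rather the point of having one.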

This is probably the best tested code I’ve ever written, and I’m still morally certain it has at least one major bug in it.

I don’t know what that bug is of course, or I’d have fixed it, but every single time I’ve gone “Wow this code is really well tested. I bet it’s super solid” it has been followed by “Hmm. I could do this thing to prove it’s super solid. Oh look there’s a bug”.

OK, so this started out as me bragging about how great my test suite was but now I’m super depressed about the state of software all over again. I think of myself as being atypically good at producing correct code – not in some mythical 10x sense where I’m vastly better than everyone else, but I feel pretty comfortable in saying I’m better than average. This is probably the most well tested code I’ve ever written, and it’s still buggy. Where does that leave the majority of code found in the wild?

This entry was posted in How do I computer?, Hypothesis on by .

# Write libraries, not services

This post is a tidied up version of a lightning talk I gave at Scale Summit yesterday after jonty successfully trolled me into getting up on stage.

A thing that people seemed to want to talk about at Scale Summit was “micro-services” (I have no idea what the hell they are in comparison to just normal services, but whatever) and writing service oriented architectures.

I’m here to make a request: can you not?

Services can absolutely be good things. There are a lot of cases where they are exactly what you want, but I’d argue that you shouldn’t write a service to do anything that your system is not already doing.

Instead what happens is that people starting out a project go “We’re going to need to scale. I’ve heard services are the way to scale. Let’s write services!” and you end up with a service oriented architecture very early on and everyone suffers for it.

I’ve seen this at my last two companies. The first time the service oriented architecture was so bad that fortunately we had no choice but to fix it, but at my current place we still seem to be hell bent on pretending it’s a good idea so I’ve not had much luck in fixing it.

The problem is that when you start out on a project (whether a new product or a brand new company) you have no idea what you’re doing and you need to figure that out. This isn’t a problem, it’s entirely normal. You do some product development, write some code, and over time you start to figure it out.

Thing is, you need to change things a lot while you’re doing this, and services get in the way of that. A service boundary is one which is difficult to refactor across, and as a result if you discover you’ve cut your code-base across the wrong axis and that cut is a service you’re probably stuck with it. At the very least, fixing it becomes much harder than you want it to be and you’ll end up putting up with it for much longer than is optimal.

Similarly, it’s very hard to simultaneously make changes to both ends of a service because they’re deployed separately. The result is that you end up building up layers of backwards compatibility into the service, even when you’re in 100% control of every line of code involved in interacting with it, and the library (or libraries) you wrap around the service to talk to it tends to accumulate a lot of cruft to deal with various versions or the not-quite-right adaptations you’ve ended up making over time to cope with this inflexibility. This often makes the libraries for calling the services really awkward and unpleasant to work with.

All this is of course not even counting the extra code you had to write and extra operations overhead of having all these services.

Fortunately, it turns out that there’s an easy solution to all these problems: Don’t do it.

Your language has a perfectly good mechanism for calling code you’ve written: A function. It has a perfectly good way of grouping functions together: A module (or, if you insist, an object). You can just use those. Take the code you were going to write as a service and just call it directly. It’ll be fine. Trust me. Once the interface you’re calling has stabilised and you decide that you really want a service after all, you can just take that code and turn it into one. In the meantime, you can develop happily against it, change it if you need to, and generally have a much less stressful life than if you’d tried to build services from the get go.
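To make that concrete, here’s a minimal sketch in Python. Everything in it is hypothetical (`quote_price`, `checkout` and `handle_quote_request` are names invented for this example, and the pricing logic is a placeholder); it’s only meant to show the shape of “write the function first, wrap it later”.

```python
import json


# The "library": an ordinary function in an ordinary module.
def quote_price(basket, discount_rate=0.0):
    """Return the total for a basket of (unit_price, quantity) pairs."""
    subtotal = sum(price * qty for price, qty in basket)
    return round(subtotal * (1 - discount_rate), 2)


# Today: callers import the module and call the function directly.
# Changing the signature is a normal refactoring across one code-base.
def checkout(basket):
    return {"total": quote_price(basket, discount_rate=0.1)}


# Later, once the interface has stabilised: a thin adapter turns the same
# function into something a server can expose. Only this adapter knows
# about serialisation; quote_price itself does not change.
def handle_quote_request(body):
    request = json.loads(body)
    total = quote_price(
        [tuple(item) for item in request["basket"]],
        discount_rate=request.get("discount_rate", 0.0),
    )
    return json.dumps({"total": total})


if __name__ == "__main__":
    print(checkout([(10.0, 2), (5.0, 1)]))
```

The adapter at the bottom is the only part you’d be adding on the day you decide you really do want a service, and until then nothing stops you from reshaping `quote_price` as often as you need to.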


# A personal history of interest in programming

Apparently some guy steeped in silicon valley startup culture has made some ignorant comment about how everyone who is good at programming started from a young age. This is my surprised face.

If you don’t know who I’m talking about, don’t worry about it. He’s not important. This is a pretty common narrative though, and I thought it might be worth holding up my personal journey to being interested in programming as a piece of anecdata against it.

I’m basically the kind of obsessive programmer these criteria are supposed to select for. I’ve spent my Christmas holidays working on persistent functional data-structures in C, and I spend a huge amount of time and thought on the practice and theory of programming. Also I’ve got all the privilege such criteria implicitly select for – I’m white, cis male, and painfully middle class.

You don’t need any of these things to be a good programmer. I mean, you sure as hell don’t need the pile o’ privilege I’ve got, but you also don’t need the obsession. It can be helpful, but it can also be harmful – you put a group of obsessive people together and often what you end up with is hilarious tunnel vision which loses sight of the real problem. I mention these things not because I think they’re important, but to demonstrate that even if you buy into the dominant narrative I am a counter-example. I look and act like a “real” programmer.

Want to know when I got started programming? Well, that’s hard to say exactly. I did little bits and pieces of it as a kid (more on that later), but I wasn’t really very interested in it.

Want to know when I got interested in programming? That’s much easier. 2007. I was 23, about 10 years older than I’m “supposed” to have been interested in programming in order to be good at it.

How did this happen?

Well I’ve always liked computers. My family has had computers since I was quite young (see “privilege” for details) – I remember having a Windows 3.1 computer before I moved to the UK, so looking at the release date I must have had one almost as soon as it came out. Before that we had a DOS computer.

I really enjoyed these, but not really for programming. I played games and used the word processor. We had some games written in QBasic and I vaguely poked at the code to see if I could make changes, but I didn’t really know what I was doing and it was pretty much just editing at random and seeing what happened. I got bored quickly and didn’t pursue it further.

Years later at school we did some Logo. It sure was fun drawing pretty pictures on the screen, and the little turtle was adorable.

Later yet we got the internet. This was pretty amazing! You could talk to people! Who were far away! And write stuff and people would read it! Also there were multi-player games, those were pretty cool. I spent a lot of time on various MUDs. I thought about learning to program so I could make a MUD or to build add-ons to one. I think I briefly even started contributing to a friend’s nascent MUD, but it all seemed like hard work and again I got bored pretty quickly.

Eventually it came time to choose my A-levels. I thought about doing a computing A-level, but my chemistry teacher persuaded me that I could learn to computer at any time (he was right) and that I should do chemistry instead (oh ye gods was he wrong). So I managed to go another couple years without ever learning to program.

At some point during this time I created a website. That isn’t to say I learned HTML and CSS mind you. I used some sort of WYSIWYG website creation tool from Mozilla and uploaded it to my btinternet account. I can’t really remember what I put there – I know I published some really terrible transhumanist fiction on it, but there were a bunch of other things too. I do remember the look though: White text on some sort of horrendous patterned background. Classy web design it was not.

At some point our btinternet account got cancelled and I lost the website (no backups of course). When I looked a few years ago bits of it were still available through the internet archive, but I honestly don’t remember the URL at this point.

I got to university and somehow found myself in a maths degree (which is another story). I now had my own computer which my parents had bought me (*cough* privilege *cough*)!

It had Windows ME on it. :-(

After much frustration with ME, my friend Tariq Khokhar persuaded me to give this Linux thing a try in preference to, erm, “acquiring” a copy of Windows 2000 and using that instead.

This worked out pretty well. I’m not going to say it was a flawless experience, but as a cocky teenager I was totally up for doing low-effort things that made me feel much cleverer than I actually was, and running this frustrating OS on the desktop certainly achieved that. This was back in 2001. 12 years later, I’m still being frustrated by this decision but find all the other options equally frustrating.

Then, finally, I was forced to actually learn to program.

Cambridge has (had?) as part of its maths degree a thing called “CATAM”. Basically, they gave you your choice of three problems to do and you had to go learn to computer and then do them. I don’t actually remember what my CATAM projects were – I know the first one was something to do with matrix maths, but I’m not really sure what.

You could do them in whatever language you liked – they vaguely encouraged C (sadists), but they didn’t really mind if you did it in something else – I know at least one person did it in Excel.

I briefly looked into programming in C, but Tariq persuaded me that I would much rather do it in ML, which is what the computer scientists were learning for their intro to CS course. I went along to a few of their lectures, skimmed some of the notes, bought a copy of ML for the Working Programmer, and learned enough from it to solve the CATAM problems.

To be clear: At this point programming to me meant that I wrote up some code in a text editor (gedit I think. I don’t remember when I started using vim) and then cut and pasted it into the REPL (I think I was using Moscow ML because it was the one the course recommended), at which point I would copy the answer out of the REPL and put it in my coursework.

That CATAM project finished, I did fine, and I immediately stopped programming again. I mean why would I keep it up? As far as I was concerned this was just a thing I learned to do to solve some maths problems. None of the maths problems I had to solve needed programming, so I wasn’t going to be programming.

I picked it up again for my second CATAM project the next year, but I didn’t really learn anything new about programming, I just did the same thing I did last time: Write in a text editor, copy and paste into the REPL. I might have switched to vim by then, but probably not.

At some point during this period I made a second website. My friend Michael, who I knew from IRC, persuaded me to do it “properly” this time. I learned what HTML was, copied some CSS from someone’s free designs elsewhere on the internet, and wrote about 5 lines of PHP (which I probably copied from somewhere) to build a hilarious navigation by GET variables so I didn’t have to write the same header and footer everywhere and could use includes. I was supposed to use mod_rewrite or something to get proper URLs but I never really bothered. Given my experiences with mod_rewrite since, it’s possible I tried to make it work and couldn’t. I uploaded this to my account on the student computing society computers. At this point I’d learned enough about the command line from my forced immersion in it that I was able to figure out SSH and stuff, but I don’t think I did much more with it than moving some files around on the server.

Then I got to the end of my degree and suddenly didn’t know what to do with my life.

Everyone tells you that a maths degree is super desirable and is great for getting jobs. This is a lie. If you have other useful skills then people get very excited about the fact that you’re also a mathematician. If you don’t, no-one really cares.

Except banks. Finance was very keen to hire mathematicians. I, on the other hand, was very keen not to work in finance (because dad was a banker and it didn’t look like much fun. My political objections came later).

I shopped around Cambridge for a job for about 6 months (moving back in with my parents about halfway through that) but found that for some reason Cambridge had quite a surplus of overqualified people with no useful skills and I was one of them. Finding a job didn’t go so well.

At some point a friend mentioned that his company were hiring for developers and didn’t care if you actually knew what you were doing because they were pretty sure they could teach you. They were London based, and I wasn’t very keen on London, and I wasn’t really sure I wanted to be a developer, but I’d been job hunting for 6 months and it’d got pretty depressing so I figured it couldn’t hurt to talk to them. They were going to be at a careers fair in the computer science department, so I went along to that with a copy of my hilariously empty CV to talk to them.

Nothing came out of talking to that company, but while I was there I figured I might as well talk to a bunch of the other people there. I ended up talking to two other companies, both London based and hiring for developers but happy to teach people, so I interviewed with them.

Apparently I interviewed pretty well – the ML I’d learned and my maths degree were enough that although I didn’t really know what a binary search was I was able to figure it out with a little prompting, and I somehow managed to muddle my way through the other problems. One of them mostly tested problem solving, which I was good at, and the other had some toy programming problems, which I managed to do OK on.

One of them offered me a job off the back of this. The other one was a little more hesitant about hiring someone who basically couldn’t program, so asked me to write them a program to prove that I could. I went with the company that didn’t require me to demonstrate that I knew what I was doing, mostly because I didn’t.

So, well, at that point I was moving to London to become a software developer. Woo? I wasn’t really that interested in programming, and I rather hated London, but hey it was a job.

I wasn’t very good at it at first, unsurprisingly.

I started doing some basic front-end stuff. I was thoroughly overconfident in my abilities with HTML and CSS off the back of the small amount I’d learned already. I also got tasked with setting up a server, because I knew Linux so I must know what I was doing, right? I had no idea what I was doing and wasn’t very good at asking for help and after about 4 days of butchering it we got a server that… probably did the right thing? While this was going on I was learning to write Java (which the project I was working on was written in). It seemed pretty easy – the syntax was weirdly cumbersome, and it seemed a bit limited in comparison to ML, but it wasn’t hard to get things done in it. I doubt I really knew what I was doing, but this didn’t seem to matter terribly. (Fragment of conversation I remember from this time period: “Hey, can I use recursion in Java?” “Yes, but you probably shouldn’t”).

But I stumbled my way through into competence, and then necessity forced me into actually figuring out what was going on and starting to think about how to do things better. At some point my friend Dave joined the company I was working at and we bonded over talking about work and computer science while waiting for our 10 minute builds to run so we could observe the tiny change to the HTML we’d just made (sadly I’m not exaggerating that).

At some point the persistent nagging feeling that things could be done better than this mixed with talking about CS with Dave and I grudgingly started to realise that this programming thing was actually quite interesting.

Concurrently with this, some friends on IRC had persuaded me to learn Haskell. I forget what their reasoning was, but I mostly went along with it. I already knew ML, which is basically a simpler and less puritanical version of Haskell (the module system is better, but remember that my experience of ML was from copying and pasting stuff into the REPL. I didn’t even know what a module was), so learning Haskell was pretty straightforward.

At this point I had basically two bodies of knowledge that were failing to connect – I had these cool languages I did maths programming in and this Java thing which I used to get stuff done. I started hunting around in my spare time for a middle-ground, and after a false start with Nice ended up in Scala.

But work was at this point pretty frustrating. For a variety of reasons I wasn’t really enjoying it much, so I decided to look for a new job. I ended up at a company called Trampoline Systems, where I got to work with good people on interesting problems. The company eventually went a bit wrong and most of us left (it’s still around, albeit in a very different form, but only one person from when I was there is still around), but by then it was enough: I’d become quite interested in the practice of programming, and I’d discovered I could find interesting jobs doing it.

The rest is, as they say, history. Despite my long history of not getting around to learning to program, I’d become a software developer anyway. I discovered I was quite good at it and it was quite interesting, so I’ve stuck around since.

Would things have worked out better if I’d learned earlier? I don’t know. I feel like the practice of solo programming is very different from the practice of programming in a team. What helped me was that I’d already developed a lot of habits of thought and problem solving before I ever really brought them to bear on the practice of software development. I’m sure if I’d learned to program earlier it wouldn’t have hurt, but I think I’d probably have a very different perspective on things today, and it’s unclear if it would be a better one.

I think the thing to bear in mind is that software development isn’t really about programming. It’s about solving problems with computers. Programming is generally the tool you use to achieve that goal, and for some tasks you need to be better at it than others, but ultimately what you need is to be good at problem solving (not silly abstract logic problems, but actual problems that people have that computers can help solve). This is a skill you can develop doing almost anything, and if you’re already good at it then you can probably become a good software developer remarkably quickly even if you’re as uninterested in programming as I used to be.


# An amusing problem to solve

I was thinking about coding questions, and how you would write a coding question that tested people’s ability to explore algorithms in areas they weren’t likely to be that familiar with without necessarily being able to come up with optimal solutions.

I mulled over this a bit and came up with one which is entirely too evil to use. However I rather feel like I’m going to have to try to implement it anyway because it’s going to bug me. I figured I would inflict this on the rest of you, because misery loves company.

The problem is this:

You are given a non-negative symmetric $$n \times n$$ matrix $$A$$ with $$A_{ii} = 0$$, with real numbers represented as double precision. This is your potential matrix.

Find an assignment of coordinates $$x_1, \ldots, x_n \in [0, 1]$$ that has a low value for the energy function $$E = \sum\limits_{1 \leq i < j \leq n} \frac{A_{ij}}{|x_i - x_j|}$$. You are not required to minimize it, but if you can that would be lovely.
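In case the notation is getting in the way, here’s the energy function as a small Python sketch. One assumption that isn’t in the problem statement: I’ve treated a pair with $$A_{ij} = 0$$ as contributing nothing even when the two points coincide, and a pair with $$A_{ij} > 0$$ at the same coordinate as having infinite energy.

```python
import math


def energy(A, x):
    """E = sum over i < j of A[i][j] / |x[i] - x[j]|."""
    n = len(x)
    total = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            if A[i][j] == 0:
                # Assumed convention: zero potential contributes nothing,
                # even if the two points sit on top of each other.
                continue
            distance = abs(x[i] - x[j])
            if distance == 0:
                return math.inf  # interacting points at the same place
            total += A[i][j] / distance
    return total


if __name__ == "__main__":
    A = [[0, 1, 2],
         [1, 0, 3],
         [2, 3, 0]]
    print(energy(A, [0.0, 0.5, 1.0]))  # 1/0.5 + 2/1 + 3/0.5
```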

Some initial thoughts on how one might solve it:

1. Any optimal solution for a non-zero $$A$$ assigns at least one point to 0 and at least one point to 1.
2. For any given starting position there is a relatively straightforward gradient descent algorithm to find a local minimum
3. If $$A_{ij} > 0$$ for all $$i \neq j$$ then there are at least $$n!$$ local minima.
4. If you’ve placed $$k$$ items and you want to place a $$(k + 1)$$th, as long as it has a non-zero entry in $$A$$ with at least one of the items you’ve placed so far there is a unique place to put the $$(k + 1)$$th item (though finding this is $$O(k)$$). This probably makes a greedy algorithm quite productive for smallish $$n$$.
5. If the graph with edges $$\{ \{i, j\} : A_{ij} \neq 0\}$$ is not connected you can run the problem separately on each component and merge the results, because they don’t affect each other.
6. Some form of e.g. simulated annealing with the mutation operator being to swap pairs of elements then run a local optimisation would probably be quite productive.
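
Here’s a rough sketch of the greedy idea from point 4, in Python. It’s one possible reading rather than a worked-out solution: it places items in index order, and it relies on the fact that the cost of adding a new point is convex inside each gap between already-placed points, so a ternary search per gap finds the best spot in that gap. The function names are made up for this example, and the result is a starting point for local optimisation, not a claimed minimum.

```python
import math


def placement_cost(weights, placed, x):
    """Energy added by putting a new point at x, given its weights to placed points."""
    total = 0.0
    for w, p in zip(weights, placed):
        if w == 0:
            continue
        d = abs(x - p)
        if d == 0:
            return math.inf
        total += w / d
    return total


def best_in_gap(f, lo, hi, iters=100):
    """Ternary search for the minimum of f on [lo, hi], assuming f is convex there."""
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if f(m1) <= f(m2):
            hi = m2
        else:
            lo = m1
    return (lo + hi) / 2


def greedy_layout(A):
    """Place items one at a time, each at the cheapest position found."""
    xs = []
    for k in range(len(A)):
        weights = [A[k][j] for j in range(k)]

        def cost(x, weights=weights):
            return placement_cost(weights, xs, x)

        # Gaps run between consecutive placed points, plus the ends of [0, 1].
        bounds = sorted(set([0.0, 1.0] + xs))
        candidates = [0.0, 1.0]
        for lo, hi in zip(bounds, bounds[1:]):
            candidates.append(best_in_gap(cost, lo, hi))
        xs.append(min(candidates, key=cost))
    return xs


if __name__ == "__main__":
    print(greedy_layout([[0, 1, 1], [1, 0, 1], [1, 1, 0]]))
```

Scanning every gap costs more than the O(k) mentioned above, but it keeps the sketch short and doesn’t depend on guessing which gap is the right one.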

As an additional comment: Trying to do combinatorial optimisation on permutations of indices is an exercise in frustration. Believe me, I know. This problem may prove more annoying than it’s worth.


# The false proxies of mirror images

A while back Tim Chevalier wrote a post called “Hiring based on hobbies: effective or exclusive?” that I seem to feel the need to keep signal boosting. It’s about the problems with using hobbies as a factor when hiring people. I recommend reading the whole thing, but allow me to quote a relevant section:

[…] if you decide someone isn’t worth hiring because they don’t have “interesting” hobbies, what you’re really saying is that people who didn’t come from an affluent background aren’t interesting. That people with child care or home responsibilities aren’t interesting. That disabled people aren’t interesting. That people who have depression or anxiety aren’t interesting. That people who might spend their free time on political activism that they don’t feel comfortable mentioning to a stranger with power over them aren’t interesting. That people who, for another reason, just don’t have access to hacker spaces and don’t fit into hobbyist subcultures aren’t interesting.

Essentially the point is that hiring based on hobbies selects for people who are from a similar background to you.

You see, the problem is that the answer to the question of whether this is effective or exclusive is that it’s both. Hobbies are in many ways a really good signal of an active mind which will engage well with the job.

The problem is that they are also a signal that the person has the time and energy to pursue those hobbies.

Amongst people who lack other commitments and routinely have the mental energy for it, it may be that there is a very strong correlation between hobbies and competence (I’d actually place bets that it’s significantly less strong than we like to believe it is, but I have no data so I’m not going to speculate too wildly on that front. Let’s assume for the sake of the argument that popular wisdom is correct here).

The problem is that hobbies are a form of false proxy. We’re unable to perform the test that would really allow us to determine someone’s competence (that is to say: Hire them and work closely with them for a couple years in a variety of different scenarios), so instead we pick something which we can easily measure and appears to be a good signal for it.

And how did we learn that it was a good signal?

Well, by looking around us. Look at the people we know that are good. If we’re feeling very virtuous we can look at the people we know are bad and actually compare differences rather than just falling into the “good people do X. Therefore X is good” fallacy.

The problem is that you’re looking at a very small population, and when you do this you’re necessarily massively over-fitting to the data. When you create tests based on your observation of your current in-group you end up creating tests that work very well for people who look like the data you’re modelling and at best perform more noisily for people outside it, but more likely actively penalise them.

Why? Because this is where privilege comes in. (Note: Privilege in this context is a technical term. If you feel the need to comment something along the lines of “How dare you call me privileged? I worked hard to get where I am!” etc just… don’t. Please).

The problem is that the advantages you have are mostly ones you don’t see. Because most of society’s advantages don’t come in terms of people giving you a leg up (certainly some do, but we tend to be more aware of those), they come in the form of things you don’t have to worry about. You may not have to worry about the constraints of chronic illness, of dependants, of currently being in a position where money is tight. It’s hard to notice the absence of something you’ve never experienced, and consequently you often don’t notice these group advantages. This is especially true because even if some people experience it at the individual level, as a group we’re a fairly privileged lot and so most of our group behaviours will lack these constraints.

There’s another way these false proxies can creep in. There have been a bunch of discussions about the myth of the 10x engineer recently. I also had some interesting conversations about various biotruthy beliefs about programming on Twitter (mostly with tef and janepipistrelle I think). Underlying both of these is a common question: What do we mean by programming ability?

Well, it’s obvious of course. Programming ability is being good at the things that I’m good at. Those things I’m not good at? Yeah I suppose some of them are kinda important, but not nearly as much.

This is a less egotistical viewpoint than you might think it is. Certainly some people believe this because they’re full of themselves, but it’s entirely possible to accidentally find yourself believing this with the best intentions in the world.

How?

Well, what are the things you work at getting better at?

Right. The things you think are important. So it’s not just that people think things they are good at are important. It’s also that people work to get good at the things they think are important.

So what do you do when you want to decide how well someone will contribute to a team? You look at how good they are of course.

That is… how much they’re good at the things you’re also good at.

Or how much they look like you.

Oh, you’ll still choose people with complementary skills. I don’t think many of us fall into this trap so overtly as to only hire people with the exact same skill set as us. But step up a level, there are the meta qualities. Things like mathematical ability, passion about technology, reliability, being able to think through problems quickly, ability to communicate well and confidently, etc. Certainly these are some of the things I value. Coincidentally they’re things I also think I do quite well in. Funny that, huh?

These don’t necessarily contribute to a group diversity problem. Some of them do – it’s easier to talk confidently when you’re not socially punished for doing so, it’s easier to be passionate when you’ve got the energy to spare – but even when there’s little between-group variation in them (e.g. mathematical ability) they contribute to a personality monoculture. We don’t just all look the same, we also all think rather more similarly than sheer chance would suggest.

Ultimately what we’re selecting for when hiring isn’t actually about the individual we’re hiring. What we really want to know is if the addition of this person to the team will make us better in ways we need to be better. This doesn’t necessarily mean they have “cultural fit” or that their abilities align well with the team – it might mean that they have a slow, methodical approach that will provide the necessary damper on the team’s mad rush to ship regardless of how broken. It might mean that they provide a calm, mediating voice. It might simply mean that they’re good at programming in ways that we wouldn’t expect because they’re different from our own. The point is that we don’t actually just need to hire people who are like us, we should hire people who augment us. People from across the spectrum who will make us better in a myriad different ways.

But we can’t test for that, because we don’t know how. So instead we invent these tests which we think provide good proxies for it.

But many of these tests are false proxies which are really testing how similar they are to us.

And then we act surprised when our whole team looks like us, and we claim that well it’s just what the tests show all the best candidates looked like and we have to accept the reality.

What a lovely meritocracy we work in.
