Category Archives: programming

On Haskell, Ruby, and Cards Against Humanity

I really hate Cards Against Humanity.

I don’t particularly hate it for the obvious reasons of middle class white kids sitting around a table giggling at how transgressive they are being with their jokes about race. I don’t approve, and I would probably avoid it for that reason alone, but it’s frankly so banal in this regard that it probably wouldn’t be enough to raise my ire above lukewarm. What I really hate about it is that it’s a game about convincing yourself that you’re much funnier than you actually are.

Cards Against Humanity is funny, but it’s not that funny. It relies on shock value to inflate its humour, and by making a game of that draws you in and makes you complicit. Your joke about Vladimir Putin and Altar Boys? Not actually very funny, but by giving you permission to be shocking, Cards Against Humanity lets you feel like the world’s best comedian for an hour.

There’s nothing intrinsically wrong with enjoying pretending you’re funnier than you actually are. I’ve intensely enjoyed games where I got to spend an hour pretending I was an elf, and no harm was done to any parties involved (except for some imaginary orcs who found themselves full of imaginary lightning bolts). The difference is, I don’t walk away from the latter with any doubt in my head about whether I’m an elf (I’m not, in case you were wondering). People walk away from a game of Cards Against Humanity thinking that they’re hilarious.

Is this objectively a problem? I’m not sure I’d go that far. It would require too much of a digression into systems of ethics and theories of value.

Personally though, I hate it. I hate it a lot. Humour and accurate self-knowledge are both things I value, and the intersection is especially important because inaccurate self-knowledge about something you want to become better at is a great way of becoming worse at it instead. If you learn the lesson that being crass is a great way to be hilarious, you won’t become better at humour, you’ll just become better at being crass.

Which brings me to Ruby and Haskell.

These two programming languages are about as far away from each other on the field of programming languages as you can get. But one thing they have in common, other than garbage collection and list syntax, is their similarity to Cards Against Humanity.

Oh, they generally don’t give you the same sense of being way funnier than you actually are (_why’s poignant guide and learn you a haskell aside), but they do have community norms that have similar effects of inflating your sense of how good you are at something important.

A thing I value probably at least as much as humour is good API design. It’s an essential weapon in the war on shitty software. It’s almost completely absent in Ruby. There are some exceptions (Sequel and Sinatra for example), but by and large Ruby APIs seem almost actively designed to promote bad software.

Ruby is full of people who think they’re doing good API design, but they’re not. The problem is that in Ruby everyone is designing “DSLs”. The focus is on making the examples look pretty. I’m all for making examples look pretty, but when you focus on that to the exclusion of the underlying semantics, what you get is the Cards Against Humanity effect all over again. You’re not doing good API design – you’re doing something that has the appearance of good API design, but is in reality horrifically non-modular and will break in catastrophic ways the first time something unexpected happens, but the lovely syntax lets you pat yourself on the back and tell yourself what a good API designer you are.

And then there’s Haskell.

The way this problem manifests in Haskell is in how incredibly clever it makes you feel to get something done in it. Haskell is different enough from most languages that everything feels like an achievement when writing it. “Look, I used a monad! And defined my own type class for custom folding of data! Isn’t that amazing?“. “What does it do?” “It’s a CRUD app”.

To a degree I’m sure this passes once you’re more familiar with the language and are using it for real things (although for some people familiarity just spurs them on to even greater heights), but a large enough subset of the Haskell community never reaches that stage and just continues to delight in how clever they are for writing Haskell. In the same way that Cards Against Humanity gives you a false sense of being funny, and Ruby gives you a false sense of designing good APIs, Haskell gives you a false sense of solving important problems because it gives you all these exciting problems to solve that have nothing to do with what you’re actually trying to achieve.

I very much do not hate Haskell or Ruby in the same way that I hate Cards Against Humanity, because unlike Cards Against Humanity these features are not the point of the languages, and the languages have myriad virtues to counteract them.

But I do find a fairly large subset of their communities really annoying in how much they exemplify this behaviour. They’re too caught up in congratulating themselves on how good they are at something to notice that they’re terrible at it, and it makes for some extremely frustrating interactions.

This entry was posted in programming on by .

Tests are a license to delete

I’ve spent the majority of my career working on systems that can loosely be described as “Take any instance of this poorly specified and extremely messy type of data found in the wild and transform it into something structured enough for us to use”.

If you’ve never worked on such a system, yes they’re about as painful as you might imagine. Probably a bit more. If you have worked on such a system you’re probably wincing in sympathy right about now.

One of the defining characteristics of such systems is that they’re full of kludges. You end up with lots of code with comments like “If the audio track of this video is broken in this particular way, strip it out, pass it to this external program to fix it, and then replace it in the video with the fixed version” or “our NLP code doesn’t correctly handle wikipedia titles of this particular form, so first apply this regex which will normalize it down to something we can cope with” (Both of these are “inspired by” real examples rather than being direct instances of this sort of thing).

This isn’t surprising. Data found in the wild is messy, and your code tends to become correspondingly messy to deal with all its edge cases. However kludges tend to accumulate over time, making the code base harder and harder to work with, even if you’re familiar with it.

It has however historically made me very unhappy. I used to think this was because I hate messy code.

Fast-forward to Hypothesis however. The internals are full of kludges. They’re generally hidden behind relatively clean APIs and abstraction layers, but there’s a whole bunch of weird heuristics with arbitrary magic numbers in them and horrendous workarounds for obscure bugs in other peoples’ software (Edit: This one is now deleted! Thanks to Marius Gedminas for telling me about a better way of doing it).

I’m totally fine with this.

Some of this is doubtless because I wrote all these kludges, but it’s not like I didn’t write a lot of the kludges in the previous system! I have many failings and many virtues as a developer, but an inability to write terrible code is emphatically not either of them.

The real reason why I’m totally fine with these kludges is that I know how to delete them: Every single one of these kludges was introduced to make a test pass. Obviously the weird workarounds for bugs all have tests (what do you take me for?), but all the kludges for simplification or generation have tests too. There are tests for quality of minimized examples and tests for the probability of various events occurring. Tuning these are the two major sources of kludges.

And I’m pretty sure that this is what makes the difference: The problem with the previous kludges is that they could never go away. A lot of these systems were fairly severely under-tested – sometimes for good reasons (we didn’t have any files which were less that 5TB that could reproduce a problem), some for code quality reasons (our pipeline was impossible to detangle), sometimes just as a general reflection of the culture of the company towards testing (are you saying we write bugs??).

This meant that the only arbiter for whether you could remove a lot of those kludges was “does it make things break/worse on the production system?”, and this meant that it was always vastly easier to leave the kludges in than it was to remove them.

With Hypothesis, and with other well tested systems, the answer to “Can I replace this terrible code with this better code?” is always “Sure, if the tests pass”, and that’s immensely liberating. A kludge is no longer a thing you’re stuck with, it’s code that you can make go away if it causes you problems and you come up with a better solution.

I’m sure there will always be kludges in Hypothesis, and I’m sure that many of the existing kludges will stay in it for years (I basically don’t see myself stopping supporting versions of Python with that importlib bug any time in the near future), but the knowledge that every individual kludge can be removed if I need to is very liberating, and it takes away a lot of the things about them that previously made me unhappy.

This entry was posted in Hypothesis, programming on by .

Surprise! Feminism.

So yesterday’s article on it being OK to write shitty open source has had thirty thousand views. Most of these came from it hanging out at the top of r/programming for nearly 24 hours.

Which… wow. Another four thousand (which it will probably manage today) and it will be twice as popular as my next most popular articles, which have respectively had a nation wide referendum and years of being a wikipedia link from a popular article to drive up their traffic.

Of those thirty thousand, I’d bet you decent money that twenty thousand minimum thought it was purely about software aphorisms like “release early, release often” and “worse is better” (it’s not about worse is better. Stop trying to make it about worse is better) and didn’t notice that it was not even subtly coded feminism.

“Wait, what? How was it feminism? You didn’t even mention gender!” says my old friend, the suspiciously convenient anonymous voice, currently acting as an expy for a whole bunch of dudes who would have been happy to do the job.

You are correct, suspiciously convenient anonymous voice. I did not mention gender, and in a perfectly egalitarian feminist utopia the piece could have stood on its own without any feminist undertones to speak of.

But we don’t live in that utopia.

As I mentioned, there are two things that you need in order to produce quality software for free:

  1. Time
  2. Money

Anyone want to take a guess what women on average have much less of? Anyone? Anyone?

Did you guess “Both of these things”? Well done!

In our society, a far greater burden of free labour is placed on women (Terminology note: I think everything I say about women in this piece also applies to people who are not women but are perceived as such). They are more likely to be expected to do child care, more likely to be expected to do house work, more likely to be expected to provide free emotional labour in the form of support and favours to others.

And, as we established, labour takes time. So all of this extra labour women are being expected to do cuts into the time they have to do other things, like open source.

But wait, there’s more! Women are also paid less (exact reports for how much differ, so I’m not going to mention a figure for people to derail the point by quibbling about). The freedom to take time off to work on a thing is a lot easier to have if you’re actually paid enough to be able to afford it.

It’s OK though. In order to make up for the lack of these two things, women do have a few things they have more of to offset it.

  1. Dudes who think they are entitled to their time for free
  2. Standards of quality they are expected to meet

…wait, that doesn’t make it better at all, does it?

The sad fact is that women with a public presence get given a much harder time than men do, and this transfers entirely to the open source world: People are extremely ready to police the quality of your work already. Being a woman turns this up to 11.

So with less time and less money to achieve quality in and higher standards they are likely to be expected to adhere to, is it any surprise that the percentage of women in open source is a lot lower than it is in tech overall (where it’s already bad)?

Open source culture in general, and this problem in particular, are not exclusively feminist issues. You absolutely can and do experience these problems as a man, and these problems could easily still exist in a fully gender equal or genderless society, but any understanding of how these problems manifest in and interact with the society we actually have will be incomplete without the structural analysis of privilege and its interaction with gender that feminism brings to the table.

If you want to read more about this subject, I recommend this great piece by Ashe Dryden: The ethics of unpaid labour and the OSS community. A lot of my thoughts and opinions around this were informed by it.

 

This entry was posted in Feminism, programming on by .

Interviewing: Test, don’t sample

Do you ask for a code sample when interviewing someone?

Don’t. It’s a terrible idea. It creates stress and doesn’t give you any useful answers.

Seeing code they’ve written is obviously good and useful, but the way to do this is not to ask for a sample, it’s to set them a small task (something that shouldn’t take more than an hour or two) and ask them to code a solution to it.

Sure, for some people it will take more time, but for most this will be less stressful and for you it will be infinitely more useful.

Why it is less stressful:

  1. It puts everyone on an equal footing. Some people can’t give you a recent coding sample because everything they’ve written recently is under NDA.
  2. They are not trying to guess what you’re looking for because you’ve said what you are looking for. They don’t need to guess whether you’d prefer something cute and clever or boring but well tested. They don’t need to spend ages sorting through a bunch of code they’ve written trying to figure out what will best fit your subjective requirements.

Why it is better for you:

  1. Less stressed candidates produce more representative answers.
  2. You have more control over what you are testing for, and can refine this over time.
  3. Any question where you can’t compare the answer between candidates is a waste of your time and theirs because it’s so subjective and poorly calibrated that you might as well just toss a coin. You can compare coding tests, you can’t compare coding samples.

Code samples: Bad for the candidate, and worse for you. Just say no.

This entry was posted in Hiring, programming on by .

It’s OK for your open source library to be a bit shitty

(Content note: I normally try to keep my natural level of profanity slightly under control on this blog. I won’t be doing that in this post)

The major reason I wrote Hypothesis is to destroy shitty software. Everything is terrible, and I want it to be less terrible.

But that random code you threw together as a hack, stopped when it did what you needed to do, threw it up on pypi and then neglected it?

That’s totally OK. Thanks for writing it. The world is slightly better for your having done so, and there is no burden of expectation on you to “do a better job”. It’s not a job after all, is it? We’re not paying you to do it.

And this is what it ultimately comes down to.

I flatter myself that one of the things that I can legitimately claim about Hypothesis is that it is high quality. So far the worst bug that anyone has reported in the 1.0 release is that when given a wrong argument, Hypothesis throws an exception with the right type from the right place with the wrong error message. Hypothesis works on OSX, Linux, Windows, probably *BSD (I heard about some packaging issues with sqlite, but nothing since, so I think it worked once those were resolved), Python 2.7, 3.2-3.4, CPython, Pypy… It’s got 100% branch coverage, is documented, etc. Basically as far as quality goes Hypothesis does almost everything right.

So I’ve proved it can be done and that you should do it too, right?

Nah, that’s bullshit. I’m here to tell you as someone who has done the work of producing quality software for free, you don’t have to and you shouldn’t feel bad about not doing so.

Want to know how I did everything right in Hypothesis? I mean, obsessive attention to detail and high standards helped, and may even have been essential, but they weren’t even close to sufficient. Really there are two things that were the key ingredients to my making Hypothesis the quality piece of software it is today:

  1. Time
  2. Money

I’ve put somewhere in the region of 800 hours of work into Hypothesis this year, entirely for free. That’s what it took to get to this level of quality.

And I could only do this because I had the time and money to do so. I had the time to do so because I was being obsessive, had no dependents, and didn’t have a job. I could only not have a job because of the money. I only had the money because I spent the latter half of last year with double the salary I was used to, half the living expenses I was used to, and too borderline depressed to spend it on anything interesting.

These are not reasonable requirements.

Could I have done Hypothesis in less than 800 hours? Probably. I doubt I could have done it in less than 400 though, and I would be foolish to expect I could do any project in the smallest amount of time I could feasibly do it in.

Hypothesis is a large and complicated project though (if it doesn’t look like one, that’s because a lot of those 800 hours were spent on making it easy to use). Most projects are probably an order of magnitude simpler.

i.e. only 80 hours.

i.e. only you having to take two weeks off work, working literally for free, in order to produce quality software.

i.e. nearly half your holiday allowance if you live in a civilized country, or possibly more than your holiday allowance if you live in the US.

This is still not a reasonable requirement.

Can you produce quality software in less time than that, working only in your free time? I doubt it. Free time is inherently less productive. You’re tired and it’s fragmented. You spend an hour one evening trying to figure out why your windows builds are now failing because a new version of pip is released and after that I guess you could put another half hour in but your heart isn’t really in it and you’d basically not have the time to get properly stuck into it. The bar for quality is high, and the obstacles to it are higher, and there’s really nothing you can do to fix that other than to put into the time.

Don’t get me wrong. If you can put in the time I will be incredibly grateful to you. I just don’t think you should feel bad if you don’t.

There is no obligation to free labour. Every hour you put in working on your project for free is a gift to the world. If the world comes back to you and says “You are a bad person for not supporting this thing I need you to support” then fuck them. If they want that they should pay you for it, or do it themselves.

(Edit to note: This isn’t of course to say that you shouldn’t ask for features on open source projects. Only that you are not entitled to them. If you politely ask, that’s fine. If the author then says “Sorry, no, I don’t have the time / am not interested / literally any other reason at all”, that’s fine too)

Note: If you liked this piece, there is a follow up you may wish to read.

This entry was posted in programming on by .