Category Archives: programming

Hypothesis 1.8.0 is out

This release is mostly focused on internal refactoring, but has some nice polishing and a few bug fixes.

New features:

  • Much more sensible reprs for strategies, especially ones that come from hypothesis.strategies. These should now have as reprs python code that would produce the same strategy.
  • lists() accepts a unique_by argument which forces the generated lists to contain only elements that are unique according to some key function (which must return a hashable value).
  • Better error messages from flaky tests to help you debug things.
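
To make the unique_by guarantee concrete, here is a minimal plain-Python sketch of the property such lists satisfy. It is not Hypothesis's implementation, and the helper name `is_unique_by` is made up for illustration.

```python
def is_unique_by(values, key):
    """Return True if no two elements of `values` share the same key."""
    seen = set()
    for value in values:
        k = key(value)  # the key must be hashable so it can live in a set
        if k in seen:
            return False
        seen.add(k)
    return True


# Keys are the last digit of each number.
print(is_unique_by([1, 12, 23], key=lambda x: x % 10))  # True: keys 1, 2, 3
print(is_unique_by([1, 11, 23], key=lambda x: x % 10))  # False: 1 and 11 collide
```

A list generated with a unique_by function should always pass a check like this for that same function.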

Mostly invisible implementation details that may result in finding new bugs in your code:

  • Sets and dictionary generation should now produce a better range of results.
  • floats with bounds now focus more on ‘critical values’, trying to produce values at edge cases.
  • flatmap should now have better simplification for complicated cases, as well as generally being (I hope) more reliable.

Bug fixes:

  • You could not previously use assume() if you were using the forking executor (because the relevant exception wasn’t pickleable).

This entry was posted in Hypothesis, programming, Python.

Some stuff on example simplification in Hypothesis

One of the slightly thankless tasks I engage in in Hypothesis (subtype: People only notice if I don’t do my job well) is example simplification. I spend a lot of time on example simplification – far more than I think anyone else who writes a Quickcheck bothers with.

What is example simplification? Well, when Hypothesis tells you that an example that breaks your test is [1, 2, 3] then it almost certainly did not start out with that. It probably started out with a list about 20 elements long with numbers like 349781419708474 in it, and then pruned it down to [1, 2, 3].

Example simplification is important because random data is very messy but there is usually a simple example that demonstrates the bug, and the latter is helpful but the former rarely is.
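
As a rough illustration of the idea, here is a minimal greedy simplifier for lists of integers. It is a sketch only, and far cruder than what Hypothesis actually does. The names `simpler_ints` and `shrink_list` are invented for this example, and `fails` stands in for "the test still breaks on this input".

```python
def simpler_ints(value):
    """Yield integers simpler than `value`, simplest first."""
    if value == 0:
        return
    yield 0
    if value < 0:
        yield -value  # treat positive numbers as simpler than negative ones
    sign = 1 if value > 0 else -1
    magnitude = abs(value)
    if magnitude > 1:
        yield sign * (magnitude // 2)
        yield sign * (magnitude - 1)


def shrink_list(example, fails):
    """Greedily simplify a list of ints while `fails` stays true."""
    current = list(example)
    if not fails(current):
        raise ValueError("the starting example must fail the test")
    improved = True
    while improved:
        improved = False
        # Pass 1: try deleting each element in turn.
        i = 0
        while i < len(current):
            candidate = current[:i] + current[i + 1:]
            if fails(candidate):
                current = candidate
                improved = True
            else:
                i += 1
        # Pass 2: try replacing each element with a simpler integer.
        for i in range(len(current)):
            for smaller in simpler_ints(current[i]):
                candidate = current[:i] + [smaller] + current[i + 1:]
                if fails(candidate):
                    current = candidate
                    improved = True
                    break
    return current


# A pretend bug: the test breaks on any list with three distinct values.
def fails(xs):
    return len(set(xs)) >= 3


messy = [349781419708474, 17, 17, 9001, 4, 4, 52]
print(shrink_list(messy, fails))  # [0, 1, 2]
```

Every accepted step makes the list shorter or an element smaller, so the loop always terminates, and it stops at an example where no single deletion or replacement still triggers the failure.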

Simplification is of course not new to Hypothesis – it’s been in Quickcheck for 15 years – but Hypothesis’s example simplification is atypically sophisticated.

So, why so sophisticated? Why is the normal Quickcheck approach not good enough?

Well the official reasons are as follows:

  1. Hypothesis generates much more complex examples than Quickcheck. e.g. Quickcheck will generate a long list of small integers or a short list of large integers, but will never generate a long list of large integers.
  2. Tests in Hypothesis tend to be slower, due to a mix of Python performance and Python idioms.

The unofficial reasons are probably actually closer to being the true reasons:

  1. Including bad examples in documentation is embarrassing
  2. My test suites do a lot of example simplification so faster example simplification gives faster test suites gives a less stressed David
  3. I find combinatorial optimisation fun.

At any rate, the result is that I’ve spent a lot of time on this, so it seemed worth sharing some of the knowledge. I’ve done this in two places. The first is that I gave a talk at the London Big O user group. The slides are also available separately.

I was going to transcribe it, but actually I didn’t think it was a very good talk (I prepared it in a bit of a rush), so instead I’ve built an ipython notebook working through the concepts. This actually goes slightly further than the talk, as I figured out some things at the end that I need to merge back into Hypothesis proper.

This entry was posted in Hypothesis, programming, Python.

On Haskell, Ruby, and Cards Against Humanity

I really hate Cards Against Humanity.

I don’t particularly hate it for the obvious reasons of middle class white kids sitting around a table giggling at how transgressive they are being with their jokes about race. I don’t approve, and I would probably avoid it for that reason alone, but it’s frankly so banal in this regard that it probably wouldn’t be enough to raise my ire above lukewarm. What I really hate about it is that it’s a game about convincing yourself that you’re much funnier than you actually are.

Cards Against Humanity is funny, but it’s not that funny. It relies on shock value to inflate its humour, and by making a game of that draws you in and makes you complicit. Your joke about Vladimir Putin and Altar Boys? Not actually very funny, but by giving you permission to be shocking, Cards Against Humanity lets you feel like the world’s best comedian for an hour.

There’s nothing intrinsically wrong with enjoying pretending you’re funnier than you actually are. I’ve intensely enjoyed games where I got to spend an hour pretending I was an elf, and no harm was done to any parties involved (except for some imaginary orcs who found themselves full of imaginary lightning bolts). The difference is, I don’t walk away from the latter with any doubt in my head about whether I’m an elf (I’m not, in case you were wondering). People walk away from a game of Cards Against Humanity thinking that they’re hilarious.

Is this objectively a problem? I’m not sure I’d go that far. It would require too much of a digression into systems of ethics and theories of value.

Personally though, I hate it. I hate it a lot. Humour and accurate self-knowledge are both things I value, and the intersection is especially important because inaccurate self-knowledge about something you want to become better at is a great way of becoming worse at it instead. If you learn the lesson that being crass is a great way to be hilarious, you won’t become better at humour, you’ll just become better at being crass.

Which brings me to Ruby and Haskell.

These two programming languages are about as far away from each other on the field of programming languages as you can get. But one thing they have in common, other than garbage collection and list syntax, is their similarity to Cards Against Humanity.

Oh, they generally don’t give you the same sense of being way funnier than you actually are (_why’s poignant guide and learn you a haskell aside), but they do have community norms that have similar effects of inflating your sense of how good you are at something important.

A thing I value probably at least as much as humour is good API design. It’s an essential weapon in the war on shitty software. It’s almost completely absent in Ruby. There are some exceptions (Sequel and Sinatra for example), but by and large Ruby APIs seem almost actively designed to promote bad software.

Ruby is full of people who think they’re doing good API design, but they’re not. The problem is that in Ruby everyone is designing “DSLs”. The focus is on making the examples look pretty. I’m all for making examples look pretty, but when you focus on that to the exclusion of the underlying semantics, what you get is the Cards Against Humanity effect all over again. You’re not doing good API design – you’re doing something that has the appearance of good API design, but is in reality horrifically non-modular and will break in catastrophic ways the first time something unexpected happens, but the lovely syntax lets you pat yourself on the back and tell yourself what a good API designer you are.

And then there’s Haskell.

The way this problem manifests in Haskell is in how incredibly clever it makes you feel to get something done in it. Haskell is different enough from most languages that everything feels like an achievement when writing it. “Look, I used a monad! And defined my own type class for custom folding of data! Isn’t that amazing?” “What does it do?” “It’s a CRUD app”.

To a degree I’m sure this passes once you’re more familiar with the language and are using it for real things (although for some people familiarity just spurs them on to even greater heights), but a large enough subset of the Haskell community never reaches that stage and just continues to delight in how clever they are for writing Haskell. In the same way that Cards Against Humanity gives you a false sense of being funny, and Ruby gives you a false sense of designing good APIs, Haskell gives you a false sense of solving important problems because it gives you all these exciting problems to solve that have nothing to do with what you’re actually trying to achieve.

I very much do not hate Haskell or Ruby in the same way that I hate Cards Against Humanity, because unlike Cards Against Humanity these features are not the point of the languages, and the languages have myriad virtues to counteract them.

But I do find a fairly large subset of their communities really annoying in how much they exemplify this behaviour. They’re too caught up in congratulating themselves on how good they are at something to notice that they’re terrible at it, and it makes for some extremely frustrating interactions.

This entry was posted in programming.

Tests are a license to delete

I’ve spent the majority of my career working on systems that can loosely be described as “Take any instance of this poorly specified and extremely messy type of data found in the wild and transform it into something structured enough for us to use”.

If you’ve never worked on such a system, yes they’re about as painful as you might imagine. Probably a bit more. If you have worked on such a system you’re probably wincing in sympathy right about now.

One of the defining characteristics of such systems is that they’re full of kludges. You end up with lots of code with comments like “If the audio track of this video is broken in this particular way, strip it out, pass it to this external program to fix it, and then replace it in the video with the fixed version” or “our NLP code doesn’t correctly handle wikipedia titles of this particular form, so first apply this regex which will normalize it down to something we can cope with” (Both of these are “inspired by” real examples rather than being direct instances of this sort of thing).

This isn’t surprising. Data found in the wild is messy, and your code tends to become correspondingly messy to deal with all its edge cases. However kludges tend to accumulate over time, making the code base harder and harder to work with, even if you’re familiar with it.

It has however historically made me very unhappy. I used to think this was because I hate messy code.

Fast-forward to Hypothesis however. The internals are full of kludges. They’re generally hidden behind relatively clean APIs and abstraction layers, but there’s a whole bunch of weird heuristics with arbitrary magic numbers in them and horrendous workarounds for obscure bugs in other people’s software (Edit: This one is now deleted! Thanks to Marius Gedminas for telling me about a better way of doing it).

I’m totally fine with this.

Some of this is doubtless because I wrote all these kludges, but it’s not like I didn’t write a lot of the kludges in the previous system! I have many failings and many virtues as a developer, but an inability to write terrible code is emphatically not either of them.

The real reason why I’m totally fine with these kludges is that I know how to delete them: Every single one of these kludges was introduced to make a test pass. Obviously the weird workarounds for bugs all have tests (what do you take me for?), but all the kludges for simplification or generation have tests too. There are tests for quality of minimized examples and tests for the probability of various events occurring. These two kinds of tuning are the major sources of kludges.
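
To give a flavour of what a test for the probability of an event might look like, here is a minimal sketch. It uses a toy generator rather than anything from Hypothesis, and every name in it (`generate_small_list`, `has_duplicates`, `estimate_probability`) as well as the 0.2 threshold is an assumption made for illustration.

```python
import random


def generate_small_list(rng):
    """Toy generator: up to 10 integers drawn from 0..5."""
    length = rng.randint(0, 10)
    return [rng.randint(0, 5) for _ in range(length)]


def has_duplicates(values):
    """The event of interest: some value appears more than once."""
    return len(set(values)) < len(values)


def estimate_probability(generate, event, runs=2000, seed=0):
    """Estimate how often `event` holds for values drawn from `generate`."""
    rng = random.Random(seed)  # fixed seed keeps the test repeatable
    hits = sum(1 for _ in range(runs) if event(generate(rng)))
    return hits / runs


def test_duplicates_are_common():
    # If a tweak to the generator made duplicates rare, this would fail.
    assert estimate_probability(generate_small_list, has_duplicates) >= 0.2


test_duplicates_are_common()
```

A kludge that exists only to keep a test like this passing can be swapped for anything else that also keeps it passing.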

And I’m pretty sure that this is what makes the difference: The problem with the previous kludges is that they could never go away. A lot of these systems were fairly severely under-tested – sometimes for good reasons (we didn’t have any files which were less than 5TB that could reproduce a problem), sometimes for code quality reasons (our pipeline was impossible to detangle), sometimes just as a general reflection of the culture of the company towards testing (are you saying we write bugs??).

This meant that the only arbiter for whether you could remove a lot of those kludges was “does it make things break/worse on the production system?”, and this meant that it was always vastly easier to leave the kludges in than it was to remove them.

With Hypothesis, and with other well tested systems, the answer to “Can I replace this terrible code with this better code?” is always “Sure, if the tests pass”, and that’s immensely liberating. A kludge is no longer a thing you’re stuck with, it’s code that you can make go away if it causes you problems and you come up with a better solution.

I’m sure there will always be kludges in Hypothesis, and I’m sure that many of the existing kludges will stay in it for years (I basically don’t see myself stopping supporting versions of Python with that importlib bug any time in the near future), but the knowledge that every individual kludge can be removed if I need to is very liberating, and it takes away a lot of the things about them that previously made me unhappy.

This entry was posted in Hypothesis, programming.

Surprise! Feminism.

So yesterday’s article on it being OK to write shitty open source has had thirty thousand views. Most of these came from it hanging out at the top of r/programming for nearly 24 hours.

Which… wow. Another four thousand (which it will probably manage today) and it will be twice as popular as my next most popular articles, which have respectively had a nationwide referendum and years of being a wikipedia link from a popular article to drive up their traffic.

Of those thirty thousand, I’d bet you decent money that twenty thousand minimum thought it was purely about software aphorisms like “release early, release often” and “worse is better” (it’s not about worse is better. Stop trying to make it about worse is better) and didn’t notice that it was not even subtly coded feminism.

“Wait, what? How was it feminism? You didn’t even mention gender!” says my old friend, the suspiciously convenient anonymous voice, currently acting as an expy for a whole bunch of dudes who would have been happy to do the job.

You are correct, suspiciously convenient anonymous voice. I did not mention gender, and in a perfectly egalitarian feminist utopia the piece could have stood on its own without any feminist undertones to speak of.

But we don’t live in that utopia.

As I mentioned, there are two things that you need in order to produce quality software for free:

  1. Time
  2. Money

Anyone want to take a guess what women on average have much less of? Anyone? Anyone?

Did you guess “Both of these things”? Well done!

In our society, a far greater burden of free labour is placed on women (Terminology note: I think everything I say about women in this piece also applies to people who are not women but are perceived as such). They are more likely to be expected to do child care, more likely to be expected to do house work, more likely to be expected to provide free emotional labour in the form of support and favours to others.

And, as we established, labour takes time. So all of this extra labour women are being expected to do cuts into the time they have to do other things, like open source.

But wait, there’s more! Women are also paid less (exact reports for how much differ, so I’m not going to mention a figure for people to derail the point by quibbling about). The freedom to take time off to work on a thing is a lot easier to have if you’re actually paid enough to be able to afford it.

It’s OK though. In order to make up for the lack of these two things, women do have a few things they have more of to offset it.

  1. Dudes who think they are entitled to their time for free
  2. Standards of quality they are expected to meet

…wait, that doesn’t make it better at all, does it?

The sad fact is that women with a public presence get given a much harder time than men do, and this transfers entirely to the open source world: People are extremely ready to police the quality of your work already. Being a woman turns this up to 11.

So with less time and less money to achieve quality in and higher standards they are likely to be expected to adhere to, is it any surprise that the percentage of women in open source is a lot lower than it is in tech overall (where it’s already bad)?

Open source culture in general, and this problem in particular, are not exclusively feminist issues. You absolutely can and do experience these problems as a man, and these problems could easily still exist in a fully gender equal or genderless society, but any understanding of how these problems manifest in and interact with the society we actually have will be incomplete without the structural analysis of privilege and its interaction with gender that feminism brings to the table.

If you want to read more about this subject, I recommend this great piece by Ashe Dryden: The ethics of unpaid labour and the OSS community. A lot of my thoughts and opinions around this were informed by it.


This entry was posted in Feminism, programming.