Author Archives: david

Revising some thoughts on test driven development

Epistemic status: Still thinking this through. This is a collection of thoughts, not an advocacy piece.

I’ve previously been pretty against TDD. It is possible that this has always been based on a straw understanding of what TDD is supposed to be for, but if so that is a straw understanding shared by a number of people who have tried to sell me on it.

I am currently moving towards a more nuanced position of “I still don’t think TDD is especially useful in most cases but there are some cases where it’s really amazingly helpful”.

Part of the source of my dislike of TDD has, I think, come from underlying philosophical differences. A test suite has two major purposes:

  1. It helps you prevent bugs in your code
  2. It acts as executable documentation for your code

As far as I am concerned, the important one is the first. People who think their test suite is a good substitute for documentation are wrong. Your code is not self-documenting. If you haven’t written actual for reals documentation, your code is not documented no matter how good your test suite is.

And my belief has always been, and remains, that TDD is actively harmful when you are using a test suite for the first purpose. Good testing is adversarial, and the number one obstacle to good testing (other than “not testing in the first place”) is encoding the same assumptions in your tests as in your code. TDD couples writing the tests and the code so closely that you can’t help but encode the same assumptions in them, even if it forces you to think about those assumptions more clearly.

I am aware of the counter-argument that TDD is good because it ensures your code is better tested than it otherwise would be. I consider this to be true but irrelevant, because mandating 100% coverage has the same property but forces you to maintain a significantly higher standard of testing.

So if TDD is harmful for the purpose of testing that matters, it must be at best useless and at worst harmful, right?

As far as I’m concerned, right. If your goal is a well tested code base, TDD is not a tool that I believe will help you get there. Use coverage instead.

But it turns out that there are benefits to TDD that have absolutely nothing to do with testing. If you think of TDD as a tool of thought for design which has absolutely nothing to do with testing whatsoever then it can be quite useful. You then still have to ensure your code is well tested, but as long as you don’t pretend that TDD gets you there, there’s nothing stopping you from using it along the way.

Using tests to drive the design of your API lets you treat the computer as an external brain, and provides you with a tool of thought that forces you to think about how people will use your code and design it accordingly.

The way I arrived at finally realising this is via two related design tools I have recently been finding very useful:

  1. Writing documentation and fixing the bits that embarrassed you when you had to explain them
  2. Making liberal use of aspirational examples: starting a design from “Wouldn’t it be great if this worked?” and seeing if you can make it work.

TDD turns out to be a great way of combining both of these things in an executable (and thus checkable) format:

  1. The role of tests as executable documentation may not actually be a valid substitute for documentation, but it happily fills the same niche in terms of making you embarrassed when your API is terrible by forcing you to look at how people will use it.
  2. A test is literally an executable aspirational example. You start from “Wouldn’t it be great if this test passed?” and then write code to make the test pass.

When designing new segments of API where I’ve got the details roughly together in my head but am not quite clear on the specifics of how they should all fit together or how this should work, I’ve found using tests for this can be very clarifying, and this results in a workflow that looks close to, but not exactly like, classic TDD.

The workflow in question is as follows:

As per classic TDD, work is centered around features. For example, if I was designing a database API, the following might be features:

  1. I can connect to the database
  2. I can close a connection to the database
  3. I can create tables in the database
  4. I can insert data into the database
  5. I can select data from the database

Most of these are likely to be a single function. Some of them are probably two or three. The point is that as with classic TDD you’re focusing on features not functions. I think this is a bit more coarse grained than advocated by TDD, but I’ve no idea how TDD as she is spoken differs from TDD as described.

Working on an individual feature involves the following:

  1. Start from code. All types and functions you think you’ll need for this stage are defined. Functions should all raise some error. InvalidArgument or similar is a good one, but any fatal condition you can reasonably expect to happen when calling that function is fine. If there is really no possible way a function could raise an exception, return some default value like 0, “” or None.
  2. Write lots of tests, not just one, for all the cases where those functions should be failing. Most of these tests should pass because e.g. they’re asserting that your function raises an invalid argument when you don’t specify a database to connect to. Your function considers all arguments to be invalid, so this test is fine!
  3. For any tests for error conditions that do not currently pass, modify the code to make them pass. This may require you to flesh out some of your types so as to have actual data.
  4. Now write some tests that shouldn’t error. Again, cover a reasonable range of cases. The goal is to sketch out a whole bunch of example uses of your API.
  5. Now develop until those tests pass. Any edge cases you spot along the way should immediately get their own test.
  6. Now take a long hard look at the tests for which bits of the API usage are annoying and clunky. Improve the API until it does not embarrass you. This may and probably will require you to revise earlier stages as well and that’s fine.
  7. If you’re feeling virtuous (I’m often not and leave this to the end) run coverage now and add more tests until you reach 100%. You may find this requires you to change the API and return to step 5.
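As a rough sketch of what steps 1 and 2 might look like for the “I can connect to the database” feature: the names `connect`, `Connection` and `InvalidArgument`, and the small `raises` helper, are all assumptions made up for this example rather than anything from a real library.

```python
class InvalidArgument(Exception):
    """Hypothetical error for arguments a function cannot accept."""


class Connection:
    """Placeholder type; it gets fleshed out in later steps."""


def connect(database=None):
    # Step 1: the stub considers every argument to be invalid.
    raise InvalidArgument("cannot connect to %r" % (database,))


def raises(exc_type, fn, *args, **kwargs):
    """Return True if calling fn(*args, **kwargs) raises exc_type."""
    try:
        fn(*args, **kwargs)
    except exc_type:
        return True
    return False


# Step 2: error-condition tests. These already pass against the stub,
# so from here on they act as constraints that must keep passing.
def test_connect_requires_a_database():
    assert raises(InvalidArgument, connect)


def test_connect_rejects_empty_name():
    assert raises(InvalidArgument, connect, "")
```

Step 4 would then add a test for a call that should succeed, which fails against this stub until step 5 gives `connect` a real body, while the two tests above have to stay green throughout.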

Apply this to each stage in turn, then apply a final pass of steps 6 and 7 to the thing as a whole.

This isn’t very different from a classic TDD workflow. I think the units are more coarsely grained, and the emphasis on testing error conditions first means that you tend to start with passing tests that act as constraints (they should stay passing) rather than failing tests that act as drivers (you write code to make them pass). But it’s, say, no more than a standard deviation out from what I would expect normal TDD practice to look like.

The emphasis on error conditions is a personal idiosyncrasy. Where by personal idiosyncrasy I mean that I am entirely correct, everyone else is entirely wrong, and for the love of god people, think about your error conditions, please. Starting from a point of “Where should this break?” forces you to think about the edge cases in your design up front, and acts as something of a counterbalance to imagining only the virtuous path and missing bugs that happen when people stray slightly off it as a result.

So far this approach has proven quite helpful in the cases where I’ve used it. I’m definitely not using this for all development, and I wouldn’t want to, but it’s been quite helpful where I need to design a new API from scratch and the requirements are vague enough that it helps to have a tool to help me think them through.

This entry was posted in Hypothesis, Uncategorized.

The war cannot be won, yet still we must fight our little battles

I’ve only half jokingly referred to 2015 as the year I declare war on the software industry.

You could make a David and Goliath analogy here, but the thing is that instead of my felling the giant with my plucky shepherd’s weapon, what’s actually going to happen is that Goliath is going to put his hand on my forehead, hold me out at arm’s length, and laugh as I struggle ineffectually against his vastly superior strength and size.

But maybe, just maybe, if I struggle hard enough and push with all my might, I can get Goliath to take a single step back.

Hypothesis was my opening volley, as an attempt to raise the benchmark for quality – if I make it easy enough to find your bugs, maybe you’ll fix them?

As an opening volley, it’s a pretty weak one. Most of the reason why software is bad isn’t because it’s too hard to write tests, it’s because of social reasons – people are so conditioned to bad software at this point that it’s just not that much of a problem to release broken software into the world because people will use it anyway.

But maybe if we make it easier for the people who care to write quality software on time and on budget, we can start to change the norms. If you can choose between two equally shiny and feature-full pieces of software except that one actually works properly, perhaps you’ll start to care more about software quality, and if your customers start to desert you for the software that actually works, maybe you’ll really start to care about software quality.

Hypothesis alone will never achieve this, but each tool gives Goliath a nudge in the right direction.

Then, tired of the burden of free labour we put on people as an industry, I wrote it’s OK for your open source software to be a bit shitty.

Will it change minds? Maybe a few. The responses on the corresponding reddit thread were really a lot higher quality than I would generally expect of reddit. It certainly seemed to help a whole bunch of people who were concerned about the quality of their own open source work, and hopefully it has given people a weapon to defend themselves when dealing with people who feel entitled to their free labour.

Then I wrote the Two Day Manifesto, an attempt to attack the problem from the other end. If the problem is that companies are built on free labour, maybe they could contribute some paid labour back?

Probably not, but again maybe we can nudge Goliath in the right direction.

Because ultimately all of these efforts are mostly there in the hope that I’ll find just the right way to push the giant, or that I will manage to push at the same time as enough other people, and that maybe he will take just a single step back.

And then someone else, or more likely some other group, will make him take another step back.

And over time he will retreat to the horizon, and we will follow, still pushing.

The horizon retreats ever into the distance, but we can look over our shoulders and see how much territory we’ve reclaimed from the giant, and we will redouble our efforts and push further.

And maybe, maybe, one day we will circle the world, and come back to where we started, and he will meet our forces on the other side and discover he no longer has anywhere to go.

But that day is far in the future. We will never see it, nor will the next generation. We will not win this war, and perhaps we never will.

But we’re still going to push.

This entry was posted in Hypothesis, Uncategorized.

Surprise! Feminism.

So yesterday’s article on it being OK to write shitty open source has had thirty thousand views. Most of these came from it hanging out at the top of r/programming for nearly 24 hours.

Which… wow. Another four thousand (which it will probably manage today) and it will be twice as popular as my next most popular articles, which have respectively had a nationwide referendum and years of being a Wikipedia link from a popular article to drive up their traffic.

Of those thirty thousand, I’d bet you decent money that twenty thousand minimum thought it was purely about software aphorisms like “release early, release often” and “worse is better” (it’s not about worse is better. Stop trying to make it about worse is better) and didn’t notice that it was not even subtly coded feminism.

“Wait, what? How was it feminism? You didn’t even mention gender!” says my old friend, the suspiciously convenient anonymous voice, currently acting as an expy for a whole bunch of dudes who would have been happy to do the job.

You are correct, suspiciously convenient anonymous voice. I did not mention gender, and in a perfectly egalitarian feminist utopia the piece could have stood on its own without any feminist undertones to speak of.

But we don’t live in that utopia.

As I mentioned, there are two things that you need in order to produce quality software for free:

  1. Time
  2. Money

Anyone want to take a guess what women on average have much less of? Anyone? Anyone?

Did you guess “Both of these things”? Well done!

In our society, a far greater burden of free labour is placed on women (Terminology note: I think everything I say about women in this piece also applies to people who are not women but are perceived as such). They are more likely to be expected to do child care, more likely to be expected to do house work, more likely to be expected to provide free emotional labour in the form of support and favours to others.

And, as we established, labour takes time. So all of this extra labour women are being expected to do cuts into the time they have to do other things, like open source.

But wait, there’s more! Women are also paid less (exact reports for how much differ, so I’m not going to mention a figure for people to derail the point by quibbling about). The freedom to take time off to work on a thing is a lot easier to have if you’re actually paid enough to be able to afford it.

It’s OK though. In order to make up for the lack of these two things, women do have a few things they have more of to offset it.

  1. Dudes who think they are entitled to their time for free
  2. Standards of quality they are expected to meet

…wait, that doesn’t make it better at all, does it?

The sad fact is that women with a public presence get given a much harder time than men do, and this transfers entirely to the open source world: People are extremely ready to police the quality of your work already. Being a woman turns this up to 11.

So with less time and less money in which to achieve quality, and higher standards that they are likely to be held to, is it any surprise that the percentage of women in open source is a lot lower than it is in tech overall (where it’s already bad)?

Open source culture in general, and this problem in particular, are not exclusively feminist issues. You absolutely can and do experience these problems as a man, and these problems could easily still exist in a fully gender equal or genderless society, but any understanding of how these problems manifest in and interact with the society we actually have will be incomplete without the structural analysis of privilege and its interaction with gender that feminism brings to the table.

If you want to read more about this subject, I recommend this great piece by Ashe Dryden: The ethics of unpaid labour and the OSS community. A lot of my thoughts and opinions around this were informed by it.

This entry was posted in Feminism, programming.

Interviewing: Test, don’t sample

Do you ask for a code sample when interviewing someone?

Don’t. It’s a terrible idea. It creates stress and doesn’t give you any useful answers.

Seeing code they’ve written is obviously good and useful, but the way to do this is not to ask for a sample, it’s to set them a small task (something that shouldn’t take more than an hour or two) and ask them to code a solution to it.

Sure, for some people it will take more time, but for most this will be less stressful and for you it will be infinitely more useful.

Why it is less stressful:

  1. It puts everyone on an equal footing. Some people can’t give you a recent coding sample because everything they’ve written recently is under NDA.
  2. They are not trying to guess what you’re looking for because you’ve said what you are looking for. They don’t need to guess whether you’d prefer something cute and clever or boring but well tested. They don’t need to spend ages sorting through a bunch of code they’ve written trying to figure out what will best fit your subjective requirements.

Why it is better for you:

  1. Less stressed candidates produce more representative answers.
  2. You have more control over what you are testing for, and can refine this over time.
  3. Any question where you can’t compare the answer between candidates is a waste of your time and theirs because it’s so subjective and poorly calibrated that you might as well just toss a coin. You can compare coding tests, you can’t compare coding samples.

Code samples: Bad for the candidate, and worse for you. Just say no.

This entry was posted in Hiring, programming.

It’s OK for your open source library to be a bit shitty

(Content note: I normally try to keep my natural level of profanity slightly under control on this blog. I won’t be doing that in this post)

The major reason I wrote Hypothesis is to destroy shitty software. Everything is terrible, and I want it to be less terrible.

But that random code you threw together as a hack, stopped working on once it did what you needed it to do, threw up on PyPI and then neglected?

That’s totally OK. Thanks for writing it. The world is slightly better for your having done so, and there is no burden of expectation on you to “do a better job”. It’s not a job after all, is it? We’re not paying you to do it.

And this is what it ultimately comes down to.

I flatter myself that one of the things that I can legitimately claim about Hypothesis is that it is high quality. So far the worst bug that anyone has reported in the 1.0 release is that when given a wrong argument, Hypothesis throws an exception with the right type from the right place with the wrong error message. Hypothesis works on OS X, Linux, Windows, probably *BSD (I heard about some packaging issues with sqlite, but nothing since, so I think it worked once those were resolved), Python 2.7, 3.2-3.4, CPython, PyPy… It’s got 100% branch coverage, is documented, etc. Basically as far as quality goes Hypothesis does almost everything right.

So I’ve proved it can be done and that you should do it too, right?

Nah, that’s bullshit. I’m here to tell you as someone who has done the work of producing quality software for free, you don’t have to and you shouldn’t feel bad about not doing so.

Want to know how I did everything right in Hypothesis? I mean, obsessive attention to detail and high standards helped, and may even have been essential, but they weren’t even close to sufficient. Really there are two things that were the key ingredients to my making Hypothesis the quality piece of software it is today:

  1. Time
  2. Money

I’ve put somewhere in the region of 800 hours of work into Hypothesis this year, entirely for free. That’s what it took to get to this level of quality.

And I could only do this because I had the time and money to do so. I had the time to do so because I was being obsessive, had no dependents, and didn’t have a job. I could only not have a job because of the money. I only had the money because I spent the latter half of last year with double the salary I was used to, half the living expenses I was used to, and too borderline depressed to spend it on anything interesting.

These are not reasonable requirements.

Could I have done Hypothesis in less than 800 hours? Probably. I doubt I could have done it in less than 400 though, and I would be foolish to expect I could do any project in the smallest amount of time I could feasibly do it in.

Hypothesis is a large and complicated project though (if it doesn’t look like one, that’s because a lot of those 800 hours were spent on making it easy to use). Most projects are probably an order of magnitude simpler.

i.e. only 80 hours.

i.e. only you having to take two weeks off work, working literally for free, in order to produce quality software.

i.e. nearly half your holiday allowance if you live in a civilized country, or possibly more than your holiday allowance if you live in the US.

This is still not a reasonable requirement.

Can you produce quality software in less time than that, working only in your free time? I doubt it. Free time is inherently less productive. You’re tired and it’s fragmented. You spend an hour one evening trying to figure out why your Windows builds are now failing because a new version of pip was released, and after that I guess you could put another half hour in, but your heart isn’t really in it and you basically don’t have the time to get properly stuck into it. The bar for quality is high, and the obstacles to it are higher, and there’s really nothing you can do to fix that other than to put in the time.

Don’t get me wrong. If you can put in the time I will be incredibly grateful to you. I just don’t think you should feel bad if you don’t.

There is no obligation to free labour. Every hour you put in working on your project for free is a gift to the world. If the world comes back to you and says “You are a bad person for not supporting this thing I need you to support” then fuck them. If they want that they should pay you for it, or do it themselves.

(Edit to note: This isn’t of course to say that you shouldn’t ask for features on open source projects. Only that you are not entitled to them. If you politely ask, that’s fine. If the author then says “Sorry, no, I don’t have the time / am not interested / literally any other reason at all”, that’s fine too)

Note: If you liked this piece, there is a follow up you may wish to read.

This entry was posted in programming.