So I’ve just released Hypothesis 0.4. It implements the example database I discussed in The Pain of Randomized Testing. Every time Hypothesis finds a failure it will save it for later use, and every property you test in Hypothesis will use the saved database to reproduce any failures it’s seen before (and sometimes find new ones, because it allows test cases to cross between different properties). This is pretty cool.
It has been almost exactly two weeks since I wrote my short term roadmap. In that time I’ve knocked out two of the features I wanted to get done: The example database and the improved data generation. I’ve also seriously improved the quality of the testing of Hypothesis itself and found a whole bunch of bugs as a result. So things are going pretty well.
But I’m away next week, and I’m basically giving myself until the end of February to get 1.0 out. So it’s time to get serious about a concrete plan for the roadmap.
So the following are what I’m declaring as requirements for 1.0:
- A stable API
- Solid documentation
- The ability to get reports out of a run – number of examples tried, number rejected, etc.
- Coverage driven example discovery
I could maaaaybe drop one of the latter two, but not both, and even that I’m very reluctant to do. The first two are hard requirements. If it doesn’t have a stable, well documented API then it’s not 1.0. This isn’t one of those clownshoes open source hack projects where the specified API is whatever the code happens to implement and the documentation is the source code.
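To make the reporting requirement a bit more concrete, here's one hypothetical shape such a run report could take. The names and structure here are entirely my own invention, not Hypothesis's API: the idea is just that a run hands back counts of examples tried and rejected, plus any falsifying example found.

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class RunReport:
    """Illustrative run summary; field names are invented, not Hypothesis's."""
    examples_tried: int = 0
    examples_rejected: int = 0  # examples that failed a precondition
    falsifying_example: Any = None


def run_property(prop, generate, precondition=lambda x: True, runs=200):
    """Run a property and return a report rather than a bare pass/fail."""
    report = RunReport()
    for _ in range(runs):
        example = generate()
        report.examples_tried += 1
        if not precondition(example):
            report.examples_rejected += 1
            continue
        if not prop(example):
            report.falsifying_example = example
            break
    return report
```

Having the rejection count visible matters because a property whose precondition throws away most of its examples is silently testing much less than it appears to.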
From the original list I’m explicitly taking testmachine off the table for now. I want to do it, and I will do it, but I’m officially relegating it to the giant pile of “Oooh, wouldn’t it be neat if- gah, no David, you can do that after 1.0” along with “generating strings matching a regular expression”, “automatically generating Django model object graphs” and “driving Selenium from Hypothesis” and a whole bunch of other ideas. It will most likely be the 1.1 flagship feature.
A top level driver for Hypothesis will probably shake out of the wash as part of the reporting work. We’ll see how that goes. It’s neither a requirement nor an explicit not-for-release.
This leaves what should be an achievable amount of work. I have prototypes demonstrating the feasibility of the coverage driven example discovery, so based on past experience it will probably take me about two hours to actually get it hooked in and most of a week to get it solid enough that I’m happy with it. Reports should be a bit less than that.
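I can't speak for what the actual prototypes do, but as a rough illustration of what coverage driven example discovery can mean, here's an AFL-style sketch under my own assumptions: examples that reach previously unseen branches go into a corpus, and generation is biased toward mutating corpus members. The instrumentation convention (the property returning a set of coverage labels) and all the names are invented for this sketch.

```python
import random


def coverage_guided(prop, generate, mutate, runs=2000, seed=0):
    """Search for a falsifying example, preferring mutations of examples
    that reached new coverage. `prop` returns (passed, labels_hit)."""
    rng = random.Random(seed)
    seen = set()   # every coverage label observed so far
    corpus = []    # examples that discovered a new label
    for _ in range(runs):
        if corpus and rng.random() < 0.5:
            example = mutate(rng.choice(corpus), rng)
        else:
            example = generate(rng)
        passed, labels = prop(example)
        if not passed:
            return example  # falsifying example found
        if labels - seen:   # this example reached a new branch: keep it
            seen |= labels
            corpus.append(example)
    return None
```

The payoff is that failures hiding behind several nested conditions get found far more reliably, because each intermediate branch an example reaches makes its neighbourhood more likely to be explored.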
Which gives me about two weeks to stabilise the API and write documentation. I think I can do that. I’ve been doing some of it as I go, and will do more of it as I implement the other two. The main difficulty I have here is basically deciding how much of Hypothesis’s internal weirdness should be exposed as API and how much I should hide behind a more pythonic surface. I suspect I’m going to go full weird and just tidy it up so that it’s as nice to use as possible.
So that’s the plan. I’m not getting everything I wanted to get done, but if I had then the end result wouldn’t be nearly as solid. I’m opting to strike a balance between production worthy and interestingly novel, and I think this is pretty much the right one.
And it’s a pretty exciting place to be. Hypothesis is teetering on the edge of this already, but if all this goes according to plan I’m basically prepared to state that by the end of February randomized property-based testing will have three categories: Quviq’s Quickcheck, Hypothesis, and everything else. Because the first is proprietary (even the documentation!) I don’t really know what it’s capable of, but Hypothesis’s capabilities are going to be significantly beyond anything that seems to exist in the open source world.