Hypothesis 1.7.1 is out

(Note: I’ve realised that this blog has a much higher number of interested readers than the mailing list does, so I’m going to start mirroring announcements here)

As of this past Monday, Hypothesis 1.7.1 (Codename: There is no Hypothesis 1.7.0) is out.

The main feature this release adds is Python 2.6 support. Thanks hugely to Jeff Meadows for doing most of the work for getting this in.

Other features:

  • Strategies now has a permutations() function which returns a strategy yielding permutations of values from a given collection.
  • If you have a flaky test, it will print the exception that it last saw before failing with Flaky, even if you do not have verbose reporting on.
  • A slightly experimental git merge script is available as “python -m hypothesis.tools.mergedbs”. Instructions on how to use it are in the docstring of that file.
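
As a rough illustration of the new permutations() strategy, here is a minimal sketch. It assumes a reasonably recent Hypothesis is installed and that permutations is importable from hypothesis.strategies; the test name, the VALUES list and the seen set are made up for this example.

```python
from itertools import permutations as all_permutations

from hypothesis import given
from hypothesis import strategies as st

VALUES = [1, 2, 3]   # hypothetical collection to permute
seen = set()         # records the permutations Hypothesis actually tried


@given(st.permutations(VALUES))
def test_permutation_is_a_reordering(perm):
    # Every generated value should contain exactly the original elements,
    # just in some (possibly different) order.
    assert sorted(perm) == sorted(VALUES)
    seen.add(tuple(perm))


# Calling the decorated function runs it against many generated permutations.
test_permutation_is_a_reordering()
```

The property here is deliberately trivial. In a real test you would pass the permutation to the code under test and assert that its result does not depend on ordering.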

This also contains two important counting-related bug fixes:

  • floats() with a negative min_value would not have worked correctly (worryingly, it would have just silently failed to run any examples). This is now fixed.
  • Tests using sampled_from would error if the number of sampled elements was smaller than min_satisfying_examples.

It also contains some changes to filtering that should improve performance and reliability in cases where you’re filtering by hard-to-satisfy conditions (although they could also hurt performance simply by virtue of enabling Hypothesis to find more examples and thus running your test more times!).

This should be a pretty safe upgrade, and given the counting bugs I would strongly encourage you to do so.

This entry was posted in Uncategorized.

Back-pedalling on Python 2.6

I wrote a while back that I wouldn’t support Python 2.6 unless someone paid me to. I still think this is an eminently reasonable position which I encourage most people to take.

Unfortunately I am not going to be taking it.

The problem is that I am in the course of trying to put together a business around Hypothesis, offering consulting and training around using it. This is a much easier sell to large companies, and it’s a much easier sell to companies who are already using it at least somewhat.

Unfortunately, not supporting Python 2.6 out of the box means that many large companies who might otherwise become customers cannot start using Hypothesis until they’ve paid me money, which is a much harder sell. This way, in order to get a company to start using Hypothesis, I only need to convince developers that it’s great, which is easy because it is.

If it were hard to support Python 2.6 I probably still wouldn’t have done so until there was some serious demand for it, but fortunately Jeff Meadows was extremely kind and did most of the hard work. It turned out to be less hard work than I thought – I think the last time I tried there was still a bunch of strange behaviour that was hard to port that I have since removed – but it was nevertheless a highly appreciated contribution.

So, at some point in the next few days we should be seeing a new minor version of Hypothesis with support for 2.6, as well as a few other minor bug fixes and new features.

If you’re just maintaining a small open source library in your spare time, I would still encourage you to not support 2.6 without a really strong financial incentive to do so, but I think for now it sadly makes sense for me to add the support for free.

This entry was posted in Hypothesis, Python.

Some stuff on example simplification in Hypothesis

One of the slightly thankless tasks I take on in Hypothesis (subtype: People only notice if I don’t do my job well) is example simplification. I spend a lot of time on example simplification – far more than I think anyone else who writes a Quickcheck bothers with.

What is example simplification? Well, when Hypothesis tells you that an example that breaks your test is [1, 2, 3] then it almost certainly did not start out with that. It probably started out with a list about 20 elements long with numbers like 349781419708474 in it, and then pruned it down to [1, 2, 3].

Example simplification is important because random data is very messy but there is usually a simple example that demonstrates the bug, and the latter is helpful but the former rarely is.
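
To make the idea concrete, here is a deliberately naive sketch of simplification for lists of non-negative integers. This is not how Hypothesis does it (its simplifier is considerably more sophisticated). The function names and the failing condition are invented for illustration. The approach is simply to keep deleting elements and shrinking values for as long as the test still fails.

```python
def simplify(example, is_failing):
    """Greedily shrink a failing list of non-negative ints.

    `is_failing` returns True when the (hypothetical) test fails on a list.
    """
    if not is_failing(example):
        raise ValueError("can only simplify an example that fails")
    current = list(example)
    improved = True
    while improved:
        improved = False
        # Pass 1: try deleting each element in turn.
        i = 0
        while i < len(current):
            candidate = current[:i] + current[i + 1:]
            if is_failing(candidate):
                current = candidate
                improved = True
            else:
                i += 1
        # Pass 2: try replacing each element with a smaller value.
        for i, value in enumerate(current):
            for smaller in (0, value // 2, value - 1):
                if 0 <= smaller < value:
                    candidate = current[:i] + [smaller] + current[i + 1:]
                    if is_failing(candidate):
                        current = candidate
                        improved = True
                        break
    return current


def contains_big_number(xs):
    # Stand-in for "the test fails": any element of at least 100 breaks it.
    return any(x >= 100 for x in xs)


messy = [349781419708474, 7, 12, 903, 55]
minimal = simplify(messy, contains_big_number)  # minimal == [100]
```

Even this crude version turns a noisy five-element list into the single value that matters. The hard part in practice is getting results this good quickly when each call to the test is slow.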

Simplification is of course not new to Hypothesis – it’s been in Quickcheck for 15 years – but Hypothesis’s example simplification is atypically sophisticated.

So, why so sophisticated? Why is the normal Quickcheck approach not good enough?

Well the official reasons are as follows:

  1. Hypothesis generates much more complex examples than Quickcheck. e.g. Quickcheck will generate a long list of small integers or a short list of large integers, but will never generate a long list of large integers.
  2. Tests in Hypothesis tend to be slower, due to a mix of Python performance and Python idioms.

The unofficial reasons are probably actually closer to being the true reasons:

  1. Including bad examples in documentation is embarrassing
  2. My test suites do a lot of example simplification so faster example simplification gives faster test suites gives a less stressed David
  3. I find combinatorial optimisation fun.

At any rate, the result is that I’ve spent a lot of time on this, so it seemed worth sharing some of the knowledge. I’ve done this in two places. The first is that I gave a talk at the London Big O user group. The slides are also available separately.

I was going to transcribe it, but actually I didn’t think it was a very good talk (I prepared it in a bit of a rush), so instead I’ve built an ipython notebook working through the concepts. This actually goes slightly further than the talk, as I figured out some things at the end that I need to merge back into Hypothesis proper.

This entry was posted in Hypothesis, programming, Python.

Thinking with the machine

Content note: Rambling and slightly incoherent.

When was the last time you got lost?

It used to be very easy for me to get lost. I have a terrible sense of direction. Now I have an excellent sense of direction. It’s a little black and glass oblong, fits in my pocket. I only really get lost if I’m out of power or data [edit: Or, as I discovered half an hour after writing this, driving and unable to access my phone].

I also tend not to forget commonly available information, because I can just Google for the information.

It used to be the case that calculating pi was a life’s work. At the conference this weekend people were complaining about how using Python it took seconds – sometimes even minutes – to get this sort of approximation.

If I want to think through a line of thought I can write it down and save it for later. Even raw, this is 50 times faster than doing so by hand, and offers unprecedented editing capabilities and is essentially free. This allows me to coherently put together more complicated thoughts than I would ever be able to do unaided.

In a very real sense I am vastly more intelligent than someone of even a hundred years ago, let alone a thousand. It’s not that I’m more intelligent in some biological sense (though due to advances in nutrition and healthcare this may be true too), but I am augmented by the world around me and the tools available to me in such a way as to greatly boost my natural capabilities.

There’s a joke in AI circles that artificial intelligence is whatever hasn’t been done yet [irony: Counter-example to my above claim about forgetting things. Googling for this phrase just turns up irrelevant stuff about AI risk. I had to resort to my other source of transactive memory]. If we have figured out how to make a computer do it then it’s just calculating. The same seems to hold true for natural intelligence – once upon a time, a good memory was considered the hallmark of intelligence. Now it’s just a thing you use your computer for.

So we’ve got the useful things the computers do and the actual intelligence that we leave to the humans.

But there is a middle way, and I think that that way is where the really exciting stuff lies.

If you looked at my examples above, you might have noticed that as the bird says, one of these things is not like the others.

When I navigate, the computer is doing the work. When I Google, there is art in asking the right question, but answering the question is all the computer’s work.

With writing in order to think through a problem though, the computer isn’t really doing the work. I am. The computer is lending me its capabilities – the ones that aren’t “really” intelligence, but that somehow when you add them to “real” intelligence you get something greater.

Computers are good at many things that we are not. In this case I am more or less using the computer as a working memory, because my working memory is pretty good but still bounded, while the computer’s is effectively infinite (in that it’s finite, but it’s so much larger than mine that I hit my limits in terms of how I can offload to it long before we hit its limits). The result is that the ecosystem of me plus the computer is something greater than the sum of our parts – I can use the computer’s strengths to remove my weaknesses, and the result is something that I could not have produced unaided, and the computer certainly couldn’t have produced.

This is also how I think of Hypothesis.

People talk about Quickcheck, or Hypothesis, finding a bug in their software. This is not correct. Hypothesis does not find bugs, people do. Hypothesis sure helps though.

There is software that finds bugs without you having to do anything other than run it. e.g. static analysis tools fall into this category. This is not what property based testing does. In property based testing you are still the one writing the tests, you are still the one finding the bugs, the computer is just there to help you out at the bits you’re bad at by doing the thing that computers do best: Repeating the same task over and over again really quickly.
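
A toy sketch of that division of labour, in plain Python rather than Hypothesis itself: the human writes the property, and the computer's contribution is to try it against lots of random inputs. Every name here (find_counterexample, random_int_list and the two properties) is hypothetical and only for illustration.

```python
import random


def find_counterexample(prop, generate, runs=1000, seed=0):
    """Run `prop` on `runs` generated inputs; return the first that fails."""
    rng = random.Random(seed)
    for _ in range(runs):
        example = generate(rng)
        if not prop(example):
            return example
    return None  # no failure found, which is evidence rather than proof


def random_int_list(rng):
    # The computer's job: produce lots of varied inputs, very quickly.
    return [rng.randint(-100, 100) for _ in range(rng.randint(0, 10))]


# The human's job: state what should always be true.
def reversing_twice_is_identity(xs):
    return list(reversed(list(reversed(xs)))) == xs


def last_element_is_largest(xs):
    # A deliberately wrong belief about lists, to show a failure being found.
    return not xs or xs[-1] == max(xs)


good = find_counterexample(reversing_twice_is_identity, random_int_list)
bad = find_counterexample(last_element_is_largest, random_int_list)
```

Here good comes back as None, while bad is some list whose last element is not its maximum. The loop only repeats a check; deciding which properties are worth checking is still the human's part.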

When I started this post I thought I was going to be introducing the concept of “transhumanist software tools”. Software tools that work by augmenting human intelligence in order to help us write better software. There are some tools that I think are unambiguously of this style: Property based testing, interactive theorem provers, IDEs (in particular autocomplete).

But I think this is a wrong label. In much the same way that there is no such thing as a functional programming language, I don’t think there’s any such thing as a transhumanist software tool. It’s too fuzzy a category. Is a REPL transhumanist? Is a type system? The answer is obvious: “Kinda?”.

There is such a thing as transhumanist software development though: Software development where we lean heavily on the computer, and think in terms of how we can not just make the computer work for us but also with us.

And I think there’s a lot of potential to explore here. Right now we assume any task is either intrinsically human or is “automation”, where we just want to replace the people doing it with a small shell script, and the middle ground is really under-explored.

Computers cannot write software (yet). But sometimes it feels like neither can humans. Perhaps together we can?

This entry was posted in Python, Uncategorized.

Hypothesis continues to teach me

I’ve learned a lot technically in my work so far on Hypothesis. It has both taught me interesting computer science things and, I think, caused me to level up a lot as a developer. It’s been a great, if occasionally frustrating, experience and I expect it will continue to be one for some time yet.

But that’s not what I’m learning about right now. As you’ve probably noticed, and I mentioned previously, I’ve not been doing a huge amount of development recently. There have been a couple patch releases for bug fixes and example quality but nothing very serious. I have some interesting work going on behind the scenes on finding multiple bugs with one test, but it’s probably a while off yet.

Because right now what I’m learning about because of Hypothesis is:

  • Public speaking
  • Marketing
  • Pricing and sales

You know, “fluffy stuff”.

I’m also learning how to basically suck it up and admit I want things. A combination of geek and English social failings makes it very hard for me to do that. So when I put out a new project or write a blog post there’s always this weird dance of “yeah I totally just did this for me. I guess you can retweet it if you like, maybe star it on github, but whatever I don’t really care” followed by staring obsessively at every notification about it.

With Hypothesis it’s different, because there’s no pretence. I want Hypothesis to be popular. It will make the world a better place, and potentially it will make me some money (or at least help me recoup the money I effectively burned by taking a sabbatical to make it).

And this is weird to me, because it’s basically forcing me out of my shell and making me develop the skills I’ve always shunned. Public speaking is something I assumed I would never be good at (turns out that I’m actually pretty OK at it. Maybe with some practice I’ll even be good). Sales and marketing have always been things where… I knew abstractly that they weren’t intrinsically evil, but they always felt dirty and I didn’t really want to have anything to do with them. This wasn’t my reasoned and held position so much as my subconscious biases at work, but those are if anything harder to go against.

With Hypothesis, I need to figure out how to promote it if I want people to use it, and I do want people to use it, so I’m forced into a sales and marketing position. Moreover, talking about it to new groups is one of the best things I can do to promote it, so this in turn forces me into public speaking.

Moreover, it’s fairly unambiguously a good thing for me to ask for money for it. I know I’ve done great work in Hypothesis, and I want to continue doing great work in Hypothesis, but in order to do that I also need to eat, have a place to live, etc.

Moreover it’s clearly a bad thing for me to undercharge! As well as value of labour, etc. etc. it’s a bad thing simply because I’m mostly not charging for the open source development part, so if I’m undercharging that means I have to do more work that isn’t that in order to make decent money, which will in turn mean that less work that benefits everybody gets done.

Not undercharging turns out to be hard. I’ve had multiple conversations with friends to the tune of “I was thinking of charging £X?” “Um. No. It would be cheap at £2X.” “I guess I could charge £Y?” “MORE MONEY” “OK OK how about £Z?” “Yeah I guess you could start there and raise your prices later”. I understand where these numbers come from, and my friends are right and I am wrong, but that’s sure not how it feels.

Ultimately this is proving to be an… interesting experience. It’s super uncomfortable, as I’m having to go against all my social instincts and unlearn a lot of bad habits, but I think it will be a good thing for me, and hopefully it will be a good thing for Hypothesis too.

This entry was posted in Python, Uncategorized.