Author Archives: david

Hypothesis 1.10.0 is out

The general consensus when I asked about this, which was conveniently what I was already planning to do, was to save Hypothesis 2.0 for a big release where I wanted to make some significant changes and was happy to cut the deprecated APIs and such at that point. So for now we’re continuing on with the 1.x series.

This release doesn’t actually have any new externally visible features. I’ve introduced a new ‘cleanup’ function, which lets you register cleanup handlers that execute at the end of your tests (this is more useful inside strategies than inside tests, but you can use it in either), but its semantics when interacting with find are a bit weird, so I’m not currently declaring it part of the public API.
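As a rough model of those semantics (a sketch only – the real BuildContext machinery is internal, and the names here are hypothetical, assuming handlers run last-in-first-out like unittest’s addCleanup):

```python
class CleanupStack:
    """Toy model of a cleanup registry: handlers registered while a test
    body runs are executed in reverse order once the body finishes,
    even if the body raises."""

    def __init__(self):
        self._handlers = []

    def cleanup(self, handler):
        # Register a zero-argument callable to run after the test body.
        self._handlers.append(handler)

    def run(self, test_body):
        try:
            test_body(self)
        finally:
            # LIFO order, mirroring how nested resources should unwind.
            while self._handlers:
                self._handlers.pop()()
```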

Other than that, this is mostly a bug fix and performance release. It was supposed to be a small release which just contained the performance improvements to one_of that I blogged about previously, but it got held up by a pypy bug (I believe this is the first time that’s happened – previous bugs have been easy to diagnose and work around) and then snowballed into a bit more significant of a release.

Also, reminder that I have a Patreon now, and that if you like this release schedule some support for it would be appreciated.

This is just a bugfix and performance release, but it changes some semi-public APIs, hence the minor version bump.

  • Significant performance improvements for strategies which are one_of() many branches. In particular this included recursive() strategies. This should take the case where you use one recursive() strategy as the base strategy of another from unusably slow (tens of seconds per generated example) to reasonably fast.
  • Better handling of just() and sampled_from() for values which have an incorrect __repr__ implementation that returns non-ASCII unicode on Python 2.
  • Better performance for flatmap from changing the internal morpher API to be significantly less general purpose.
  • Introduce a new semi-public BuildContext/cleanup API. This allows strategies to register cleanup activities that should run once the example is complete. Note that this will interact somewhat weirdly with find.
  • Better simplification behaviour for streaming strategies.
  • Don’t error on lambdas which use destructuring arguments in Python 2.
  • Add some better reprs for a few strategies that were missing good ones.
  • The Random instances provided by randoms() are now copyable.
  • Slightly more debugging information about simplify when using a debug verbosity level.
  • Support using given for functions with varargs, but not passing arguments to it as positional.
This entry was posted in Hypothesis, Python.

New Patreon page for work on Hypothesis

After a bunch of people suggested it, I’ve created a Patreon to fund ongoing work on Hypothesis. Rewards include receiving in-depth weekly updates about Hypothesis – news, advice on how to use it, etc. That probably means there will be a bit less of such things here.

I don’t think this will be enough to keep me in food and drink, let alone rent, and I still have ongoing plans for commercial development of things built on top of Hypothesis. This is purely to fund the development of the open source core library, and to give me enough money that I can feasibly keep going until I’ve got a more solid plan on that front.

So, if you like my work on Hypothesis (or even on here), I’d really appreciate some funding. I’ve put a lot of work into this, and it sure would be nice to see something back for that.

This entry was posted in Hypothesis, Python.

Massive performance boosts with judicious applications of laziness

I noticed about two hours after releasing that there’s a massive performance problem with the recursive data implementation I shipped in Hypothesis 1.9.0. Obviously this made me rather sad.

You have to do something slightly weird in order to hit this: Use recursive data as the base case for another recursive data implementation (actually using it in the expansion would probably work too).

The reason for this turns out to be nothing like what I expected and is kinda interesting, and I was really stuck as to how to solve it until I realised that with a little bit of lazy evaluation the problem was literally trivial to solve.

We’ll need to look at the implementation a bit to see what was happening and how to fix it.

Internally, Hypothesis’s recursive data implementation is just one big union of strategies. recursive(A, f) is basically just A | f(A) | f(A | f(A)) | … until you’ve added enough clauses that any subsequent one will basically never get in under the limit.
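The shape of that expansion is easier to see symbolically. Here’s a tiny sketch (plain strings standing in for strategies, nothing from Hypothesis’s actual internals) showing how each new clause is the expansion function applied to the union of everything so far:

```python
def unroll(base, expand, depth):
    """Return the union branches of recursive(base, expand) after
    `depth` expansions: base, f(base), f(base | f(base)), ..."""
    branches = [base]
    for _ in range(depth):
        # Each new clause applies f to the union of all clauses so far.
        union = "(" + " | ".join(branches) + ")" if len(branches) > 1 else branches[0]
        branches.append(expand(union))
    return branches
```

For example, `unroll("A", lambda s: "f(%s)" % s, 2)` produces the branches `A`, `f(A)`, and `f((A | f(A)))` – exactly the union described above.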

In order to understand why this is a problem, you need to understand a bit about how Hypothesis generates data. It happens in two stages: first we draw a parameter value at random, and then we pass that parameter value back to the strategy and draw a value of the type we actually want (in reality it’s more complicated than that, but we’ll ignore that here). This approach gives us higher quality data and lets us shape the distribution better.

Parameters can be literally any object at all. There are no valid operations on a parameter except to pass it back to the strategy you got it from.
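A toy illustration of the two-stage scheme (this is not Hypothesis’s real strategy class – the names `draw_parameter` and `draw_value` are just stand-ins for the idea):

```python
import random


class IntListStrategy:
    """Toy two-stage strategy: the parameter fixes an average list
    length, and values are then drawn conditional on that parameter,
    so one parameter draw shapes a whole batch of values."""

    def draw_parameter(self, rng):
        # An opaque-to-the-caller value: here, a target average length.
        return rng.expovariate(0.1)

    def draw_value(self, parameter, rng):
        # The value draw is conditioned on the parameter.
        n = int(rng.expovariate(1.0 / (parameter + 1)))
        return [rng.randint(-100, 100) for _ in range(n)]


rng = random.Random(0)
strategy = IntListStrategy()
parameter = strategy.draw_parameter(rng)
value = strategy.draw_value(parameter, rng)
```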

So what does the parameter for a strategy of the form x | y | … look like?

Well, it looks like a weighting amongst the branches plus a parameter for each of the individual values. You pick a branch, then you feed the parameter you have for that branch to the underlying strategy.

Notably, drawing this parameter requires drawing a parameter from each of the underlying strategies. i.e. it’s O(n) in the number of branches.

Which means that if you have something like the recursive case above, you’re doing O(n) operations which are each themselves O(n), and you’re accidentally quadratic. Moreover it turns out that the constant factor on this may be really bad.

But there turns out to be an easy fix here: Almost all of those O(n^2) leaf parameters we’re producing are literally never used – you only ever need the parameter for the strategy you’re calling.

Which means we can fix this problem with lazy evaluation. Instead of storing a parameter for each branch, we store a deferred calculation that will produce a parameter on need. Then when we select a branch, we force that calculation to be evaluated (and save the result in case we need it again) and use that. If we never need a particular parameter, we never evaluate it.

And this means we’re still doing O(n) work when we draw the parameter for a branch, but only O(1) work per individual element until we actually need that element’s value. In the recursion case we’re also saving work when we do evaluate it. This greatly reduces the total work: we’re now doing only as much work as we needed anyway to draw the template, which more or less removes this case as a performance bottleneck. It’s still a little slower than I’d like, but it’s back in the category of “Hypothesis is probably less likely to be the bottleneck than typical tests are”.

In retrospect this is probably obvious – it falls into the category of “the fastest code is the code that doesn’t execute” – but it wasn’t obvious to me up front until I thought of it in the right way, so I thought I’d share in case this helps anyone else.

This entry was posted in Hypothesis, Python.

Hypothesis 1.9.0 is out

Codename: The great bundling

This is my favourite type of release: One which means I stop having to look embarrassed in response to common questions.

Here’s the changelog entry:

Codename: The great bundling.

This release contains two fairly major changes.

The first is the deprecation of the hypothesis-extra mechanism. From now on, all the packages that were previously bundled under it, other than hypothesis-pytest (which is a different beast and will remain separate), are part of Hypothesis itself. The functionality remains unchanged and you can still import them from exactly the same location; they just are no longer separate packages.

The second is that this introduces a new way of building strategies which lets you build up strategies recursively from other strategies.

It also contains the minor change that calling .example() on a strategy object will give you examples that are more representative of the actual data you’ll get. There used to be some logic in there to make the examples artificially simple but this proved to be a bad idea.

“How do I do recursive data?” has always been a question which I’ve had to look embarrassed about whenever anyone asked me. Because Hypothesis has a, uh, slightly unique attitude to data generation I’ve not been able to use the standard techniques that people use to make this work in Quickcheck, so this was a weak point where Hypothesis was simply worse than the alternatives.

I got away with it because Python is terrible at recursion anyway so people mostly don’t use recursive data. But it was still a bit of an embarrassment.

Part of the problem here is that you could do recursive data well enough using the internal API (not well, admittedly; it’s definitely a bit of a weak point even there), but the internal API is not part of the public API and is decidedly harder to work with than the main public API.

The solution I ended up settling on is to provide a new function that lets you build up recursive data by specifying a base case and an expansion function and then getting what is just a fixed point combinator over strategies. In the traditional Hypothesis manner it uses a bizarre mix of side effects and exceptions internally to expose a lovely clean functional API which doesn’t let you see any of that.
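In use, that base-case-plus-expansion shape looks like this (a sketch using today’s hypothesis.strategies names, which may differ from what shipped in 1.9.0):

```python
from hypothesis import strategies as st

# JSON-like data: the base case is scalar values, and the expansion
# function wraps any strategy for children in lists or string-keyed
# dicts of those children.
json_values = st.recursive(
    st.none() | st.booleans() | st.floats(allow_nan=False) | st.text(),
    lambda children: st.lists(children)
    | st.dictionaries(st.text(), children),
)
```

The fixed point means `children` inside the lambda already generates arbitrarily nested values, so the definition reads almost exactly like the recursive grammar it describes.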

The other embarrassing aspect of Hypothesis is how all the extra packages work. There are all these additional packages which have to be upgraded in lockstep with Hypothesis and behave like second class citizens – e.g. there have never been good changelogs and release announcements for them. It’s a common problem that people fail to upgrade one and get confusing error messages when things try to load and don’t work.

This release merges these all into Hypothesis core and leaves installing their dependencies up to you. You can install those dependencies using a setuptools extra, so e.g. installing hypothesis[django] will install Hypothesis + compatible versions of all the dependencies – but there’s no checking of versions etc. when you use them. I may add that checking if it turns out to be a major problem that people try to use these with the wrong version of dependencies, but I also may not. We’ll see.

This entry was posted in Hypothesis, Python.

A terrible/genius idea

In a conversation with Mark Jason Dominus and Pozorvlak on Twitter about GHC compile error messages I realised there was a common pattern of problem:

Compile errors are confusing, but they are confusing in predictable ways: the same error message that is completely unintuitive to a human tends to be emitted for the same underlying pattern of mistake.

Writing good error messages is hard, but this is the sort of thing which can be debugged better by a computer than a human. A simple Bayesian text classifier could probably map these error messages to diagnostic suggestions extremely well, and sometimes all you need is those suggestions to put you on the right path.

Moreover we can crowdsource gathering the data.

Here is a plan: a simple wrapper script which you can alias to any program you like, and which executes that program, passing all the arguments through.

If it ever gets a non-zero exit code it looks at the output of stderr and attempts to run it through a classifier. It then says “Hey, I think this could be one of these problems”.

If soon after it sees you run the program again with exactly the same arguments and it now gets a success, it says “Great, you fixed it! Was it one of the errors we suggested? If not, could you provide a short diagnostic?” and submits your answer back to a central service. The service then regularly builds machine learning models (one per base command) which it ships back to you on demand / semi-regularly.
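The execute-and-classify half of that plan is only a few lines. Here’s a sketch (the classifier and the reporting service don’t exist – `classify` below is a hard-coded stub standing in for the trained model):

```python
import subprocess
import sys


def classify(stderr_text):
    """Stub for the per-command classifier: map error output to a list
    of candidate diagnoses. A real version would be a trained model."""
    if "cannot construct the infinite type" in stderr_text:
        return ["You may have swapped the arguments to a function."]
    return []


def main(argv):
    # Run the wrapped program, passing all arguments through.
    result = subprocess.run(argv, capture_output=True, text=True)
    sys.stdout.write(result.stdout)
    sys.stderr.write(result.stderr)
    if result.returncode != 0:
        # On failure, offer whatever diagnoses the classifier suggests.
        for suggestion in classify(result.stderr):
            print("Hint: %s" % suggestion, file=sys.stderr)
    return result.returncode


if __name__ == "__main__":
    sys.exit(main(sys.argv[1:]))
```

The remaining pieces – detecting the follow-up successful run, asking for a diagnosis, and shipping data back to the central service – would sit on top of this loop.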

You need some additional functionality to prevent well-poisoning and similar, but I think the basic concept wouldn’t be too hard to get something up and running with.

I’m not going to do it though – I’ve more than enough to do, and this doesn’t actually help with any problems I currently have regularly. If anyone wants to take the idea and run with it, please do so with my blessing.

This entry was posted in Uncategorized.