Changes to the Hypothesis example database

One of the most notable features of Hypothesis is the example database. Unlike classic QuickCheck, when Hypothesis finds an example it saves it in its database. The next time it needs an example of that type it tries the previously saved example before moving on to random generation.

The initial version of this was really limited though. The problem is that the saving happened at the falsify level: a failure in @given(int, int) would save the pair as an (int, int), and other attempts to falsify things with @given(int, int) would reuse it. However, if you asked for a @given((int, int), str), for example, it would not reuse the previous example.

The result is that you don’t get to share any interesting features between examples – only exact matches. This seems a shame.

This turns out to be fixable, and the fix is both fairly straightforward and quite interesting, in that it only really works because of the ridiculously dynamic nature of Hypothesis.

Conceptually all that happens is that whenever you save something you also save all the subcomponents, and whenever you generate something you have a chance of generating any previously saved examples for it. It’s easy to say, but the details are a little complicated. Maybe there’s a better way to do it, but the approach I ended up taking is very specific to Hypothesis.

The main component is that Hypothesis’s weird little home-brewed type classes are first class values in the language. Both the “types” that it maps from and the values it produces are just Python values, and can be manipulated as such.

We use this in two ways: First, Hypothesis strategies grew a decompose method. This iterates over (descriptor, value) pairs. So for example when saving (1, “foo”) as an (int, str), this would yield (int, 1) and then (str, “foo”), and we would then save those values for those types.

(There’s a slight wart here. Sometimes these values decompose as non-serializable types, so we need to check for that case and not save when that happens).
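As a rough illustration of the saving side, here is a minimal sketch. It is not the real Hypothesis code: the function names, the dict standing in for the database, and the use of JSON as the serializability check are all assumptions made for the example.

```python
import json


def decompose(descriptor, value):
    """Yield (descriptor, value) pairs for every subcomponent of a tuple-shaped value."""
    if isinstance(descriptor, tuple):
        for sub_descriptor, sub_value in zip(descriptor, value):
            yield sub_descriptor, sub_value
            # Nested tuples are decomposed further.
            yield from decompose(sub_descriptor, sub_value)


def is_serializable(value):
    """Assumption for this sketch: 'serializable' means json.dumps accepts it."""
    try:
        json.dumps(value)
    except (TypeError, ValueError):
        return False
    return True


def save(database, descriptor, value):
    """Save the value and all its subcomponents, skipping anything non-serializable."""
    pending = [(descriptor, value)]
    pending.extend(decompose(descriptor, value))
    for d, v in pending:
        if is_serializable(v):
            database.setdefault(d, []).append(v)


# A plain dict keyed by descriptor stands in for the real database.
database = {}
save(database, (int, str), (1, "foo"))
# The second component here cannot be serialized, so only the int is kept.
save(database, (int, object), (2, object()))
```

After the first call the dict holds the whole pair under (int, str) as well as 1 under int and "foo" under str. The second call only adds 2 under int.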

So that’s the saving, but what next? How do we get these values back out? This is particularly important because arguments to @given are generated as a single strategy, so on their own the saved pieces would never actually help: a @given(int) isn’t using the strategy for int, it’s using the strategy for ((int,), {}) (one positional arg, no keyword args).

The next step takes advantage of the fact that the strategies are first class values, with a little bit of an assist from the parametrized data generation. We introduce the idea of an ExampleAugmentedStrategy. This is a strategy that wraps another strategy with some examples that could have come from it. It’s essentially biasing a strategy towards some favoured examples. Sometimes it picks from those examples, sometimes it generates some fresh ones (which ones, and the probability of generating them, are decided by the parameter value).
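The shape of the idea can be sketched like this. Only the name ExampleAugmentedStrategy comes from the description above. The produce method, the IntStrategy stand-in, and the fixed example_probability (which takes the place of the parameter value) are assumptions for the sake of illustration.

```python
import random


class IntStrategy:
    """Hypothetical stand-in for a base strategy that generates fresh ints."""

    def produce(self, rng):
        return rng.randint(-100, 100)


class ExampleAugmentedStrategy:
    """Wrap a base strategy, biasing it towards some favoured examples."""

    def __init__(self, base, examples, example_probability=0.5):
        self.base = base
        self.examples = list(examples)
        # In the real thing this is decided by the parameter value;
        # a fixed probability keeps the sketch short.
        self.example_probability = example_probability

    def produce(self, rng):
        if self.examples and rng.random() < self.example_probability:
            # Reuse one of the previously saved examples.
            return rng.choice(self.examples)
        # Otherwise fall back to generating something fresh.
        return self.base.produce(rng)


always_reuse = ExampleAugmentedStrategy(IntStrategy(), [12345], example_probability=1.0)
never_reuse = ExampleAugmentedStrategy(IntStrategy(), [12345], example_probability=0.0)
reused = always_reuse.produce(random.Random(0))
fresh = never_reuse.produce(random.Random(0))
```

With the probability pinned to 1.0 the wrapper always returns a saved example, and with 0.0 it behaves exactly like the strategy it wraps.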

The strategy table is then modified to look up previous examples and if any are found return an augmented strategy.

Which brings us to the desired end point: The examples we previously extracted can now be reassembled into other examples, allowing Hypothesis to essentially learn pieces of the problem from other tests.
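To make the lookup and reassembly concrete, here is a small sketch under the same caveats as before. StrategyTable, define, and augment are hypothetical names, strategies are plain functions taking a random.Random, and the database is a dict keyed by descriptor.

```python
import random


def augment(base, examples, probability=0.5):
    """Bias a strategy (here just a function of an rng) towards saved examples."""
    def produce(rng):
        if rng.random() < probability:
            return rng.choice(examples)
        return base(rng)
    return produce


class StrategyTable:
    """Hypothetical table mapping descriptors to strategies."""

    def __init__(self, database):
        self.database = database
        self.base_strategies = {}

    def define(self, descriptor, strategy):
        self.base_strategies[descriptor] = strategy

    def strategy(self, descriptor):
        if isinstance(descriptor, tuple):
            # Tuple strategies are assembled from the strategies for their
            # elements, so saved examples for the pieces get reused here too.
            parts = [self.strategy(d) for d in descriptor]
            base = lambda rng: tuple(part(rng) for part in parts)
        else:
            base = self.base_strategies[descriptor]
        examples = self.database.get(descriptor, [])
        # Only wrap the strategy if the database has something to offer.
        return augment(base, examples) if examples else base


# Pretend an earlier test saved the int 42, e.g. as part of an (int, int) failure.
table = StrategyTable({int: [42]})
table.define(int, lambda rng: rng.randint(0, 10))
table.define(str, lambda rng: "x")

rng = random.Random(0)
pair_strategy = table.strategy((int, str))
draws = [pair_strategy(rng) for _ in range(200)]
```

The base int strategy here can only produce values from 0 to 10, so any pair in draws that starts with 42 got it from the database, even though nothing was ever saved for (int, str) itself.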

In theory this should allow the construction of much harder to find examples. In practice I haven’t seen all that much benefit from it yet, because I’m still figuring out what a good workflow for the database is (I don’t really have a way to save it between Travis runs, which makes it of limited use), but I think it’s starting to show a lot of promise.
