The era of rapid Hypothesis development is coming to an end

Don’t panic. This is not an announcement of my abandoning the project. Hypothesis still has a long way to go, and I 100% intend to be working on getting it there.

What this is an announcement of is my continued existence in a market economy and my tragic need to acquire currency in order to convert it into food and accommodation.

I haven’t been making a big deal of it, so some of you might be surprised to learn that the reason Hypothesis development has been so rapid for the last 6 months is that I’ve been working on it full time unpaid. It’s not so much that I took time off to write Hypothesis as that I had the time off anyway and I thought I’d do something useful with it. Hypothesis is that something useful.

I would love to continue working on Hypothesis full time. But the whole “unpaid” thing is starting to become not viable, and will become critically non-viable as soon as I move back to London.

So I’m going to need money.

I will do something more organised in the next month, but for now if you are a company or individual interested in paying me to do any of the following, I would very much like to hear from you:

  • Sponsored Hypothesis development (this can include paying for implementing specific features if you want)
  • Integration work getting Hypothesis to work well with your testing environment
  • Training courses on how to use Hypothesis
  • Anything else Hypothesis related

If the above sounds interesting, please email me at [email protected].

If no money to continue working on Hypothesis is forthcoming, Hypothesis development will absolutely continue, but at a greatly reduced rate. The current development cycle is approximately a minor version a week. This will likely go down to at most a minor version every month, more likely a minor version every two. This would be a shame, as I have a bunch of exciting features I still want to work on, and then I need to tie everything together into a coherent 2.0 release. With full time work I would project that to happen by the end of this year; without it, I can’t really make any predictions at the moment.

This entry was posted in Hypothesis on by .

New discussion groups for randomized testing

One of the problems I had in creating Hypothesis was not really being able to ask “How do I do this?”. As a result I had to reinvent a lot myself. Some of the results of doing so are interesting, but a lot of them just took five times as long as they needed to, reinventing something that was probably pretty well known.

Now I’m in the opposite boat. I do understand this subject pretty well at this point, and would be happy to help people out in building their own, but nobody knows to ask me unless they happen to have run into Hypothesis. And even now there’s a lot to improve on and it would be great to compare notes with other people.

So I’ve created a place to do that. There’s now a randomized testing Google group, as well as a corresponding IRC channel at ##randomized-testing on freenode. They’re designed for anyone who is interested in the use and implementation of randomized testing: anything from esoteric discussion of implementation strategies in shrinking, to discussions about how to integrate randomization into your CI workflow, to how to market randomized testing to people who aren’t convinced. Or anything else really.

If this is a subject that interests you, do come on by. They’re small at the moment, but hopefully shall grow.


Constraint based fixtures with Hypothesis

A common experience when writing database backed web applications (which I hear is a thing some people like to do) is that rather than laboriously setting up each example in each test you use a set of fixtures – standard project definitions, standard users, etc.

Typically these start off small but grow increasingly unwieldy over time, as new tests occasionally require some additional detail and it’s easier to add little things to an existing fixture than it is to create one afresh.

And then what happens is that it becomes increasingly unclear which bits of those fixtures actually matter and which of them are just there because some other test happened to need them.

You can use tools like factory_boy to make this easier, but ultimately it’s still just making the above process easier – you still have the same problems, but it’s less work to get there.

What if instead of having these complicated fixtures your tests could just ask for what they want and be given it?

As well as its use in testing, Hypothesis has the ability to find values satisfying some predicate. And this can be used to create fixtures that are constraint based instead of example based. That is: You don’t ask for the foo_project fixture, you instead say “I want a project whose owner is on the free plan”, and Hypothesis gives it to you:

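from hypothesis import find
from hypothesis.extra.django.models import models

from mymodels import Project, User


def test_add_users_to_free_project():
    # Ask for any project whose owner is on the free plan.
    project = find(
        models(Project, owner=models(User)),
        lambda x: x.owner.plan == "free")
    # ... the rest of the test then uses project like any other fixture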

And that’s basically it. You write fixture based tests as you normally would, only you can be as explicit as you like as to what features you want from the fixtures rather than just using what happens to be around.
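To show the idea without Django, here is a minimal sketch that uses only the standard library. Everything in it is a stand-in invented for illustration: the `User` and `Project` dataclasses, `random_project` and `find_fixture` are hypothetical names, and `find_fixture` is a naive generate-and-filter loop, not how Hypothesis's `find` is implemented (among other things, it makes no attempt to return a simple example).

```python
import random
from dataclasses import dataclass


@dataclass
class User:
    name: str
    plan: str


@dataclass
class Project:
    title: str
    owner: User


def random_project(rng):
    """Build an arbitrary project; hypothetical generator for this sketch."""
    owner = User(
        name=f"user{rng.randrange(1000)}",
        plan=rng.choice(["free", "pro", "enterprise"]),
    )
    return Project(title=f"project{rng.randrange(1000)}", owner=owner)


def find_fixture(generate, predicate, seed=0, max_tries=1000):
    """Return the first generated value that satisfies the predicate."""
    rng = random.Random(seed)  # fixed seed so the fixture is reproducible
    for _ in range(max_tries):
        candidate = generate(rng)
        if predicate(candidate):
            return candidate
    raise LookupError("no example satisfying the constraint was found")


# The test states the constraint it cares about and nothing else.
project = find_fixture(random_project, lambda p: p.owner.plan == "free")
print(project)
```

The point of the design is that the constraint lives in the test itself, so a reader can see which properties of the fixture actually matter.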

It’s unclear to me whether this is ever an improvement on using Hypothesis as it is intended to be used. I feel like it might work better in cases where the assumptions are relatively hard to satisfy, and it’s probably better for things where the test is really slow. What is the case is that it’s a lot less alien to people coming from a classical unit testing background than Hypothesis’s style of property testing is. That makes it a convenient gateway usage mode for people who want to get their feet wet in this sort of testing world without having to fundamentally change the way that they think in order to test trivial features.

There are a bunch of things I can do to make this sort of thing better if it proves popular, but all of the above works today. If it sounds appealing, give it a try. Let me know how it works out for you.

Edit to add: I’ve realised there are some problems with using current Hypothesis with Django like this unfortunately. Specifically if you have unique constraints on models you’re constructing this will not work right now. This concept works fine for normal data, and if there’s interest I’m pretty sure I can make it work in Django, but it needs some further thought.


If you want Python 2.6 support, pay me

This post is both about the actual plan for Hypothesis and also what I think you should do as a maintainer of a Python library.

Hypothesis supports Python 2.7 and will almost certainly continue doing so until Python 2.7 hits end of life in 2020.

Hypothesis does not support Python 2.6.

Could Hypothesis support Python 2.6? Almost certainly. It would be a bunch of work, but probably no more than a few weeks, maybe a month if things are worse than I expect. It would also slow down future development because I’d have to maintain compatibility with it, but not to an unbearable degree given that I’ve already got the machinery in place for multiple version compatibility.

I’m not going to do this though. I’m doing enough free labour as it is without supporting a version of Python that is only still viable because a company is charging for commercial support for it.

I’m sorry, I misspoke. What I meant to say is that I’m not going to do this for free.

If you were to pay me, say, £15,000 for development costs, I would be happy to commit to providing 2.6 support in a released version of Hypothesis, followed by one year in which there is a version that supports Python 2.6 and is getting active bug fixes (this would probably always be the latest version, but if I hit a blocker I might end up dropping 2.6 from the latest and providing patch releases for a previous minor version).

People who are still using Python 2.6 are generally large companies who are already paying for commercial support, so I think it’s perfectly reasonable to demand this of them.

And I think everyone developing open source Python who is considering supporting Python 2.6 should do this too. Multi version library development is hard enough as it is without supporting 2.6. Why should you work for free on something that you are really quite justified in asking for payment for?


Speeding up Hypothesis simplification by warming up

This post might be a bit too much of Hypothesis inside baseball, but it’s something I did this morning that I found pretty interesting so I thought I’d share it.

Hypothesis has a pretty sophisticated system for example simplification. A simplification pass looks roughly like:

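def simplify_such_that(search_strategy, random, value, condition):
    changed = True
    while changed:
        changed = False
        for simplify in search_strategy.simplifiers(random, value):
            # Keep applying this pass until it stops finding improvements.
            while True:
                for s in simplify(random, value):
                    if condition(s):
                        changed = True
                        value = s
                        break  # restart this pass from the new, simpler value
                else:
                    break  # no candidate worked, so move on to the next pass
    return value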
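To make that loop concrete, here is a toy version that can be run as is. The two passes (`delete_elements` and `shrink_elements`) and the `ToyListStrategy` class are made up for this sketch and are not Hypothesis's real simplifiers; the loop is written so that each pass restarts after a success and gives way to the next pass once none of its candidates satisfy the condition.

```python
def delete_elements(random, xs):
    """Pass 1: propose copies of xs with one element removed."""
    for i in range(len(xs)):
        yield xs[:i] + xs[i + 1:]


def shrink_elements(random, xs):
    """Pass 2: propose copies of xs with one element reduced by one."""
    for i, x in enumerate(xs):
        if x > 0:
            yield xs[:i] + [x - 1] + xs[i + 1:]


class ToyListStrategy:
    """Hypothetical stand-in for a search strategy over lists of ints."""

    def simplifiers(self, random, value):
        return [delete_elements, shrink_elements]


def simplify_such_that(search_strategy, random, value, condition):
    changed = True
    while changed:
        changed = False
        for simplify in search_strategy.simplifiers(random, value):
            while True:
                for s in simplify(random, value):
                    if condition(s):
                        changed = True
                        value = s
                        break  # retry the same pass on the simpler value
                else:
                    break  # the pass is exhausted; try the next one
    return value


# Find a simple list whose sum is at least 10, starting from a messy one.
# The toy passes do not use randomness, so None is passed for random.
result = simplify_such_that(
    ToyListStrategy(), None, [7, 3, 9, 12, 5], lambda xs: sum(xs) >= 10
)
print(result)  # [10]
```

Here deletion runs first and gets the list down to a single element, and only then does the element-shrinking pass reduce that element to the smallest value that still satisfies the condition.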

Instead of having a single simplify function we split our simplification up into multiple passes. Each pass is repeatedly applied until it stops working. Then we move onto the next one. If any pass succeeded at simplifying the value we start again from the beginning, else we’re done and return the current best value.

This works well for a number of reasons, but the principal goal of it is to avoid spending time on steps that we’re pretty sure are going to be useless: If one pass deletes elements of a list and another simplifies them, we don’t want to try deleting again every time we successfully shrink a value because usually we’ve already found the smallest list possible. (The reason we start again from the beginning if any pass worked is that sometimes later passes can unblock earlier ones, but generally speaking this doesn’t happen much.)

This generally works pretty well, but there’s an edge case where it has pathologically bad performance, which is when we have a pass which is useless but has a lot of possible values. This happens e.g. when you have large lists of complicated values. One of my example quality tests involves finding lists of strings with high unicode codepoints and long strictly increasing subsequences. This normally works pretty well, but I was finding it hit this pathological case sometimes and the test would fail because it used up all its time trying simplification passes that wouldn’t work because they were blocked by the previous step.

This morning I figured out a trick which seems to improve this behaviour a lot. The idea is to spend a few passes (5 is my highly scientifically derived number here) where we cut each simplifier off early: We give it a few attempts to improve matters and if it doesn’t we bail immediately rather than running to the end. The new code looks roughly like this:

from itertools import islice

max_warmups = 5


def simplify_such_that(search_strategy, random, value, condition):
    changed = True
    warmup = 0
    while changed or warmup < max_warmups:
        warmup += 1
        changed = False
        for simplify in search_strategy.simplifiers(random, value):
            while True:
                simpler = simplify(random, value)
                if warmup < max_warmups:
                    # During warmup only look at the first few candidates.
                    simpler = islice(simpler, warmup)
                for s in simpler:
                    if condition(s):
                        changed = True
                        value = s
                        break  # restart this pass from the new, simpler value
                else:
                    break  # nothing worked (or we were cut off): next pass
    return value

This seems to avoid the pathological case because rather than getting stuck on a useless simplifier we simply skip over it fairly quickly and give other simplifiers that are more likely to work a chance to shine. Then once the warmup phase is over we get to do the full simplification algorithm as before, but because we’ve already chewed it down to something much less complicated than we started with there isn’t as much of a problem: we tend not to have individually long simplifier passes because most of the really complex structure has already been thrown away.

Empirically, for the test cases this was designed to improve this is more than an order of magnitude speed improvement (even when they’re not hitting the pathological case where they fail altogether), going from giving up after hitting the timeout at 30 seconds to completing the full pass in 3, and for everything else it merely seems about the same or a little better.

So, yeah, I’m pretty pleased with this. This is definitely in the category of “Hypothesis’s example simplification is an overly sophisticated solution to problems that are mostly self-inflicted”, but who doesn’t like nice examples?
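As a rough illustration of why cutting passes off early helps, the sketch below counts how often the condition is evaluated with and without warmup. It is a contrived toy, not a benchmark of Hypothesis: `swap_pairs`, `delete_elements`, `ToyStrategy` and `run` are invented for the example, `max_warmups` is passed as a parameter rather than a module-level constant, and setting it to 0 reproduces the plain loop. The swap pass is deliberately useless here and produces a quadratic number of candidates.

```python
from itertools import islice


def swap_pairs(random, xs):
    """A useless pass with many candidates: swap every pair of elements."""
    for i in range(len(xs)):
        for j in range(i + 1, len(xs)):
            ys = list(xs)
            ys[i], ys[j] = ys[j], ys[i]
            yield ys


def delete_elements(random, xs):
    """A useful pass: propose copies of xs with one element removed."""
    for i in range(len(xs)):
        yield xs[:i] + xs[i + 1:]


class ToyStrategy:
    def simplifiers(self, random, value):
        return [swap_pairs, delete_elements]


def simplify_such_that(search_strategy, random, value, condition, max_warmups):
    changed = True
    warmup = 0
    while changed or warmup < max_warmups:
        warmup += 1
        changed = False
        for simplify in search_strategy.simplifiers(random, value):
            while True:
                simpler = simplify(random, value)
                if warmup < max_warmups:
                    simpler = islice(simpler, warmup)  # cut the pass short
                for s in simpler:
                    if condition(s):
                        changed = True
                        value = s
                        break
                else:
                    break
    return value


def run(max_warmups):
    """Simplify list(range(50)) and report how often the condition ran."""
    calls = 0

    def condition(xs):
        nonlocal calls
        calls += 1
        # Strictly increasing list with at least three elements.
        return len(xs) >= 3 and all(a < b for a, b in zip(xs, xs[1:]))

    result = simplify_such_that(
        ToyStrategy(), None, list(range(50)), condition, max_warmups
    )
    return result, calls


plain_result, plain_calls = run(max_warmups=0)  # behaves like the old loop
warm_result, warm_calls = run(max_warmups=5)
print(plain_result, plain_calls)
print(warm_result, warm_calls)
```

Both runs end at the same three-element list, but the plain run first grinds through every swap of the 50-element list (1281 condition calls in total), while the warmed-up run tries a single swap, moves on to deletion, and only runs the swap pass in full once the list is tiny (71 calls).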
