Conjecture, parametrization and data distribution

Up front warning: This is a very inside baseball post, and I’m the only person who plays this particular variant of the game. This blog post is mostly a mix of notes to self and sharing my working.

I’m in the process of trying to rewrite the Hypothesis backend to use the Conjecture approach.

At this point the thing I originally worried was intractable – shrinking of data – is basically solved. Conjecture shrinks as well as or better than Hypothesis. There are still a few quirks to pay attention to – the shrinking can always be improved, and I’m still on the fence as to whether some of the work I have done with explicit costing and output-based shrink control is useful (I think it’s probably not) – but basically I could ship what I have today for shrinking and it would be fine.

However I’m discovering another problem: The other major innovative area of Hypothesis is its parametrized approach to data generation. More generally, I’m finding that getting great quality initial data out of Conjecture is hard.

This manifests in two major ways:

  1. It can be difficult to get good data when you also have good shrinking because you want to try nasty distributions. e.g. just generating 8 bytes and converting it to an IEEE 754 binary float representation produces great shrinking, but a fairly sub-par distribution – e.g. the probability of generating NaN is 1 in 2048 (actually very slightly lower).
  2. The big important feature of Hypothesis’s parametrization is correlated output. e.g. you can’t feasibly generate a list of 100 positive integers by chance if you’re generating each element independently. Correlated output is good for finding bugs.
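The NaN figure in point 1 is easy to check: a double is NaN exactly when all 11 exponent bits are set and the 52-bit mantissa is nonzero, which (counting both signs) gives 2·(2⁵² − 1) bit patterns out of 2⁶⁴ – just under 1 in 2048. A quick sanity check:

```python
import math
import struct

# Count the IEEE 754 binary64 bit patterns that decode to NaN:
# any sign bit, exponent field all ones, mantissa nonzero.
nan_patterns = 2 * (2 ** 52 - 1)
probability = nan_patterns / 2 ** 64

assert probability < 1 / 2048  # "actually very slightly lower"

# Spot check: 0x7ff8000000000000 is a quiet NaN.
assert math.isnan(struct.unpack("!d", b"\x7f\xf8" + b"\x00" * 6)[0])
```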

1 is relatively easily solved by letting data generators participate in the initial distribution: Instead of having the signature draw_bytes(self, n) you have the signature draw_bytes(self, n, distribution=uniform). This lets the floating point generator specify an alternative distribution that is good at hitting special-case floating point numbers, without worrying about how that affects shrinking. Then, you run the tests in two modes: The first where you’re building the data as you go and use the provided distributions, the second where you’re drawing from a pre-allocated block of data and ignore the distribution entirely.
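A minimal sketch of what that two-mode design might look like (all names here are hypothetical – this is illustrative, not Conjecture’s actual code):

```python
import random
import struct

def uniform(rnd, n):
    # Default distribution: n uniformly random bytes.
    return bytes(rnd.getrandbits(8) for _ in range(n))

class TestData:
    """Either generates fresh bytes (using the provided distribution)
    or replays a pre-allocated buffer (ignoring it entirely)."""

    def __init__(self, buffer=None, rnd=None):
        self.buffer = buffer
        self.rnd = rnd
        self.index = 0

    def draw_bytes(self, n, distribution=uniform):
        if self.buffer is not None:
            # Replay/shrink mode: the distribution is irrelevant, so
            # shrinking stays a simple bytewise operation.
            result = self.buffer[self.index:self.index + n]
            self.index += n
            return result
        return distribution(self.rnd, n)

def nasty_floats(rnd, n):
    # Alternative distribution biased toward special-case floats.
    if rnd.random() < 0.5:
        special = [0.0, -0.0, float("inf"), float("-inf"), float("nan")]
        return struct.pack("!d", rnd.choice(special))
    return uniform(rnd, n)

def draw_float(data):
    return struct.unpack("!d", data.draw_bytes(8, nasty_floats))[0]
```

In generation mode draw_float hits NaN and the infinities about half the time; in replay mode the same code just reads 8 plain bytes, so the shrinker never needs to know the distribution existed.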

This is a bit low-level unfortunately, but I think it’s mostly a very low level problem. I’m still hoping for a better solution. Watch this space.

For the second part… I think I can just steal Hypothesis’s solution to some degree. Instead of the current case where strategies expose a single function draw_value(self, data) they can now expose functions draw_parameter(self, data) and draw_value(self, data, parameter). A normal draw call then just does strategy.draw_value(data, strategy.draw_parameter(data)), but you can use alternate calls to induce correlation.
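Sketched in Python (using a plain Random where the real thing would thread the test data object through; the strategy and its parameter shape are made up for illustration):

```python
import random

class IntegerStrategy:
    """A strategy with the split API: a parameter is drawn once, then
    feeds every value drawn with it."""

    def draw_parameter(self, rnd):
        # Parameter: a (center, spread) pair biasing all later draws.
        return (rnd.randint(-100, 100), rnd.randint(1, 10))

    def draw_value(self, rnd, parameter):
        center, spread = parameter
        return rnd.randint(center - spread, center + spread)

def draw_list(rnd, strategy, size):
    # The "normal" draw would pair a fresh parameter with each value;
    # reusing one parameter across elements induces correlation, so
    # e.g. a list of 100 all-positive integers becomes easy to hit.
    parameter = strategy.draw_parameter(rnd)
    return [strategy.draw_value(rnd, parameter) for _ in range(size)]
```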

There are a couple problems with this:

  1. It significantly complicates the usage pattern: I think the parametrization is one of the bits of Hypothesis that people who look at the internals understand least, and one of the selling points of Conjecture was “You just write functions”. On the other hand I’m increasingly not sold on “You just write functions” as a good thing: A lot of the value of Hypothesis is the strategies library, and having a slightly more structured data type there is quite useful. It’s still easy to go from a function from test data to a value to a strategy, so this isn’t a major loss.
  2. It’s much less language agnostic. In statically typed languages you need some way to encode different strategies having different parameter types, ideally without this being exposed in the strategy (because then strategies don’t form a monad, or even an applicative). You can solve this problem a bit by making parameters an opaque identifier and keeping track of them in some sort of state dictionary on the strategy, but that’s a bit gross.
  3. Much more care with parameter design is needed than in Hypothesis, because the parameter affects the shrinking. As long as shrinking of the parameter works sensibly this should be OK, but it can become much more complicated. An example of where this gets complicated appears later in this post.
  4. I currently have no good ideas about how parameters should work for flatmap, and only some bad ones. This isn’t a major problem because you can fall back to a slightly worse distribution, but it’s annoying because Conjecture previously had the property that the monadic and applicative interfaces were equivalently good.

Here’s an example of where parametrization can be a bit tricky:

Suppose you have the strategy one_of(s1, …, sn) – that is, you have n strategies and you want to pick a random one and then draw from that.

One natural way to parametrize this is as follows: Pick a random non-empty subset of {1, …, n}. Those are the enabled alternatives. Now pick a parameter for each of these options. Drawing a value is then picking a random one of the enabled alternatives and feeding it its parameter.

There are a couple major problems with this, but the main one is that it shrinks terribly.

First off: The general approach to shrinking directions Hypothesis takes for alternation is that earlier branches are preserved. e.g. if I do integers() | text() we’ll prefer integers. If I do text() | integers() we’ll prefer text. This generally works quite well. Conjecture’s preference for things that consume less data slightly ruins this (e.g. The integer 1 will always be preferred to the string “antidisestablishmentarianism” regardless of the order), but not to an intolerable degree, and it would be nice to preserve this property.

More generally, we don’t want a bad initial parameter draw to screw things up for us. So for example if we have just(None) | something_really_complicated() and we happen to draw a parameter which only allows the second, but it turns out this value doesn’t matter at all, we really want to be able to simplify to None.

So what we need is a parameter that shrinks in a way that makes it more permissive. The way to do this is to:

  1. Draw n bits.
  2. Invert those n bits.
  3. If the result is zero, try again.
  4. Else, return a parameter that allows all set bits.

The reason for this is that the initially drawn n bits will shrink towards zero, so as you shrink, the parameter will have more set bits.
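As a sketch (assuming, as in Conjecture, that the underlying drawn bits shrink toward zero):

```python
import random

def invert(bits, n):
    # Flip the low n bits: as `bits` shrinks toward 0, the inverted
    # mask gains set bits, i.e. the parameter gets more permissive.
    return ~bits & ((1 << n) - 1)

def draw_enabled(rnd, n):
    """Draw a non-empty set of enabled alternatives out of n."""
    while True:
        mask = invert(rnd.getrandbits(n), n)
        if mask:
            return {i for i in range(n) if mask & (1 << i)}
```

The fully shrunk draw is bits == 0, whose inversion enables every alternative – exactly the “more permissive as you shrink” behaviour we want.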

This then presents two further problems that need solving.

The next problem is that if we pick options through choice(enabled_parameters) then the option chosen for a given draw will change as we enable more things. This may sometimes work, but in general will require difficult-to-manage simultaneous shrinks to work well. We want to be able to shrink the parameter and the elements independently if at all possible.

So what we do is rejection sampling: We generate a random number from one to n, then if the corresponding bit is set we accept it; if not, we start again. If the number of set bits is very low this can be horrendously inefficient, but we can short-circuit that problem by using the control over the distribution of bytes suggested above!

The nice thing about doing it this way is that we can mark the intermediate draws as deletable, so they get discarded and if you pay no attention to the instrumentation behind the curtain it looks like our rejection sampling magically always draws the right thing on its first draw. We can then try bytewise shrinking of the parameter, which leads to a more permissive set of options (that could then later allow us to shrink this), and the previously chosen option remains stable.
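A sketch of that rejection loop (the deletable-draw bookkeeping is elided; here rejected draws are simply thrown away):

```python
import random

def choose_alternative(rnd, n, enabled):
    """Pick an index in range(n) that is in `enabled`, by rejection.
    In Conjecture the rejected draws would be marked deletable, so
    after discarding them it looks as if the first draw always hit."""
    while True:
        i = rnd.randrange(n)
        if i in enabled:
            return i
```

Because the accepted draw is just “an index that happened to be enabled”, making the parameter more permissive doesn’t disturb it: the previously chosen option remains stable.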

This then leads to the final problem: If we draw all the parameters up front, adding in more bits will cause us to read more data, because we’ll have to draw parameters for the newly enabled alternatives. This is forbidden: Conjecture requires shrinks to read no more data than the example you started from (for good reason – this both helps guarantee the termination of the shrink process and keeps you in areas where shrinking is fast).

The solution here is to generate parameters lazily. When you pick alternative i, you first check if you’ve already generated a parameter for it. If you have you use that, if not you generate a new one there and then. This keeps the number and location of generated parameters relatively stable.
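Sketched below (the strategy here is a stand-in whose parameter is just a random constant, to make the caching visible):

```python
import random

class ConstStrategy:
    # Toy strategy: the parameter is a random constant, and every
    # value drawn with that parameter is that constant.
    def draw_parameter(self, rnd):
        return rnd.randint(0, 10 ** 9)

    def draw_value(self, rnd, parameter):
        return parameter

class LazyAlternatives:
    """Only draw a parameter for alternative i the first time i is
    actually picked, so enabling more alternatives during a shrink
    doesn't force extra reads for branches never taken."""

    def __init__(self, rnd, strategies):
        self.rnd = rnd
        self.strategies = strategies
        self.parameters = {}

    def draw(self, i):
        if i not in self.parameters:
            self.parameters[i] = self.strategies[i].draw_parameter(self.rnd)
        return self.strategies[i].draw_value(self.rnd, self.parameters[i])
```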

In writing this, a natural generalization occurred to me. It’s a little weird, but it nicely solves this problem in a way that also generalizes to monadic bind:

  1. Parameters are generated from data.new_parameter(). Under the hood this is just an integer counter.
  2. There is a function data.parameter_value(parameter, strategy) which does the same lazy calculation keyed off the parameter ID: If we already have a parameter value for this ID and strategy, use that. If we don’t, draw a new one and store that.
  3. Before drawing from it, all strategies are interned. That is, replaced with an equivalent strategy we’ve previously seen in this test run. This means that if you have something like booleans().flatmap(lambda b: lists(just(b))), both lists(just(False)) and lists(just(True)) will be replaced with stable strategies from a pool when drawing. This means that parameters get reused.
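A sketch of that API (the names and the interning key are my own invention, not settled design):

```python
import random

class ConstStrategy:
    # Toy strategy, as before: the parameter is a random constant.
    def draw_parameter(self, rnd):
        return rnd.randint(0, 10 ** 9)

class TestData:
    def __init__(self, rnd):
        self.rnd = rnd
        self.counter = 0   # parameters are just integer IDs
        self.values = {}   # (parameter ID, strategy) -> parameter value
        self.pool = {}     # structural key -> canonical strategy

    def new_parameter(self):
        self.counter += 1
        return self.counter

    def intern(self, key, strategy):
        # Replace a freshly built strategy (e.g. one produced inside a
        # flatmap) with an equivalent one from earlier in this run.
        return self.pool.setdefault(key, strategy)

    def parameter_value(self, parameter, strategy):
        key = (parameter, id(strategy))
        if key not in self.values:
            self.values[key] = strategy.draw_parameter(self.rnd)
        return self.values[key]
```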

I think this might be a good idea. It’s actually a better API, because it becomes much harder to use the wrong parameter value, and there’s no worry about leaking values or state on strategy objects, because the life cycle is fairly sharply confined to that of the test. It doesn’t solve the problem of typing this well, but it solves the problem of using it incorrectly well enough that an unsafe cast is probably fine if you can’t express the types.

Anyway, brain dump over. I’m not sure this made sense to anyone but me, but it helped me think through the problems quite a lot.

This entry was posted in Hypothesis on by .

Services that won’t buzz off

I’ve long maintained that one of the best things about Beeminder is that it doesn’t go away just because you can’t be bothered. You can’t ignore it, and you can’t just vaguely not feel like it. You can always actively decide not to use it, but it doesn’t get you off the hook for another week, so you can’t give in to temporary weakness.

I’ve recently figured out a usage pattern for this that is working out quite well for me that comes as a direct application of this idea: By roping them to Beeminder, you can give other services the same property.

Let’s be concrete:

I love Todoist. I think it’s genuinely great software. It’s well designed, has a great Android app (it has an iOS app, I assume it’s also great), has a great API, and just generally seems like they’ve put a lot of effort into it. It’s in the category of software for which I don’t really need the premium features but I pay for them anyway to support the free version (Beeminder is also somewhat in this category, though I do make use of precisely one premium feature).

But… there’s a problem. I find TODO lists mildly aversive. I don’t have crippling TODO list dread or anything (I’m not being facetious here. That’s a genuine thing people experience), but they make me a bit stressed out so I will tend to default to ignoring them if I can. It’s totally fine when I actually get around to doing them, but I will put it off nearly indefinitely if I can.

Which is where Beeminder comes in! Through a careful application of If This Then That (the only service mentioned in this post that I don’t pay for, and that’s only because they won’t let me pay them. IFTTT, if you’re reading this, please give me a premium option?), you can rope Todoist to Beeminder’s refusal to be ignored.

So this is what I’m currently doing:

  1. I have a Beeminder goal called todone. It is a do more goal which tracks the number of TODO items I complete. I am required to complete 8 per week. The goal is set to trim the safety buffer so I can’t build up more than 8 days (the fact that these numbers are both 8 is a coincidence. They’re “slightly more than one a day” and “slightly more than a week”) of backlog, although I started this at a short safety buffer so I haven’t quite reached that point yet.
  2. I have an IFTTT rule using the Todoist and Beeminder channels that enters an item into that goal every time I complete any task.

It’s important to note what this is not: This is not a goal about being a productivity machine. 8 TODO items a week is not a large amount, particularly because many of the TODO items are recurring tasks that I would do regularly anyway. I have recurring scheduled tasks for things like “change my pillow cases” (which I always put off for a couple days more than I should) or “shave my head” (which I always intend to do more often than I do. Though given how cold it is right now maybe not). This blog post alone is netting me two TODO items because I have a recurring blogging task (every 6 days) and am now entering draft blog ideas into their own Todoist project (I heartily recommend doing that by the way, in the general interests of writing more). It does also contain more significant tasks – contact a particular client, submit a talk to a particular conference, etc. But I get to choose the mix of task difficulty, so 8 tasks a week is not hard.

The purpose of this goal is twofold:

  1. Keep me using Todoist
  2. Do not increase the stress level of using Todoist. I have previously had a more elaborate Todoist system that was more “productivity machine” focused and that was stressful as all get out and made me hate both Todoist and Beeminder. Do not recommend.

And it seems to be working rather well for this. It turns Todoist into a regular feature of my life, and it makes an excellent piece to add to my Exobrain.

This was originally designed to help me get out of various aversive behaviours, and I think the jury is still out on whether it’s succeeded at this, but it seems to be helping a bit. Even if all it does is keep me using Todoist though I think it’s an unambiguous win and I heartily recommend the combination.

This entry was posted in Better living through subservience to the machine on by .

Free work is hard work

As has previously been established, I’m a big fan of paying people for labour and think you deserve to get what you pay for when it comes to software (yes, I do run a free operating system. And yes, I get more or less precisely what I pay for).

But… honesty compels me to admit that paying for work is no panacea. If you compare open source software to similar commercial products, the commercial ones are usually really bad too.

They’re often bad in different ways, but they’re still bad. Clearly being paid for work is not sufficient to make good software.

I think I’ve figured out a piece of this puzzle recently: When working for free, the decision making process looks very different from paid work. The result is that you can be good at things that people aren’t good at when writing commercial software, but it’s also that you just do a lot more work that you would otherwise never have bothered with.

Consider the question: “Should I do this piece of work?”

Free answer: “Well… people seem to think it’s important, and it looks a bit shit that I haven’t done it. I guess I can find some free time to work on it.”

Paid answer: “Will I earn more money from the results of doing this work than the work costs me?”

The result is that in free work, a lot of things happen that, if you put your hard-nosed businessperson hat on and looked at objectively, make absolutely no sense.

A concrete example: Hypothesis supports Windows. This isn’t very hard, but it does create an ongoing maintenance burden.

Personally, I care not one whit for Python on Windows. I am certain that near 100% of the usage of Hypothesis on Windows is by people who also don’t care about Windows but feel bad about not supporting it. I only implemented it because “real” libraries are cross platform and I felt bad about not running on Windows.

In a business setting, the sensible solution would be to not do the work until someone asked for it, and then either quote them for the work or charge a high enough license fee that you can soak the cost of platform support.

So what to do about this?

Well, on the open source side I think it makes sense to start being a bit more “on demand” with this stuff. I probably shouldn’t have supported Windows until I knew someone cared, and then I should have invited the person who cared to contribute time or money to it. Note: I am probably not going to follow my own advice here because I am an annoying perfectionist who is hurting the grade curve.

On the business side, I would suggest that you get better software if you do sometimes do a bunch of stuff that doesn’t make immediate business sense, as it rounds everything off and provides an overall better user experience even if any individual decision doesn’t make much sense. But honestly almost everyone writing software who is likely to read this is off in startup and adtech la la land where none of your decisions make any financial sense anyway, so that’s not very useful advice either.

So perhaps this isn’t the most actionable observation in the world, but now that I’ve noticed it’s going on I’ll be keeping a watchful eye out to observe its mechanisms and consequences.

This entry was posted in life, programming, Python on by .

Notes on headache self-care

I have a history of headaches. Not full blown migraines (usually), but long nagging low-grade ones. It’s almost guaranteed that some of this is caused by spending too much time in front of the computer, but what am I going to do? Spend less time in front of the computer? Pfft.

I have a variety of self-care strategies and was talking to a friend about them earlier. It occurred to me that they might be useful to share more broadly.

Note: I’m not a doctor, and what professional experience I do have on this subject is out of date and was never very good in the first place. These are strategies that have very much been derived from my personal experience, and I can’t promise that all of them will help you or will be good ideas for your case. Some of them may be actively bad ideas for your case.


My primary drugs of choice are ibuprofen and caffeine (usually as coffee). Caffeine is helpful even if your headache isn’t caused by caffeine withdrawal (Exceptions: Excess of caffeine can cause headaches. Also some people get caffeine induced migraines). Something something blood flow.

Ibuprofen, well, it’s a painkiller. It seems to work really well for me. My experience is that NSAIDs are very effective on me and paracetamol does nothing except make me worry about my liver (it’s perfectly safe at sensible dosages, but I don’t like things where taking too much of them can kill you).

That being said, I take probably more ibuprofen than I should, and probably with less food in my stomach than I should. I don’t particularly recommend following my lead. Note also that if you’re doing this on a daily basis then this can itself cause headaches [Edit to add: This is true of all over the counter painkillers, not just NSAIDs].

Most of the time some combination of the above is enough to shake most headaches for me. When it doesn’t and I’m verging into migraine territory I like some combination of paracetamol and codeine (I know I said paracetamol doesn’t work on me, but it apparently has a synergistic reaction with codeine. Also you can’t get neat codeine, and the amount of codeine in an ibuprofen + codeine mix without that reaction is apparently basically a placebo). I like migraleve, but anything with paracetamol and codeine in it is probably a good idea. I usually deploy this on top of ibuprofen, which I am given to understand is perfectly safe.

If you are regularly getting headaches that you cannot self-medicate with over the counter drugs then you should see a doctor, not consult some guy on the internet’s self-care advice.

Tension headaches

Fun fact: I trained as a massage therapist.

Supplementary information for fun fact: I didn’t qualify (scheduling conflicts between exams for my massage course and my mathematics degree). Also, as a massage therapist I make an excellent software developer.

However, this does get me a bit ahead of the game in terms of self-care for tension headaches.

In theory if you’re being sensible you should try these techniques first. In practice I usually reach for the ibuprofen and either the problem goes away or is sufficiently severe that I’m in no mood to be sensible. However, these are particularly useful if you have persistent ongoing headaches that you think might be tension related. The NHS guidelines on identifying a tension headache are useful, but to be honest if you’re feeling any sort of muscular tension and experiencing a headache you might as well try these. It may not help the headache, but it will distract you for a bit, won’t hurt and will probably help the muscle tension.

Upper back and shoulders

These are really hard to self massage, because most of the things you can do to self massage will make the other side worse. I use stretches to deal with these. I used to have a “The original backnobber 2” (yes that’s its actual name), which looks like an elaborate sex toy and I found almost entirely useless. Some actual elaborate sex toys may be more useful here – e.g. the Hitachi magic wand is usually sold as a self-massage device – but I can’t say that my experience with vibration systems (from a massage chair. Stop snickering) has actually been very useful for reducing upper back and shoulder tension.

I do have some stretches that I do to help if I have tense shoulders:

  1. Lace your fingers together with your hands in front of you, palms facing outwards and push them forward as far as you can.
  2. Do the same thing but with your hands behind you and your palms facing towards your body, then bend at the waist 90 degrees so that your hands are straight up into the air and stretch backwards.
  3. Do ridiculous exaggerated shoulder rolls
  4. Twist your upper body from side to side around a vertical axis

If these don’t help, I recommend an actual back massage. Get a friend or partner to give you a back rub (I don’t currently have any good recommended tutorials for that, sorry) or go to a professional (which is advice I don’t follow enough myself).

One technique that does work for me reasonably well is as follows:

  1. With both hands at the same time, dig your fingers in between your shoulder blades and your spine (never put pressure on the spine! It’s the muscles next to it). Your elbows should be as high up as they will go.
  2. Gradually lower your elbows, pulling your fingers along with them, until they come off the top of your shoulders.

Can’t promise that one helps, but it feels nice at least.

Neck and scalp

These are fortunately much easier to self-massage.

I have three main techniques here:

  1. Temple rub: You’re probably already doing this one. Take your three main fingers and place them on your temples. Press reasonably firmly and move the skin in circles, from front to back.
  2. Scalp massage: This is much easier if you’re not me and still have hair. Grab thick bunches of it near the roots and use it to move your scalp about. Similar to the temple rub, do it with both hands at once, rotating from front to back. I usually do this once on top and once at the back of my skull near the top. If, like me, you lack hair, you can do something similar by using the base of your hands, pressing firmly into the skin and using friction to move the scalp.
  3. Massage the sides of the spine from the base of the neck to where it joins your skull. Do this by placing the tips of your fingers there, digging in and pulling outwards. Then move up slightly and repeat the process until you run out of spine.

You can also do neck stretches by slowly rotating your head through a head shake as far as it can go in each direction, and also by bending your neck from side to side.

These don’t always help, and they don’t usually completely get rid of a tension headache for me, but they often reduce it to a level where a previously intolerable headache becomes tolerable and much more manageable with painkillers.


Lack of sleep is also the cause of a lot of my headaches, but the worst ones usually go away after I’ve slept. If a headache is rendering me useless, sometimes the best solution is to deploy the above solutions (sans caffeine) and go to bed early and hope it will be better in the morning. It usually is.

This entry was posted in Uncategorized on by .

My favourite language feature

I’ve been writing mostly Python and C for a while. This means that the features I’m used to are mostly Python ones, because C doesn’t really have any features so it’s hard to get used to.

And this means that when I write other languages I’m surprised, because there are things that are ubiquitous in Python that I’ve forgotten aren’t ubiquitous everywhere. Most of them aren’t that important, but there was one that popped out at me which, on thinking about it, I decided I really liked, and it’s a shame that not every language implements it.

That feature is named arguments with defaults.

It’s definitely not a Python specific feature – there are a bunch of other languages that have it – but Python is the language where I’ve really gotten used to how useful they are, and it’s far from ubiquitous.

I don’t really like Python’s implementation of it that much to be honest, but most of the flaws in it are easy enough to work around (and PEP 3102 makes life much better, or would if I didn’t have to write Python 2 compatible code), and when you do it makes it much easier to create extremely useful and flexible APIs. See my usage in hypothesis.strategies for example: the lists strategy has 6 (OK, 5 really. The first one is only defaulted for legacy reasons) default options, all of which are useful and all of which play well together. Trying to do something similar with overloading or with different function names would be a complete horror, and really the best way to do it without default arguments is some sort of builder object which emulates them.

In my ideal world, I think this is how named arguments with defaults would work:

  1. There is a hard separation between named and positional arguments. The names of your arguments are not significant unless you declare them as named arguments, and named arguments cannot be passed positionally. A function may take both positional and named arguments, there’s just a clear delineation between the two. This is basically essential if you want it to be possible to make forwards compatible APIs.
  2. Named arguments are not required to have defaults.
  3. Positional arguments cannot have defaults (I’m not heart set on this, but it seems like a feature that is very much of limited utility and it’s cleaner to not have it)
  4. Default arguments are evaluated as expressions in the defining scope (not the calling scope) each time they are used. None of this ‘You can’t use [] as a default argument because it’s only evaluated once and then you’re sharing a mutable object’ nonsense from Python.
  5. Default arguments may not depend on other named argument values. Sorry. I know this is useful, but it messes with evaluation order in the calling scope really hard and it just doesn’t make sense as a feature.
  6. Optional: Default arguments may depend on positional argument values. This seems like an acceptable compromise for the preceding.
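Python gets part of the way there. PEP 3102’s bare * gives the point-1 separation (named arguments that cannot be passed positionally), and the usual sentinel idiom fakes point 4’s per-call default evaluation (append_to and its arguments are made up for illustration):

```python
_MISSING = object()  # sentinel standing in for "no argument passed"

def append_to(value, *, target=_MISSING):
    # Everything after the bare * is keyword-only (PEP 3102).
    if target is _MISSING:
        target = []  # evaluated per call: no shared mutable default
    target.append(value)
    return target
```

Here append_to(1, [2]) raises TypeError, while append_to(2, target=[1]) works – exactly the positional/named separation described above.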

That’s pretty much all I have to say on the subject of named arguments with defaults: They’re great, more APIs should make extensive use of them where possible, and more languages should implement them.

Runners up

There are a bunch of other features that I think are great. Some of them made it on to my wish list but a lot didn’t for the same reason they didn’t claim top place for this post: They’ve already won. Support isn’t always ubiquitous, but it’s close enough that languages that don’t support them are weird and backwards outliers. Examples include:

  1. Namespaces with renaming on import available (If you don’t have renaming on import then you don’t have namespaces, you have implementation hiding and are still shoving APIs into the global scope).
  2. Local named and anonymous function definitions (“lambdas” or “closures”, but also nested functions).
  3. Parametrized types and functions over them (for statically typed languages).

I’d probably take any of those over named arguments with defaults, but fortunately I mostly don’t have to.

This entry was posted in programming, Python on by .