Category Archives: Python

Delta debugging with inferred separators

Delta debugging, or something akin to it, tends to be phase one of most file-based shrinkers: shrinking is much easier the faster your test function is, test functions tend to be faster on smaller data, and shrinking is also easier on smaller data because you’ve got fewer things to try. So anything you can do to prune down the file before you get on to the hard work is good, and delta-debugging-like algorithms do a pretty good job of that.

But delta debugging in its original form is very line oriented: You’re trying to make the file one-minimal in the sense that you can’t delete any single line. What do you do if you want to delete within lines? Or if your file is binary and not line oriented at all?

Delta debugging really works on arbitrary sequences, so you can always just do bytewise delta debugging and it will work perfectly well, but there are usually a lot more bytes than there are lines, and deleting an arbitrary block of bytes is much less likely to work than deleting a sequence of lines is. So it works, but it takes what is often a rather slow process and makes it even slower.

I spotted a trick this morning that works rather well. Define the set of separators for a string as the set of all of its one- and two-byte substrings.

Now for each separator, we perform delta debugging on the file split by that separator (i.e. we treat it as if it were the newline in classic delta debugging: we split by the separator and try to minimize the resulting sequence, subject to the constraint that the result of joining it back with that separator is still interesting).

On its own this works, but not all that well, because most separators don’t work or work no better than shrinking bytewise. There’s a refinement we can perform, though, which is to give each separator a score: the score of a separator is the smallest length in bytes of any component in the split. We then process separators in decreasing order of score.

The reason this works is that it guarantees that a successful shrink always deletes at least as many bytes as the separator’s score (because we always delete at least one component). This lets us focus on the separators that produce large shrinks first, so by the time we get to the less useful separators we should have much less left to do and already be running pretty fast.
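Here’s a rough sketch of the idea in Python, assuming the data is a byte string and that we already have a classic sequence shrinker available as delta_debug(items, test) along with an interesting(data) predicate – both names are placeholders rather than anything from a real library:

    def separators(data, max_len=2):
        """All one- and two-byte substrings of the data."""
        return {
            data[i:i + n]
            for n in range(1, max_len + 1)
            for i in range(len(data) - n + 1)
        }

    def score(data, sep):
        """Smallest component length when splitting by sep: a lower bound on
        how many bytes any successful shrink with this separator deletes."""
        return min(len(c) for c in data.split(sep))

    def shrink_with_separators(data, interesting, delta_debug):
        # Process separators in decreasing order of score, so the passes that
        # are guaranteed to delete the most bytes per shrink run first.
        for sep in sorted(separators(data),
                          key=lambda s: score(data, s), reverse=True):
            if sep in data:  # earlier passes may have deleted it entirely
                components = delta_debug(
                    data.split(sep), lambda cs: interesting(sep.join(cs))
                )
                data = sep.join(components)
        return data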

So far I’ve only run this on one example (John Regehr’s example from Reducers are Fuzzers), but it works pretty well there – much better than straightforward bytewise delta debugging, certainly, and the file has a lot of reduction potential beyond just newlines.

In this example at least it’s particularly interesting to notice which separators seem to work well – the obvious ones like ' ' and '\n' do, of course, but '{' worked particularly well for this example, as did '.c' and 'e;'. I think this is because they are, whether by luck or not, capturing some nicely exploitable facet of the structure of the example, which suggests this idea has broader application.

I’m not sure these techniques will make it into Hypothesis, as Hypothesis has some special hinting to figure out what’s best to delete that is probably already good enough, and this seems like a technique that works better for the problem of reducing slow operations on large files, but I’m still pretty pleased with it and will probably find an opportunity to use it somewhere.


Gerrymandering take two

So yesterday I wrote about how you can use Z3 to solve gerrymandering. I was unimpressed by Z3’s performance, so I decided to rewrite it with a MILP solver instead. It took me a while to figure out a good encoding (which is why I didn’t do this in the first place), but eventually I did, and unsurprisingly the results were just infinitely better. Instead of struggling with 20 districts, the LP-solver-based approach only starts to take a couple of seconds when you ask it to solve 1000 districts. Based on past experience, I’d place bets that more than 50% of that time was PuLP having a hard time rather than the underlying solver.
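For the curious, here’s a sketch of one possible encoding in PuLP (not necessarily exactly what I used): an integer variable for how many supporters are placed in each district, a binary variable for whether that district is won, and a strict-majority constraint tying the two together:

    import pulp

    def max_seats_milp(district_sizes, supporters):
        prob = pulp.LpProblem("gerrymander", pulp.LpMaximize)
        # x[i]: how many of our supporters we place in district i.
        x = [pulp.LpVariable("x_%d" % i, lowBound=0, upBound=size, cat="Integer")
             for i, size in enumerate(district_sizes)]
        # w[i]: 1 if we win district i, 0 otherwise.
        w = [pulp.LpVariable("w_%d" % i, cat="Binary")
             for i in range(len(district_sizes))]
        prob += pulp.lpSum(w)  # objective: number of districts won
        prob += pulp.lpSum(x) <= supporters  # can't place more people than we have
        for xi, wi, size in zip(x, w, district_sizes):
            # w[i] may only be 1 if we have a strict majority in district i.
            prob += xi >= (size // 2 + 1) * wi
        prob.solve()
        return int(sum(pulp.value(wi) for wi in w))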

This very much reaffirms my prejudice that, despite most open source MILP solvers being abandonware with flaky tooling and despite SMT solvers being the new hotness which solve vastly more problems than MILP solvers do, SMT solvers are bullshit and should be avoided at all costs, whereas MILP solvers are amazing and should be used at every opportunity.

Of course, then I was staring at the shape of the solutions and I realised that I was being ridiculous and that there is an obvious greedy algorithm that gets the correct answer 100% of the time: Starting from the smallest district and working upwards, put just enough people in that district to win it. If you don’t have enough people to win it, put the rest of your people wherever because there are no more districts you can win.
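In code the greedy algorithm is only a few lines (a sketch, assuming that winning a district means a strict majority):

    def max_seats_greedy(district_sizes, supporters):
        seats = 0
        # Win the cheapest districts first, spending as few supporters
        # as possible on each.
        for size in sorted(district_sizes):
            needed = size // 2 + 1  # smallest strict majority
            if supporters < needed:
                break  # can't win this district, so can't win any larger one
            supporters -= needed
            seats += 1
        return seats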

I’ve posted some updated code with all three implementations, along with some tests that demonstrate they always produce the same answer (using Hypothesis, naturally).


Optimal gerrymandering with Z3

Edit to add: See next post for why this was a silly thing to do.

As some of you might recall from before I spent a year writing about Hypothesis, one of my other interests is voting theory. I may be doing a talk about it at some point soon, and have been reading up on it.

I was thinking about Gerrymandering, and I was wondering what the best (worst) possible degrees of gerrymandering were.

For example, suppose you’ve got a population of 9 split up into 3 districts of 3. If you have 6 people out of the 9 who want to vote blue, it’s easy to get all three seats as blue (you put 2 in each district), and it’s easy to get only two (put all 6 of your supporters into two districts), but can you be gerrymandered down to only one? It seems implausible, but how would we prove that?

Well, as you probably guessed from the title, it turns out this is one of those things where rather than proving it by hand we can just ask a computer to do it for us! (It’s actually quite easy to prove this specific case, but I was interested in more general ones).

Here’s a Z3 program to work out the answer:
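Roughly, the program looks something like the following sketch rather than the exact code I used – it takes the district populations and a supporter count, and binary searches on the number of winnable seats:

    from z3 import If, Int, Solver, Sum, sat

    def max_seats_z3(district_sizes, supporters):
        def achievable(target):
            solver = Solver()
            # xs[i]: how many supporters are placed in district i.
            xs = [Int("x_%d" % i) for i in range(len(district_sizes))]
            for x, size in zip(xs, district_sizes):
                solver.add(x >= 0, x <= size)
            solver.add(Sum(xs) <= supporters)
            # A district is won when the supporters are a strict majority.
            wins = Sum([If(2 * x > size, 1, 0)
                        for x, size in zip(xs, district_sizes)])
            solver.add(wins >= target)
            return solver.check() == sat

        # Binary search for the largest achievable number of seats.
        lo, hi = 0, len(district_sizes)
        while lo < hi:
            mid = (lo + hi + 1) // 2
            if achievable(mid):
                lo = mid
            else:
                hi = mid - 1
        return lo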

Running this, the answer is that if you have 6 people then you’re guaranteed at least two seats, but if you have 5 then you can be gerrymandered down to only one seat, with the opposition claiming two (your opponents split their four supporters across two districts, so your five end up split 3, 1 and 1 across the three districts).

It takes a list of districts with their populations and a total number of people who support a party, and tries to arrange those people in a way that maximizes the number of seats that party gets.

It only supports two-party systems. It wouldn’t be too difficult to expand it to a larger number of parties though. It also doesn’t work very well for large (or even medium really) numbers of districts. It handles 20 districts in about 10 seconds, but the time seems to go up extremely nonlinearly (it handles 10 in about 20 milliseconds) and it doesn’t seem inclined to ever complete with 30 districts. This might be fixable if I understood more about performance tuning Z3.

It may be possible to improve this using the Z3 Optimize stuff, but when I tried to use Optimize I segfaulted my Python interpreter, so I decided to steer clear (stuff like this is why I’ve yet to seriously consider Z3 as a backend for anything interesting, unfortunately). Instead it just binary searches over a set of constraints to find the best value.


Buy me books!

A thing people sometimes ask me is how they can donate to Hypothesis. Historically the answer has been that they can’t – I’d rather people pay for services, support contracts, etc. If companies want to donate I can definitely figure something out for you, but individual donations aren’t enough to make the difference between sustainability and not, and honestly I feel a little weird about them anyway.

I probably will arrange some better funding options like this, but in general I’d much rather companies rather than individuals pay me.

But it occurred to me yesterday that a) There are a lot of books that it would be useful for me to have and b) For no obviously sensible reason, I do not feel at all weird about people buying me books. People buying me books is great.

So, to the point! I now have a public Amazon wish list for people who want to say thanks for Hypothesis. You can buy me things off it if you’d like to do so. You shouldn’t feel obligated to – it’s totally fine if you can’t afford to or don’t want to – but obviously it would be super appreciated if you do.

Things to note:

  • For now at least it will be entirely books. I may add non-book things later but I’ve no real plans to.
  • I’m always up for more book recommendations to add to the list.
  • It’ll give me your name and a note you can add if you want to, but it won’t share either of our addresses with the other.
  • The general theme of the list is mostly going to be my research interests and other things I can reasonably claim are work related. These are quite Hypothesis centric, so buying me books from it will tend to have useful consequences for Hypothesis development, but no promises.
  • Everyone who buys me something will of course get a public thanks (unless you ask me not to).
  • Naturally, you’re welcome to buy me things off this list as thanks for things other than Hypothesis.

Somewhat to my surprise, so far it’s gone really well! Rob Smallshire has bought me Debug It. I’ve also been bought Combinatorial Optimization and Growing Object-Oriented Software, but I won’t know by whom until they arrive (I only know this by process of elimination. Amazon doesn’t tell me anything until I get the book).


Augmenting shrinkers by inducing regular languages

This is an idea I’m playing with at the moment. I haven’t yet determined if it’s a good idea. My current belief is that it’s one of those ideas in the category of “Clever idea, could work, but too slow to be usable in practice”.

The problem I am trying to solve: I have some predicate \(f\) that takes a binary string and returns true or false. I am trying to find a string \(x\) such that \(f(x)\) is true and \(x\) is minimal length, and minimal lexicographically amongst the strings of minimal length (i.e. is minimal when comparing (len(x), x)). Although really we’re not interested in the true minimum, we only care about getting something smallish.
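In Python terms, the order being minimized is just the following tuple key (sometimes called shortlex order):

    def shortlex_key(x):
        # Shorter strings come first; ties are broken lexicographically.
        return (len(x), x)

    assert min([b"ba", b"c", b"ab"], key=shortlex_key) == b"c"
    assert min([b"ba", b"ab"], key=shortlex_key) == b"ab"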

The traditional approach is to use a greedy algorithm. For example, one of the classic approaches in this field is delta debugging, which deletes sub-intervals to attempt to find a minimal length example (specifically it guarantees it finds an example such that you can’t delete any single character from it, though it usually does better in practice). This works and works well, but has the typical problem of greedy algorithms that it gets stuck in local minima. For example, a problem Hypothesis has in its shrinking right now is that there are often cases where a successful shrink would require both deleting one part of the string and changing the value of another part at the same time.

So there’s a question of how to get out of these local minima.

I’ve spotted one approach recently which I think might be interesting.

It starts from two key observations:

  1. Regardless of the structure of the predicate, the set of strings that satisfy the predicate and are dominated by some fixed string (e.g. our current best example) forms a regular language (because it is a finite language). This regular language may in practice be prohibitively complicated, but we can always take a guess that it’s simple and bail out early if it turns out not to be.
  2. Given a regular language represented as a deterministic finite automaton, it is extremely fast to get its minimal element – annotate each node with its distance to an accepting node (e.g. using a Floyd-Warshall variant), then walk the graph, at each point picking the next node as the one with the smallest value, breaking ties by picking lower bytes (there’s a sketch of this below).
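Here’s a sketch of observation 2, assuming a DFA represented as a transitions[state][byte_value] -> state mapping plus a set of accepting states (the representation and names are mine). I’ve used a reverse breadth-first search for the distances rather than a Floyd-Warshall variant, since unweighted distances to acceptance are all we need:

    from collections import deque

    def shortlex_minimal(start, transitions, accepting):
        # distance[s]: length of the shortest string taking s to an accepting state.
        reverse = {}
        for s, edges in transitions.items():
            for byte, t in edges.items():
                reverse.setdefault(t, set()).add(s)
        distance = {s: 0 for s in accepting}
        queue = deque(accepting)
        while queue:
            t = queue.popleft()
            for s in reverse.get(t, ()):
                if s not in distance:
                    distance[s] = distance[t] + 1
                    queue.append(s)
        if start not in distance:
            return None  # the DFA accepts nothing at all
        # Walk greedily: always step to a state one closer to acceptance,
        # breaking ties by taking the lowest byte.
        result = bytearray()
        state = start
        while distance[state] > 0:
            byte, state = min(
                (b, t) for b, t in transitions[state].items()
                if t in distance and distance[t] == distance[state] - 1
            )
            result.append(byte)
        return bytes(result)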

So the idea is this: We use our existing shrinker to give us a corpus of positive and negative examples, then we attempt to find a regular language that corresponds to those examples. We do this using a light variant on L* search where we only offer it examples from the corpus we’ve already calculated. As an optimization, we can also use our greedy shrinker to produce a minimized counter-example here (L* search works better the smaller your counter-examples are, and doing a lexicographic minimization keeps the number of prefixes under control).

Once we’ve done that, we use the DFA we’ve built to get its minimal element. There are now three possibilities:

  1. The minimal element is also our current best example. Stop.
  2. The minimal element does not in fact satisfy the predicate. Update our L* search with a new counterexample.
  3. The minimal element does satisfy the predicate and is a strict improvement on our current best example. Start again from here.
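Put together, the loop looks roughly like this, with learner standing in for a hypothetical incremental L*-style learner exposing dfa() and add_counterexample() methods, and shortlex_minimal as sketched above (this is an outline of the idea rather than my prototype’s actual code):

    def shrink_via_dfa(best, predicate, learner):
        # Assumes the learner's current DFA always accepts `best` itself.
        while True:
            start, transitions, accepting = learner.dfa()
            candidate = shortlex_minimal(start, transitions, accepting)
            if candidate == best:
                return best                            # 1. no improvement predicted
            if not predicate(candidate):
                learner.add_counterexample(candidate)  # 2. the DFA was wrong; refine it
            else:
                best = candidate                       # 3. strict improvement; go again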

You can also randomly walk the DFA and check if the result is interesting as another useful source of counter-examples, or indeed rerun the generator that got you the interesting example in the first place. Any way of randomly generating examples can be helpful here, although in the typical case of interest the set of true values for \(f\) is very sparse, so I expect the first form to be more useful because it has a high probability of generating something that is in our regular language but does not satisfy our predicate. I haven’t really experimented with this part yet.

At some point the number of states in the DFA will probably get prohibitively large. This can be used to terminate the shrink too. If we’ve made progress since we started the L* search we can restart the search from a new shrink, otherwise we stop there.

Does this work?

Empirically, based on my prototyping, the answer seems to be “Kinda”. My L* implementation appears to be quite slow in practice, but I’ve gotten a lot wrong in the course of writing it, so I’m not sure that this isn’t just a bug. Additionally, there are definitely algorithmic improvements you can make to L* search. I’ve seen one that apparently improves the number of tests you have to perform from \(n^2\) to \(n \log(n)\) (where \(n\) is the size of the final state machine), but I don’t understand it yet (mostly because I don’t quite get what a homing sequence is). Additionally there’s a paper I’ve yet to read on how to infer a non-deterministic finite automaton instead, which I think may be more effective in practice (because converting NFAs to DFAs can cause an exponential blow-up in the number of states, which I think may be happening here).

So at the moment the answer is “Ask again later”. But it’s an interesting idea that I think has some promise, so I thought I’d share my preliminary thoughts.
