Automated patch minimization for bug origin finding

Ok, so you know git bisect? You take a commit history and try to find the commit that introduced a problem.

Imagine you had the problem that git bisect solves but uh you hadn’t been very good about commit discipline and you had a whole bunch of large messy commits, several consecutive ones not in a working state, and it wasn’t really clear what happened where.

Not that I know anyone in that situation of course. This is a purely hypothetical scenario.

Well it turns out there’s a solution to this too, and it’s possibly better than git bisect.

First you write a script to detect the case you want. You can do this with git bisect too, but people rarely bother because you only need O(log(n)) tests. In this case you’ll need a lot more than that so you should do this. Running a test and grepping its output is usually sufficient here. This script exits 0 if the code exhibits the problem or non-zero otherwise.

These usually end up terrible. It’s OK – it’s genuinely a one off and you’re not going to have to maintain it. Here’s the relevant part of mine:

export PYTHONPATH=src 
if python -u -m pytest tests/cover/ -kmixed_list 2&>1 > test.log  ; then 
  echo "Test passed"
  exit 1
grep -q "Extra items in the left set" test.log
grep -q "'0'" test.log

i.e. you run the test, it should fail. Further, it’s output should contain some particular lines. Sometimes you’ll find that your script wasn’t specific enough and you’ll need to add more conditions.

Now you do what you do to start git bisect off: You find a pair of commits, one after the other, where the latter one exhibits the property you want to isolate (e.g. “This particular test fails in this particular way”)

Now, you take the patch between these two commits:

 git diff --no-prefix master andsoitbegins > working.patch

(yes, my branch is called andsoitbegins. I’m not good at branch names. Or possibly I’m excellent at branch names, you decide)

This produces a patch file. We’re now going to perform file based minimization on that patch to try to find the smallest patch that causes the problem.

I still don’t have a good go to minimizer – I tried using delta for this and continued my streak of never getting a good result from doing so, so once again I wrote my own terribly crappy file based minimizer (Update: Now I do. It’s still a bit rough around the edges but works great for me). Here you go:

One of these days I’m going to bite the bullet and just package this all up into a good piece of standalone software. Today might be that day, but it probably isn’t.

What this script does is it repeatedly replaces the contents of a file and reruns a test script to see if that matters. It does this by deleting large chunks of lines then small chunks of lines. If those produced an invalid patch that we can’t apply, it just exits early and doesn’t bother running the test.

Here’s the full test script:

This cleans up the repo, applies the current patch in working.patch and then runs the tests as described above.

In the end our starting 6kloc patch file is turned into the following:

--- src/hypothesis/searchstrategy/
@@ -204,10 +204,10 @@ class StringStrategy(MappedSearchStrategy):
     def __repr__(self):
-        return u'StringStrategy()'
+        return 'StringStrategy()'
     def pack(self, ls):
-        return u''.join(ls)
+        return ''.join(ls)

I had accidentally made the text() strategy return raw strings instead of unicode. This caused surprisingly few problems, but did break at least one unrelated test.

The first part of the patch is actually not required to reproduce the problem. What it was needed for is to locate the relevant lines – there are multiple pack() methods in this file, so apparently patch needed help to disambiguate.

In conclusion: This worked really well. It took what would have been quite a complicated piece of detective work and automated it. A++ experience, and I’ll definitely add this to my general box of tools.

This entry was posted in programming on by .