Soliciting advice: Bindings, Conjecture, and error handling

Edit: I think I have been talked into a significantly simpler system than the one described here that simply uses error codes plus some custom hooks to make this work. I’m leaving this up for posterity and am still interested in advice, but don’t worry I’ve already been talked out of using setjmp or implementing my own exception handling system.

I’m working on some rudimentary Python bindings to Conjecture and running into a bit of a problem: I’d like it to be possible to run Conjecture without forking, but I’m really struggling to come up with an error handling interface that works for this.

In Conjecture’s current design, any of the data generation functions can abort the process, and are run in a subprocess to guard against that. For testing C this makes absolute sense: It’s clean, easy to use, and there are so many things that can go wrong in a C program that will crash the process that your C testing really has to be resilient against the process crashing anyway so you might as well take advantage of that.

For Python, this is a bit sub-optimal. It would be really nice to be able to run Conjecture tests purely in process just looking for exceptions. os.fork() has to do a bunch of things which makes it much slower than just using C forking straight off (and the program behaves really weirdly when you send it a signal if you try to use the native fork function), and it’s also just a bit unneccessary for 90% of what you do with Python testing.

It would also be good to support a fork free mode so that Conjecture can eventually work on Windows (right now it’s very unixy).

Note: I don’t need forkless mode to handle crashes that are not caused by an explicit call into the conjecture API. conjecture_reject and conjecture_fail (which doesn’t exist right now but could) will explicitly abort the test, but other things that cause a crash are allowed to just crash the process in forkless mode.

So the problem is basically how to combine these interfaces, and every thing I come up with seems to be “Now we design an exception system…”

Here is the least objectionable plan I have so far. It requires a lot of drudge work on my part, but this should mostly be invisible to the end user (“Doing the drudge work so you don’t have to” is practically my motto of good library design)

Step 1: For each draw_* function in the API, add a second draw_*_checked function which has exactly the same signature. This does a setjmp, followed by a call to the underlying draw_* function. If that function aborts, it does a longjmp back to the setjmp and sets a is_aborted flag and returns some default value. Bindings must always call the _checked version of the function, then check conjecture_is_aborted() and convert it into a language appropriate error condition.

Note: It is a usage error to call one checked function from another and this will result in your crashing the process. Don’t do that. These are intended to be entry points to the API, not something that you should use in defining data generators.

Step 2: Define a “test runner” interface. This takes a test, some associated data, and runs it and returns one of three states: Passing test, failing test, rejected test. The forking based interface then becomes a single test runner. Another one using techniques similar to the checked interface is possible. Bindings libraries should write their own – e.g. a Python one would catch all exceptions and convert them into an appropriate response.

Step 3: Define a cleanup API. This lets you register a void (*cleanup)(void *data) function and some data to pass to it which may get called right before aborting. In “crash the process” model it is not required to be called, and it will not get called if your process otherwise exits abnormally. Note: This changes the memory ownership model of all data generation. Data returned to you from generators is no longer owned by you and you may not free it.

I think this satisfies the requirements of being easy to use from both C and other languages, but I’m a little worried that I’m not so much reinventing the wheel as trying to get from point A to point B without even having heard of these wheel things and so I invented the pogo stick instead. Can anyone who has more familiarity with writing C libraries designed to be usable from both C and other languages offer me some advice and/or (heh) pointers?

This entry was posted in Hypothesis, programming, Python on by .