I’ve been meaning to check out Z3 for a while – it’s an open source theorem prover from Microsoft with great looking Python bindings. I’m partly interested just out of pure curiousity, partly because there are some interesting potential applications for Hypothesis (it’s currently unclear to me how possible those are – the architecture of Hypothesis is not well set up for concolic testing, nor are the semantics of Python, but still).

Anyway, I thought I’d turn it loose on my favourite evil problem, feedback arc set for tournaments. It’s at that weird boundary where what you want is sortof a SAT solver and sortof an integer linear programming solver. There’s some suggestion that using hybrid approaches would be better than either, but I haven’t really found any easily usable hybrid solvers (there’s a fork of minisat that might work. It’s on my TODO list)

Anyway, basically what we do with z3 is that we create the initial problem, ask it for a feasible solution, then successively ask it to improve on that feasible solution until it no longer can. Here’s the code:

The good: z3 was extremely easy to use. The API was pleasant, reasonably intuitive, and basically did what I expected.

The bad: This solution is strictly worse than a solution which just generates a massive LP file and hands it off to lp_solve. So although it was very easy to solve this problem with z3, z3 appears not to be well suited to it. Z3 struggled at n = 15 and seemed to get stuck at n = 20.

This shouldn’t necessarily be considered indicative that Z3 is “bad”. This is a problem with a *lot* of constraints (O(n^3)), so it’s probably not representative of the sort of things Z3 is good at, and it *is* representative of the sort of things that lp_solve is good at. So I’m not really deriving any lessons from the fact that Z3 wasn’t very good at this problem . I’m more interested in the fact that it was super easy to use, so I really *should* look at using it more in future.