
A green build on my weighted FAS solver

A momentous occasion has just, err, occurred. I got a green build on my weighted feedback arc set solver. This has never happened before.

This might seem like a confusing concept. How can it never have passed its own test suite?

Well, the test suite has always been more aspirational than reality-based. The way it effectively works is this: it has a lot of test data and it tracks the best solution found for each file, along with its score. The quality test is that the score of the solution found is within 5% of the best known solution (side-note: part of why the tests are now green is that I did raise this threshold a bit; it was originally 1%, and most tests still hit that tighter threshold). A side-effect of this is that improvements to the solver will often make the tests harder to pass because they will find better solutions (and, surprisingly, this is still happening – I’ve spent enough CPU cycles and algorithmic tweaks that I keep thinking I’ve found all the optimal solutions, only to be surprised by an improvement. The improvements are rarely large, but they still keep coming).
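
In code, the quality check amounts to roughly the following (a minimal sketch; the names are illustrative rather than what the test suite actually calls things):

  def check_quality(name, result, best_known, threshold=1.05):
      # 'result' is the solver's score for one test file (lower is better) and
      # 'best_known' maps file names to the best score seen so far. Any
      # improvement gets recorded, which is exactly what makes the tests
      # harder to pass over time.
      if name not in best_known or result < best_known[name]:
          best_known[name] = result
      assert result <= threshold * best_known[name]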

The result was that I would often make improvements to the FAS solver and then back them out because they were too slow, leaving the improved test data behind, and the tests would then fail as the FAS solver dreamed of the long-lost days when it was cleverer.

Additionally, all the tests are capped at the fairly arbitrary maximum acceptable runtime of one minute.

At this point I should mention that finding even an approximation (I forget to within what bound – I think it’s something like half) of the optimal score for weighted FAS is an NP-hard problem, and that even calculating the score of a given solution is \(O(n^2)\). So this isn’t an easy problem to solve.
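
To be concrete about where the \(O(n^2)\) comes from: the score of a candidate ordering is the total weight of the arcs that point against the ordering, so computing it means looking at every pair. A rough sketch (the weight-matrix representation here is just for illustration, not necessarily how the solver stores things):

  def score(weights, ordering):
      # weights[i][j] is the weight of the arc i -> j; we pay for every arc
      # that points backwards relative to the ordering.
      total = 0
      for pos, a in enumerate(ordering):
          for b in ordering[pos + 1:]:
              total += weights[b][a]  # b comes after a, so any b -> a arc is violated
      return total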

I’d largely stalled on development of it, but today I decided to go back to it, and after a day of tinkering I’ve managed to make a surprising amount of progress. At the start of the day it was painfully slow on several test cases (several minutes), non-deterministic, and badly failing several of its quality goals. Now it is deterministic, runs each test case in under a minute and gets within 2.5% of the best known result on all of them. For added pleasantry, the latest version has even provided the best known result for several of them, and the code is shorter and clearer than it was at the start of the day.

How does it work?

It’s pretty simple. In the end it’s ended up looking hilariously close to the algorithm I threw together in an hour from a random paper for The Right Tool (now Hammer Principle) a few years ago.

The process is basically a mix of the following techniques:

  1. Initial scoring. This is basically a Markov chain scoring system as before. The details are slightly different, but not really in an interesting way – the idea is still to transition to better nodes and use an approximation of the stationary distribution as a score
  2. What I refer to in the code as “forcing connectivity”. This is today’s clever idea, and it works really well, but it probably shouldn’t, and the fact that it appears to might be a test data artifact. If you’re feeling generous it could be described as a heuristic approach, but a fairer description would perhaps be an unprincipled hack. It’s based on the observation that in sparse data sets you often end up with long chunks of ties, and these almost completely defeat local searches and make it hard to improve things. The forcing pass runs through the array one element at a time and, if an element is unconnected to the next one, pulls forward the next element to which it is connected (it doesn’t concern itself with ordering beyond that – the element might be strictly greater or less than it, it doesn’t care)
  3. Local sort, as before. Basically an insertion sort on the hopelessly inconsistent ordering the tournament induces. This had largely disappeared from the code – it was implicitly happening anyway, but there was no explicit local sort stage – and bringing it back in was a surprisingly big win
  4. Window optimisation. Basically, you can brute force the problem for small arrays. Window optimisation is brute forcing every subrange of some length. This is applied at various lengths in order to get mixing (this and single move optimisation are sketched in code after this list)
  5. Single move optimisation. This is a really expensive process, but wins big enough that it’s worth it. Basically we try every possible move of a single element from one point to another and apply any that improve the score
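
To make items 4 and 5 a bit more concrete, here is a rough sketch of window optimisation and single move optimisation in terms of the score function sketched above (again, the names and representation are illustrative rather than lifted from the actual solver):

  from itertools import permutations

  def optimise_windows(weights, ordering, width):
      # Brute force every contiguous window of the given width. Permuting a
      # window only changes the relative order of elements inside it, so it is
      # enough to score each window locally.
      for start in range(len(ordering) - width + 1):
          window = ordering[start:start + width]
          best = min(permutations(window), key=lambda p: score(weights, list(p)))
          ordering[start:start + width] = best
      return ordering

  def optimise_single_moves(weights, ordering):
      # Try moving each element to every other position, keeping any move that
      # improves the total score. Expensive, but it wins big enough to be worth it.
      best_score = score(weights, ordering)
      for i in range(len(ordering)):
          for j in range(len(ordering)):
              if i == j:
                  continue
              candidate = ordering[:i] + ordering[i + 1:]
              candidate.insert(j, ordering[i])
              candidate_score = score(weights, candidate)
              if candidate_score < best_score:
                  ordering, best_score = candidate, candidate_score
      return ordering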

It’s surprising how well this combination works – I’ve tried a whole bunch of cleverer things, including various random search approaches, simulated annealing, block sorting (basically: build a new tournament with chunks of the existing one and recurse) and a variety of local search techniques, but they generally seemed to be significantly slower and/or less effective.

Update: The test suite is now failing again. This is partly because I’ve tightened the requirements (down to within 3% rather than 5% of best known result), partly because after some tinkering with code I’ve managed to improve the best known quality of several test cases. This is actually quite pleasant, as it shows that there are some nicely tunable parameters in here.


An algorithm for incrementally building separation graphs

You know that annoying thing when you’re reading a paper and it’s fairly clear they’ve not actually tried that algorithm in practice? Don’t you hate that?

Anyway, here’s an algorithm I’ve not tried in practice. I haven’t even written code for it yet. This post is as much to get the ideas clear in my head as anything else. This algorithm may turn out to be worse than doing it the stupid way in practice.

This is the problem we’re trying to solve:
We have n points, \(X = \{x_1, \ldots, x_n\}\), a metric function d, and a threshold t. We are trying to build the undirected graph with edges \(E = \{ (i, j) : d(x_i, x_j) > t \} \).

This is motivated by trying to find small diameter partitions of X, which we will do by either proving (X, E) is not bipartite or finding a bipartition of it (see this previous post).

The question is how to do this efficiently. The obvious brute force approach is \(O(n^2)\), and indeed any solution must be \(O(n^2)\) in the worst case. We’d like to do well in practice by applying the following two key ideas:

  1. By using the triangle inequality, we may often be able to escape having to do distance calculations because we can infer the distance will be too large / too small
  2. We are likely not to need the entire graph – a proof that a graph is not bipartite may involve only a very small subset of the nodes, so if we build the graph incrementally we may be able to stop early
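
For reference, the brute force construction we’re trying to beat is just this (a throwaway sketch; points and d are whatever the metric space provides):

  def brute_force_edges(points, d, t):
      # O(n^2): compute every pairwise distance and keep the pairs that are
      # further apart than the threshold t.
      n = len(points)
      return {(i, j)
              for i in range(n)
              for j in range(i + 1, n)
              if d(points[i], points[j]) > t}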

The way we do this will be as follows:

For each node x, we keep track of two sets, Definite(x) and Candidates(x). We will also index distances as we find them by tracking Distances(x), which tracks all points we’ve already calculated the distance from x to.

Roles and implementation requirements

Definite(x) is a set containing all nodes we know are > t away from x, and starts empty. We require efficient addition of new elements without introducing duplicates, and an iterator which is guaranteed to visit every element in the set even if new ones are added during iteration, even after it has previously claimed there are no elements left. One way to do this would be to have it consist of both an array and a hash set.
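
A rough sketch of one way to get those properties, as suggested: pair an append-only list with a hash set for duplicate suppression, and have the iterator walk the list by index so that it also picks up elements added after it last claimed to be exhausted (the names here are mine, purely for illustration):

  class DefiniteSet:
      # Append-only set whose iterators also see elements added later.
      def __init__(self):
          self._items = []    # iteration order
          self._seen = set()  # duplicate suppression

      def add(self, item):
          if item not in self._seen:
              self._seen.add(item)
              self._items.append(item)

      def iterator(self):
          return DefiniteIterator(self._items)

  class DefiniteIterator:
      def __init__(self, items):
          self._items = items
          self._index = 0

      def next(self):
          # Returns the next element, or None if there are currently none left.
          # Calling it again later will return anything added in the meantime.
          if self._index < len(self._items):
              item = self._items[self._index]
              self._index += 1
              return item
          return None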

Candidates(x) contains all points which we don’t yet know for sure whether or not they are > t from x, and starts equal to X. As a result we want a set implementation for which it is cheap to allocate an instance containing the whole of X and cheap to delete from. This is where the previous post about integer sets comes in (we represent a node by its index).

Distances(x) is some sort of ordered map from distances to lists of points. It needs to support efficient range queries (i.e. give me everything nearer than s or further than s).
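
One cheap way to get those range queries is to keep the (distance, point) pairs sorted by distance and binary search for the cut point. A sketch (insertion into a sorted list is O(n), so a real implementation might prefer a balanced tree, but it shows the interface the algorithm needs):

  import bisect

  class Distances:
      # Distances from a fixed point x to the points we have already measured,
      # kept sorted by distance so we can answer the two range queries.
      def __init__(self):
          self._distances = []  # sorted
          self._points = []     # parallel to _distances

      def put(self, distance, point):
          i = bisect.bisect_right(self._distances, distance)
          self._distances.insert(i, distance)
          self._points.insert(i, point)

      def nearer_than(self, s):
          # All points y with d(y, x) <= s.
          return self._points[:bisect.bisect_right(self._distances, s)]

      def further_than(self, s):
          # All points y with d(y, x) > s.
          return self._points[bisect.bisect_right(self._distances, s):]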

The Algorithm

Our two basic operations for manipulating the data we’re tracking are Close(x, y) and Far(x, y). Close(x, y) removes x from Candidates(y) and y from Candidates(x). Far does the same thing but also adds x to Definite(y) and y to Definite(x).

Our iterator protocol is that an iterator has an operation Next which returns either None or an Element. We will construct Neighbours(x), which creates an iterator that incrementally returns all points > t away from x and uses information discovered whilst doing so to flesh out the information we know about other nodes too.

Here is the algorithm, written in some bizarre pseudocode language that looks like nothing I actually write (no, I don’t know why I’m using it either):

 
  Neighbours(x) = NI(x, Iterator(Definite(x)))

  Next(NI(x, found))

    # Keep examining candidates until the iterator over Definite(x) yields a
    # new neighbour or we run out of candidates
    while ((result = Next(found)) == None && not IsEmpty(Candidates(x))) do
      candidate = Pop(Candidates(x))
      s = d(x, candidate)

      if s > t then Far(x, candidate) else Close(x, candidate)

      # Because d(y, candidate) <= d(y, x) + d(x, candidate) <= (t - s) + s = t
      for (y in Distances(x) where d(y, x) <= t - s) Close(candidate, y)

      # Because d(y, candidate) >= d(y, x) - d(x, candidate) > (t + s) - s = t
      for (y in Distances(x) where d(y, x) > t + s) Far(candidate, y)

      # Because when s > t, d(y, candidate) >= d(x, candidate) - d(y, x) > s - (s - t) = t
      if s > t then for (y in Distances(x) where d(y, x) < s - t) Far(candidate, y)

      # If we have no more candidates left then Distances(x) will never be used
      # again and is wasted space
      if IsEmpty(Candidates(x)) then delete Distances(x) else Put(Distances(x), s, candidate)
    done

    return result

The idea is that as we’re searching for edges for a given node we’re also using this to fill out the edges for the other nodes. This should work particularly well for depth first search, because it often means that the nodes we’re going to transition to will have some of their neighbours filled in already, and we may in fact be able to spend most of our time just chasing through Definite sets rather than having to do new distance calculations. Even where we haven’t been able to do that, hopefully we’ve managed to prune the lists of elements to consider significantly before then.

Notes

  • It should be fairly evident that this is not only \(O(n^2)\) worst case, it's probably \(O(n^2)\) expected case for exploring the whole graph, even in reasonable metric spaces - it's likely that one or the other of the two sets of points we're looking at during each iteration step will be O(n) - so the only possible win over the brute force algorithm is if it does significantly fewer expensive distance calculations
  • I've not yet written any real code for this, so I'm sure some of the details are wrong or awkward
  • Even if it's not wrong or awkward it may well turn out to have sufficiently bad constant factors, or save few enough distance invocations, that it's worthless in practice

Integer sets with O(1) range allocation and O(log(n)) deletion

This is another post brought to you by the school of the bloody obvious. Or at least it should be, but despite having implemented a trivial variation on this data structure before it still took me far too long to remember it existed.

My previous use case was the original version of IntAllocator in the rabbitmq Java client API (it’s an internal class used for allocating channel numbers). In looking it up I’ve noticed that it no longer works this way, for reasons that do make sense, but those reasons don’t apply to my current use case (basically the new version uses an internal bitset, trading space for time. RabbitMQ only needs to allocate one of these, and the range isn’t that large, so trading space for time is completely acceptable).

My current use case was for building separation graphs for metric spaces. Part of an algorithm I’m playing with requires keeping a candidate list for each element: That is, a list of elements which might be > t away but we’re not sure. This needs to not take up O(n) space when the answer is “the whole set” and we need to be able to remove elements from it efficiently. Hence the constraints in the title.

The resulting structure is basically a range compressed bitset built out of a balanced binary tree. It works as follows:

A node in the tree may be either empty, an inclusive range represented by its endpoints, or a split around a pivot m such that the left hand side of the tree contains elements <= m and the right contains elements > m. We enforce the invariants that neither child of a split may be empty, and that ranges must have endpoints which admit at least one element.

Allocating an entire range is then just a matter of creating a range node holding two integers, i.e. O(1).

Deleting an element is harder, but not much.

  • Deleting an element from an empty node does nothing.
  • Deleting an element from a range has three distinct cases:
    1. If the element is not contained within the range, do nothing
    2. If the element is one of the endpoints, decrement or increment the relevant endpoint. If this results in the range being empty, replace this with an empty node
    3. If the element is internal to the range, split this range around its midpoint and delete from the split instead
  • Deleting an element from a split is even more straightforward: You determine which child the element should belong in and delete from that. If this results in that child being empty, replace this node with the other child.
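
There’s a Haskell implementation linked in the notes below; here is a rough sketch of the same idea (splitting internal deletions around the midpoint, as described above), purely to make the three node types and the deletion cases concrete:

  class Empty:
      def delete(self, x):
          return self

  class Range:
      # Inclusive range [lo, hi] with lo <= hi. Allocating a whole range is O(1).
      def __init__(self, lo, hi):
          self.lo, self.hi = lo, hi

      def delete(self, x):
          if x < self.lo or x > self.hi:
              return self  # not contained: do nothing
          if x == self.lo:
              return Empty() if self.lo == self.hi else Range(self.lo + 1, self.hi)
          if x == self.hi:
              return Range(self.lo, self.hi - 1)
          # Internal to the range: split around the midpoint and delete from
          # the split instead.
          mid = (self.lo + self.hi) // 2
          return Split(mid, Range(self.lo, mid), Range(mid + 1, self.hi)).delete(x)

  class Split:
      # Left child holds elements <= pivot, right child holds elements > pivot.
      # By the invariant, neither child is ever Empty.
      def __init__(self, pivot, left, right):
          self.pivot, self.left, self.right = pivot, left, right

      def delete(self, x):
          if x <= self.pivot:
              left = self.left.delete(x)
              return self.right if isinstance(left, Empty) else Split(self.pivot, left, self.right)
          right = self.right.delete(x)
          return self.left if isinstance(right, Empty) else Split(self.pivot, self.left, right)

So the set containing everything from 0 to 999 is just Range(0, 999), and deleting an element from the middle of it costs a logarithmic number of small node allocations rather than anything proportional to the size of the range.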
Notes

  • I’ve put together a Haskell implementation to demonstrate the algorithm
  • There are at least two sensible ways to split a range. You can either do it around the element being deleted (which guarantees you’ll not have to recurse – you just allocate a split and two new ranges) or you can do it around the midpoint of the range. The former is cheaper but can result in the tree becoming unbalanced; the latter guarantees you’ll never need more than log(n) levels of recursion, but also means most of the time you’ll use log(n) levels of recursion. You could also implement a hybrid where you split around the deleted point unless you’re below a certain depth in the tree, but I’m not sure it’s worth the added implementation cost
  • It is useful for split nodes to keep track of the smallest interval containing them, as you can then shortcut if the element to be deleted lies outside that interval
  • You can also implement deleteRange efficiently (I think in O(log(n)), though there’s a detail I haven’t checked there). This is left as an exercise for the interested reader
  • When I implemented IntAllocator I added a cache for this: a bounded size stack which you pushed elements to delete onto. When the stack grew full you sorted it into ranges and deleted the ranges all at once. There was a specific reason for this that doesn’t apply here (sometimes you wanted to say “pop an element, any element” and that came from the stack if it was non-empty), but this might still be a useful thing to do for imperative versions of this code (it’s a bad idea for pure ones)

Butternut squash, coconut and chickpea curry

We had our housewarming party last night, and I’d made this as food. A friend asked me for the recipe, and I was surprised it wasn’t already on here due to it being one of my standards, so I’m now fixing that.

I should warn you that this recipe is largely a lie, in that the details change every time I make it. This is the version I made last night though (well, this is a halved version of that).

Ingredients

  • 1 white onion
  • A smallish knob of fresh ginger
  • Fresh chilli to taste (depends on type of chilli and spiciness you want. Do use fresh rather than dried though)
  • One medium butternut squash
  • One can chickpeas
  • One small can of coconut cream
  • A bit of sugar
  • Quite a lot of salt
  • Quite a lot of garam masala
  • Plenty of sunflower oil

I can’t really give precise measurements on the salt and garam masala because I was adding them continuously throughout until I got the taste right. My guess is several teaspoons of salt and several tablespoons of garam masala.

Cooking

  1. Dice the ginger, onion and chilli finely.
  2. Peel the squash and cut it into roughly 1cm cubes
  3. Fry the ginger, onion and chilli with salt and sugar until the onion starts to go transparent
  4. Add the squash and garam masala. Mix it all up thoroughly and fry for a little bit longer
  5. Add boiling water until the squash is nearly covered. Allow to simmer for a while, stirring occasionally
  6. When the squash is starting to go soft, add the coconut cream. Continue simmering and stirring
  7. Eventually the squash should be breaking up, causing the sauce to thicken. Once it’s starting to taste nearly cooked, add the chickpeas and continue simmering

At this point it’s basically ready to serve whenever, but it improves significantly if you leave it to cook on a low heat for longer. Also, keep tasting it whilst it’s cooking and add more salt and garam masala to taste.


A cute algorithm for a problem you’ve never cared to solve

As part of my recent playing with metric search structures, I was wondering about the following problem: Given a metric space X I want to partition X into two sets A and B such that each of A and B has a small diameter. Equivalently, I’m trying to find A that minimizes max(diam(A), diam(\(A^c\))).

The brute force solution for this would be fairly awful: There are \(2^n\) partitions, calculating the score of each is \(O(n^2)\), so the brute force solution takes \(O(2^n n^2)\). So let’s not do that.

I was expecting that I’d have to find an approximate solution, as this looked like a classic hairy NP-hard optimization problem. I’d started contemplating local optimizations, simulated annealing, etc. Then after doing a bit more searching I happened across this paper, which in fact has a polynomial time exact solution to the problem: Diameter Partitioning by David Avis.

The algorithm is incredibly cute – it uses absolutely no advanced ideas, just two clever observations which reduce the entire problem to elementary algorithms. It’s the sort of thing which as soon as you see it you just kick yourself for not having thought of it. So I thought I’d write about it to share how much I enjoyed it, even if no one else is actually likely to care about solving this problem.

The first observation is this:

We can find a partition A, B which has max(diam(A), diam(B)) \(\leq\) t if and only if we can find two sets A, B such that whenever \(d(x, y) > t\), x and y are in different halves of the partition (this is just restating the definition).

That is, if we draw an edge between x and y if and only if \(d(x, y) > t\), then a partition with max(diam(A), diam(B)) \(\leq\) t is precisely a bipartition (a proper 2-colouring) of this graph.

But bipartitions are easy to find! Or at least, \(O(n^2)\) to find, which is no worse than calculating the diameter of a set in the first place. You just do a graph search on each of the components, marking nodes alternately black and white (i.e. if you’re coming from a white node you mark this node black, if you’re coming from a black node you mark this node white) as you find them, and yell “Impossible!” if you ever try to paint a node a different colour than you’ve already painted it. Done.
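
Here’s roughly what that check looks like in code, for a fixed threshold t (a plain breadth first 2-colouring; representing the separation graph as an edge list is just for the sketch):

  from collections import deque

  def try_two_colour(n, edges):
      # Given the separation graph on nodes 0..n-1 (edges join pairs more than
      # t apart), return a partition (A, B) in which both halves have diameter
      # <= t, or None if some component contains an odd cycle.
      neighbours = [[] for _ in range(n)]
      for i, j in edges:
          neighbours[i].append(j)
          neighbours[j].append(i)

      colour = [None] * n
      for start in range(n):
          if colour[start] is not None:
              continue
          colour[start] = 0
          queue = deque([start])
          while queue:
              node = queue.popleft()
              for other in neighbours[node]:
                  if colour[other] is None:
                      colour[other] = 1 - colour[node]
                      queue.append(other)
                  elif colour[other] == colour[node]:
                      return None  # "Impossible!"
      return ([i for i in range(n) if colour[i] == 0],
              [i for i in range(n) if colour[i] == 1])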

So for any given t we have an \(O(n^2)\) algorithm that either finds us a partition at least that good or tells us that none exists. Now how do we find a best possible partition?

The obvious answer is binary search, but binary searching on floats is never going to give us the exact answer, only a series of successively better approximations. This then motivates the second observation: Our distances are discrete.

If we work out the distance between every pair (we already have to do that for building the graph) then you can put these pairs + their distances in an array and sort that array. Then it’s just a simple matter of binary searching that array, and additionally you can use it to build the graph without calculating any more distances – just take the pairs to the right of your current point in the array.

Building and sorting the array is \(O(n^2 log(n))\), and the binary search performs \(log(n)\) bipartiteness checks, each an \(O(n^2)\) operation, so in total the cost of this algorithm is \(O(n^2 log(n))\). A lot better than brute force.
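
Putting the two observations together, using the try_two_colour sketch from above (again, an illustrative sketch; points and d are whatever the metric space provides):

  def best_partition(points, d):
      # Binary search over the sorted pairwise distances for the smallest
      # threshold t whose separation graph is 2-colourable.
      n = len(points)
      pairs = sorted((d(points[i], points[j]), i, j)
                     for i in range(n) for j in range(i + 1, n))
      if not pairs:
          return 0, (list(range(n)), [])

      def check(index):
          # Candidate thresholds are the distances themselves; the edges are
          # the pairs strictly further apart than the candidate.
          t = pairs[index][0]
          edges = [(i, j) for (dist, i, j) in pairs if dist > t]
          return t, try_two_colour(n, edges)

      lo, hi = 0, len(pairs) - 1
      best = check(hi)  # the largest distance always succeeds: no edges at all
      while lo <= hi:
          mid = (lo + hi) // 2
          t, partition = check(mid)
          if partition is not None:
              best = (t, partition)
              hi = mid - 1
          else:
              lo = mid + 1
      return best  # (achieved max diameter, (A, B))

The one wrinkle is ties: filtering on dist > t, rather than just slicing the sorted array at the current index, keeps pairs at exactly the threshold distance out of the graph.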

Notes

  • It apparently is NP-hard to do this if you want a partition into more than two parts
  • Any algorithm which guarantees to get within a factor of two of the right answer has to examine every pair of points and thus must be \(O(n^2)\). You can see this by constructing a metric space of n points, picking a distinguished pair x, y and setting \(d(x, y) = 2\) whilst setting all other distances to 1. Any partition which separates x and y will have a score of 1, any other will have a score of 2, and by randomly picking x and y you can see that there’s no way to know this except by examining every pair
  • The above example also demonstrates that it’s \(O(n^2)\) to even calculate a diameter, which means that this is also a lower bound on even knowing for sure how good your partition is
  • If you don’t care about solving this exactly and only care that the result is “good” in some fuzzy sense I think you can do a lot better in practice. There are good approximate diameter algorithms, and we didn’t take advantage of the triangle inequality at all in our algorithm for building the separation graph, so I think it’s possible to adjust the partition testing code in a way that is much better than \(O(n^2)\) in many nice cases, even if it necessarily degenerates to \(O(n^2)\) in the worst cases. I’ll write more about this once I’ve worked out the details.