This is a thing that shouldn’t be hard to do and yet I manage to get it wrong every single time. I’m writing this up partly to try to save you from my fate and partly in the hope that if I write it up I will remember this next time.
Consider the following problem: You have a a matrix of booleans A. It currently stores a relation such that:
- It is reflexive: for all t, A[t, t] is true
- It is transitive: for all t, u, v, if A[t, u] and A[u, v] are both true then A[t, v] is true.
(the first condition isn’t strictly necessary for this, but it makes the whole thing easier and if you have something satisfying it but not the first you can just set A[i, i] to true for all i and have a relation satisfying both properties)
Such a relation is called a preorder. The main interesting special cases preorders are equivalence relations and partial orders. This is mostly interesting in things that look more like partial orders than equivalence relations, as equivalence relations admit a much more efficient storage format and corresponding algorithms (you just store a single array of length n labelling each element with a unique number representing its equivalence class).
Now, you pick some pair i != j such that neither A[i, j] nor A[j, i]. You want to set A[i, j] to true but maintain the above invariants.
There is a simple linear time algorithm for this. It’s efficient, easy to understand and entirely wrong, yet somehow I keep falling for it:
def wrong_update(A, i, j): n = A.shape[0] for k in xrange(n): if A[k, i]: A[k, j] = True if A[j, k]: A[i, k] = True |
Why is this wrong? It’s wrong because it misses out on pairs u, v where initially A[u, i] and A[j, v] but not A[u, v].
In fact, seeing that problem makes it relatively easy to see that there can’t possibly be an algorithm for this that does better than quadratic time.
Let n = 2k. Construct the matrix as follows: A[i, j] = (i < j) and (i < k == j < k).
That is, we’ve split the range of indices into two halves. Each half is the usual total order, but there is no relation between things in either half.
Now suppose we want to set A[k-1, k] to be true. For u < k we have A[j, k-1] and for v > k we have A[k, v]. So transitivity requires us to set A[u, v] for any u < k and v >= k. This requires us to change k^2 = n^2 / 4 = O(n^2) elements of A from false to true.
We can in fact do it in O(n^2). Consider the following algorithm:
def correct_update(A, i, j): n = A.shape[0] for u in xrange(n): for v in xrange(n): if A[u, i] and A[j, v]: A[u, v] = True |
Does this produce a transitive relation?
For convenience of analysis, instead consider the version which produces a matrix B such that B[u, v] is true if A[u, v] is true or if A[u, i] and A[j, v]. This is effectively that followed by assigning A = B (note: These are only equivalent given the result that B is transitive, so any changes we’ve written to A during iteration can’t change the result).
Lets check that B has all the desired properties.
Certainly B[t, t] is true because A[t, t] is true.
Suppose B[t, u] and B[u, v]. There are four cases to consider based on where these come from.
- If A[t, u] and A[u, v] then A[t, v] by transitivity of A. Thus B[t, v]
- If A[t, u] and not A[u, v] then we have A[t, u], A[u, i], A[j, v]. So A[t, i] and A[j, v], and thus B[t, v].
- If not A[t, u] and A[u, v] then A[t, i] and A[j, u]. So A[j, v] by transitivity and thus B[t, v] again by definition of B.
- Finally if not A[t, u] and not A[u, v] then A[t, i], A[j, u], A[u, i], A[j, v]. Therefore by transitivity of A we have A[j, i]. This contradicts our precondition that neither A[i, j] nor A[j, i], so this case is impossible.
So this works, which is nice.
Can we do better?
Well, as established, sometimes we have to do O(n^2) operations. So we can’t asymptotically do better in the worst case, but if the matrix is sparse we can do the whole thing in O(n + gh ) with an additional O(g + h) of space, where g=|{k such that A[k, i]}|, h=|{k such that A[j, k]}| by first preprocessing the list.
def faster_correct_update(A, i, j): n = A.shape[0] left = [k for k in xrange(n) if A[k, i]] right = [k for k in xrange(n) if A[j, k]] for u in left: for v in right: A[u, v] = True |
Other minor variations:
- you can do the first step in one pass if you are willing to spend the full O(n) space) by just allocating a vector of size n, writing k with A[k, i] on the left and A[j, k] on the right
- If you don’t want to spend the extra space you can do this in O(n + n min(g, h)) by just doing the earlier version, picking whichever side is smaller as the outer loop, and skip the inner loop when the outer variable doesn’t satisfy the relevant condition.
Another thing worth noting is that if A is anti-symmetric (for any u != v, not (A[u, v] and A[v, u])) then so is the resulting relation. i.e. if A encodes a partial order then so does B. Suppose B[u, v] and B[v, u]. Consider two cases:
- If one of these is from A, assume without loss of generality that A[u, v]. Then we know that not A[v, u] because A is anti-symmetric. So we must have that A[v, i] and A[j, u]. But then by transitivity we have A[u, i] and A[j, u], so A[j, i], contradicting the precondition on i and i.
- If neither of them are from A then we have A[u, i] and A[j, v], A[v, i] and A[j, u]. So transitivity again gives us A[j, i], another contradiction.
Pingback: Anatomy of a bug hunt | David R. MacIver