
Extending a preorder efficiently and correctly

This is a thing that shouldn’t be hard to do and yet I manage to get it wrong every single time. I’m writing this up partly to try to save you from my fate and partly in the hope that if I write it up I will remember this next time.

Consider the following problem: You have a matrix of booleans A. It currently stores a relation such that:

  1. It is reflexive: for all t, A[t, t] is true
  2. It is transitive: for all t, u, v, if A[t, u] and A[u, v] are both true then A[t, v] is true.

(the first condition isn’t strictly necessary for this, but it makes the whole thing easier, and if you have a relation satisfying the second condition but not the first you can just set A[i, i] to true for all i and get one satisfying both properties)

Such a relation is called a preorder. The main interesting special cases of preorders are equivalence relations and partial orders. This is mostly interesting for things that look more like partial orders than equivalence relations, as equivalence relations admit a much more efficient storage format and corresponding algorithms (you just store a single array of length n labelling each element with a number identifying its equivalence class).

Now, you pick some pair i != j such that neither A[i, j] nor A[j, i]. You want to set A[i, j] to true but maintain the above invariants.

There is a simple linear time algorithm for this. It’s efficient, easy to understand and entirely wrong, yet somehow I keep falling for it:

def wrong_update(A, i, j):
    # Propagate the new pair (i, j) one step in each direction.
    # This is not enough to preserve transitivity, as explained below.
    n = A.shape[0]
    for k in range(n):
        if A[k, i]:
            A[k, j] = True
        if A[j, k]:
            A[i, k] = True

Why is this wrong? It’s wrong because it misses out on pairs u, v where initially A[u, i] and A[j, v] but not A[u, v].

In fact, seeing that problem makes it relatively easy to see that there can’t possibly be an algorithm for this that does better than quadratic time.

Let n = 2k. Construct the matrix as follows: A[i, j] = (i <= j) and ((i < k) == (j < k)).

That is, we’ve split the range of indices into two halves. Each half carries the usual total order, but there are no relations between elements of different halves.

Now suppose we want to set A[k-1, k] to be true. For u < k we have A[u, k-1], and for v >= k we have A[k, v]. So transitivity requires us to set A[u, v] for every u < k and v >= k. This requires us to change k^2 = n^2 / 4 = O(n^2) elements of A from false to true.

We can in fact do it in O(n^2). Consider the following algorithm:

def correct_update(A, i, j):
    n = A.shape[0]
    # For every u that relates to i and every v that j relates to,
    # u must now also relate to v (reflexivity covers i and j themselves).
    for u in range(n):
        for v in range(n):
            if A[u, i] and A[j, v]:
                A[u, v] = True

Does this produce a transitive relation?

For convenience of analysis, instead consider the version which produces a new matrix B such that B[u, v] is true if A[u, v] is true or if A[u, i] and A[j, v] are. This is effectively the algorithm above followed by assigning A = B (note: the two are only equivalent given the result below that B is transitive, so that the entries we write to A mid-iteration can’t require any further changes).

Let’s check that B has all the desired properties.

Certainly B[t, t] is true because A[t, t] is true.

Suppose B[t, u] and B[u, v]. There are four cases to consider based on where these come from.

  1. If A[t, u] and A[u, v] then A[t, v] by transitivity of A. Thus B[t, v]
  2. If A[t, u] and not A[u, v] then we have A[t, u], A[u, i], A[j, v]. So A[t, i] and A[j, v], and thus B[t, v].
  3. If not A[t, u] and A[u, v] then A[t, i] and A[j, u]. So A[j, v] by transitivity and thus B[t, v] again by definition of B.
  4. Finally if not A[t, u] and not A[u, v] then A[t, i], A[j, u], A[u, i], A[j, v]. Therefore by transitivity of A we have A[j, i]. This contradicts our precondition that neither A[i, j] nor A[j, i], so this case is impossible.

So this works, which is nice.
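
For concreteness, here is a quick check of this claim (a sketch only; it assumes A is a numpy boolean array, which is what the A.shape calls above suggest). It builds the two-halves matrix from the lower bound argument and confirms that wrong_update breaks transitivity while correct_update preserves it:

import numpy as np

def two_halves(k):
    # The worst case matrix from above: two chains of length k, with no
    # relations between the halves.
    n = 2 * k
    A = np.zeros((n, n), dtype=bool)
    for a in range(n):
        for b in range(n):
            A[a, b] = (a <= b) and ((a < k) == (b < k))
    return A

def is_transitive(A):
    n = A.shape[0]
    return all(not (A[t, u] and A[u, v]) or A[t, v]
               for t in range(n) for u in range(n) for v in range(n))

k = 4
A1, A2 = two_halves(k), two_halves(k)
wrong_update(A1, k - 1, k)
correct_update(A2, k - 1, k)
print(is_transitive(A1))  # False: one-step propagation misses pairs u, v
print(is_transitive(A2))  # True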

Can we do better?

Well, as established, sometimes we have to do O(n^2) operations. So we can’t asymptotically do better in the worst case, but if the matrix is sparse we can do the whole thing in O(n + gh) time with an additional O(g + h) of space, where g = |{k such that A[k, i]}| and h = |{k such that A[j, k]}|, by first precomputing those two sets of indices.

def faster_correct_update(A, i, j):
    n = A.shape[0]
    # Precompute the g predecessors of i and the h successors of j,
    # then only touch the g * h cells that can actually change.
    left = [k for k in range(n) if A[k, i]]
    right = [k for k in range(n) if A[j, k]]
    for u in left:
        for v in right:
            A[u, v] = True

Other minor variations:

  1. You can do the preprocessing step in one pass, if you are willing to spend the full O(n) space, by allocating a single vector of size n and writing each k with A[k, i] at the left end and each k with A[j, k] at the right end.
  2. If you don’t want to spend the extra space you can do this in O(n + n min(g, h)) time by using the earlier quadratic version, making whichever side is smaller the outer loop, and skipping the inner loop whenever the outer variable doesn’t satisfy the relevant condition (sketched below).
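
A sketch of that second variation (the function name low_memory_update is mine, purely for illustration of the idea):

def low_memory_update(A, i, j):
    # Same effect as correct_update, but with no extra storage: make the
    # rarer condition the outer loop and bail out of rows or columns that
    # can't contribute. O(n + n * min(g, h)) time. Iterating in place is
    # safe because column i and row j never change (A[j, i] is false).
    n = A.shape[0]
    g = sum(1 for k in range(n) if A[k, i])
    h = sum(1 for k in range(n) if A[j, k])
    if g <= h:
        for u in range(n):
            if A[u, i]:
                for v in range(n):
                    if A[j, v]:
                        A[u, v] = True
    else:
        for v in range(n):
            if A[j, v]:
                for u in range(n):
                    if A[u, i]:
                        A[u, v] = True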

Another thing worth noting is that if A is anti-symmetric (for any u != v, not (A[u, v] and A[v, u])) then so is the resulting relation, i.e. if A encodes a partial order then so does B. To see this, suppose u != v with B[u, v] and B[v, u], and consider two cases:

  1. If one of these is from A, assume without loss of generality that A[u, v]. Then we know that not A[v, u] because A is anti-symmetric. So we must have that A[v, i] and A[j, u]. But then by transitivity we have A[u, i] and A[j, u], so A[j, i], contradicting the precondition on i and j.
  2. If neither of them are from A then we have A[u, i] and A[j, v], A[v, i] and A[j, u]. So transitivity again gives us A[j, i], another contradiction.

It turns out integer linear programming solvers are really good

In the comments to my last post, pozorvlak pointed out that what I was describing was awfully close to the branch and bound algorithm that an integer linear solver would be using anyway and asked if I had tried just adding integer constraints to the LP.

I did actually already know that what I was doing was a form of branch and bound. I’d come up with it independently before realising this, but alas there are no prizes for reinventing a 50 year old algorithm. The question was whether the way I was applying it was a good one or if I’d be better off just using the standard one in the ILP solver. So how do they compare?

Well, uh, there’s a funny story on that one which explains why I hadn’t tried just adding integer constraints to the LP. I actually tried it ages back, found it intolerably slow, and gave up on it as a bad idea.

Basically I only realised quite late how effective the LP was even at larger sizes, because I spent the first chunk of working on this problem thinking that the LP couldn’t possibly scale to more than about 10 elements even without the integer constraints.

The reason for this is that I was using PuLP as an interface to the LP solver with its GLPK bindings. This gave me huge performance problems – it’s probably fine if you’re setting up an LP once or twice, maybe, but its performance at building the LP is terrible. It’s several orders of magnitude slower than actually running the LP.

So, I ripped out PuLP and wrote my own interface to lp_solve because it was easy enough to just generate its text format and do that (thanks to Bill Mill for the suggestion). Even that was quite slow at large n though – because the size of the LP is \(O(n^3)\), at large \(n\) you’re generating and parsing several MB of text for a decent-sized problem. So even then the size of the LP I could hand off to it was strictly bounded.

Anyway, to cut a long story short, I’m now using the C API to lp_solve directly so setting up the model has become a lot cheaper (it’s still surprisingly slow for n bigger than around 30, but not intolerably so), and yeah, it turns out that lp_solve’s integer constraints scale a lot better than my custom-tailored branch and bound algorithm does.

Moreover, after a while tweaking things and trying various clever stuff and huffily insisting that of course my approach was better and I just needed to optimise it more, I think I’ve convinced myself that the ILP branch and bound algorithm is intrinsically better here, for a reason that is both straightforward and kinda interesting.

The bit about it that is somewhat interesting is that it seems empirically that the LP that results from FAST in some sense “wants” to be 0/1 valued. The vast majority of the time the solution you get out of the relaxed LP is just integer valued automatically and you can call it a day as soon as you get the answer back from it.

The interesting thing is what happens when you don’t. You then have to branch, compute a new LP for each branch, and see what happens there. If each of those sub-problems is integral (or costs at least as much as your current best integer solution) then you’re done, else you’ll have to recurse some more.

And when this happens, my solution branches to n sub problems and the normal ILP branch and bound branches to two. If each is equally likely to produce integer valued sub-problems then the latter is obviously better.

My solution could be better if it chose sub-problems that were substantially more likely to be integer valued. However this doesn’t seem to be the case – empirically it seems to almost always need to go at least two levels deep if it doesn’t find an integer solution immediately. In fact, it seems plausible to me that the ILP solver’s greater freedom in choosing how to branch makes it more likely to reach integer valued sub-problems. I don’t know if it does this, but it could just choose n/2 pairs to branch on, see if any of them produce both sides as integers, and be done if so. At that point it would have computed the same number of LPs as my solution but with a substantially higher chance of not having to recurse further.

So, in conclusion, the better algorithm has won. It was a fun try in which I learned a few things but ultimately as a research avenue not very productive. The only thing from it that might be interesting to pursue further is something I actually didn’t mention in the last post, which is how to use the relaxed LP to get a good heuristic ordering, but I suspect that the quality of that ordering is directly correlated with how easy it is to just get the right answer from the ILP solver so even that is probably a dead end.


Optimal solutions to FAST

This is another post about FAST, which I introduced in the last post along with some heuristic solutions. In this one I’d like to talk about finding exact solutions to it.

Note: There’s a lot of literature about this and I am only starting to get to grips with it (I have a book on order which will help with that). What I’m presenting does not I think add anything very new to the state of the art except that it was a lot simpler for me to understand than anything equivalently good I could find.

My not very well optimised python version of this can handle \(20 \times 20\) matrices pretty quickly (a couple of seconds) and \(25 \times 25\) in under a minute. This is actually a pretty good achievement. I think a reasonably well written C or C++ version that wasn’t committing atrocities in how it communicated with the LP solver could easily be an order of magnitude and possibly two faster, which might bring things up to about \(30 \times 30\) within feasible reach.

Also, if you want to see code instead of maths, you can just skip straight to the repo where I’ve implemented this solution (plus a few details not mentioned here).

There are a couple of ideas in this solution:

Firstly, note that we only need to find the weight of the optimal solution. If we can do this then we can reconstruct an optimal solution with only \(O(n^2)\) additional calls (and because of the way this solution uses dynamic programming this will actually be a lot faster than it sounds): we can find the head of the solution by looking at the weights of optimal solutions for each of the subsets of size \(n - 1\) and then recursively finding an ordering for the best subset. So specifically we want the function \(\mathrm{FasWeight}(A) = \min\limits_{\sigma \in S_n} w(A, \sigma)\).

From here on, let \(A\) be fixed. We can also assume without loss of generality that \(A\) is non-negative and integer valued, so we do, because it makes our life easier.

The starting point for our solution will be the following recursive brute force implementation of the function \(\mathrm{FasWeightSubset}(U)\) where \(U \subseteq N = \{1, \ldots, n\}\).

If \(|U| \leq 1\) then \(\mathrm{FasWeightSubset}(U) = 0\)

Else \(\mathrm{FasWeightSubset}(U) = \min\limits_{x \in U} \left[ \mathrm{FasWeightSubset}(U \setminus \{x\}) + \mathrm{HeadScore}(x, U) \right]\)

Where \(\mathrm{HeadScore}(x, U) = \sum\limits_{y \in U, y \neq x} A_{yx}\).

What we’re doing here is conditioning on which element \(x\) is at the front of the list. If \(x\) is at the front then the optimal score is the optimal score for ordering \(U\) without \(x\), plus the cost of all the backward arcs into \(x\). We take the minimum over all \(x\) and get the desired result.

If one uses dynamic programming to memoize the result for each subset (which you really have to do in order for it to be remotely feasible to run this) then this is an \(O(n^2 2^n)\) algorithm. It also uses rather a lot of memory (\(\sum\limits_{k \leq n} k {n \choose k} = n 2^{n-1}\), which is basically “lots”).
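
In Python the memoized brute force might look something like this (a sketch; it assumes a non-negative numpy matrix indexed from 0, per the assumptions above):

from functools import lru_cache

def fas_weight(A):
    n = A.shape[0]

    def head_score(x, U):
        # Cost of the backward arcs into x if x is placed first among U.
        return sum(A[y, x] for y in U if y != x)

    @lru_cache(maxsize=None)
    def fas_weight_subset(U):
        # U is a frozenset of indices, so the memoization key is the subset.
        if len(U) <= 1:
            return 0
        return min(fas_weight_subset(U - {x}) + head_score(x, U) for x in U)

    return fas_weight_subset(frozenset(range(n)))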

The idea for improving this is that if we have some fast way of calculating a lower bound on how good a subset could possibly be, then we may be able to skip some subsets without doing the full calculation. e.g. a trivial lower bound is \(0\): if a subset would need a negative score to improve the current best then we can disregard it. A slightly less trivial lower bound is \(\sum\limits_{x, y \in U, x < y} \min(A_{xy}, A_{yx})\), because for each pair of indices you must pay the cost of an arc in at least one direction. It turns out that neither of these bounds is good enough, but we’ll defer the question of what is. For now assume we have a function \(\mathrm{LowerBound}(U)\) which provides some number such that every ordering of \(U\) has weight \(\geq \mathrm{LowerBound}(U)\).

We can now use this to define the function \(\mathrm{CappedFasWeightSubset}(U, t)\). This will be equal to \(\min(\mathrm{FasWeightSubset}(U), t)\), but we can offer a more efficient algorithm for it:

For \(U\) with \(\mathrm{LowerBound}(U) \geq t\), let \(\mathrm{CappedFasWeightSubset}(U, t) = t\).

Otherwise, \(\mathrm{CappedFasWeightSubset}(U, t) = \min\limits_{x \in U} \left[ \mathrm{CappedFasWeightSubset}(U \setminus \{x\}, t - \mathrm{HeadScore}(x, U)) + \mathrm{HeadScore}(x, U) \right]\)

At each recursive call, placing \(x\) at the head incurs a certain cost and thus lowers the threshold we pass to the tail.

In fact we can do better than this: given any known solution whose weight is below the threshold, we can lower the threshold to that weight, as we’re only interested in whether we can improve on it. This also means that we can use the running minimum over the sub-problems to reduce the threshold as we go. I won’t write that out in the current notation because it’s a pain; it’ll be clearer in the code though.

This is basically it. There are a few minor optimisations, and how to memoize the results may not be obvious (you don’t want the threshold in the cache key or you’ll get zero hits), but there’s not much more to it than that.
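
Putting the pieces together, here is a sketch of the capped version (mine, not the actual implementation in the repo). It uses the pairwise lower bound from above, caches exact values separately from proven lower bounds so the threshold never appears in a cache key, and starts the threshold just above a trivial upper bound so the top-level call returns the true optimum:

def fas_weight_bb(A):
    n = A.shape[0]
    exact = {}     # subset -> known exact optimal weight
    known_lb = {}  # subset -> best lower bound proven so far

    def head_score(x, U):
        return sum(A[y, x] for y in U if y != x)

    def lower_bound(U):
        # For every unordered pair you must pay at least the cheaper arc.
        return sum(min(A[x, y], A[y, x]) for x in U for y in U if x < y)

    def capped(U, t):
        # Returns min(optimal weight of an ordering of U, t).
        if len(U) <= 1:
            return 0
        if U in exact:
            return min(exact[U], t)
        lb = max(known_lb.get(U, 0), lower_bound(U))
        known_lb[U] = lb
        if lb >= t:
            return t
        best = t
        for x in U:
            hs = head_score(x, U)
            best = min(best, capped(U - {x}, best - hs) + hs)
        if best < t:
            exact[U] = best  # the cap never bit, so this is the true optimum
        else:
            known_lb[U] = max(known_lb[U], t)
        return best

    # A threshold strictly above any possible weight, so the top-level
    # call is never capped and returns the exact answer.
    trivial_upper = sum(A[x, y] for x in range(n) for y in range(n) if x != y)
    return capped(frozenset(range(n)), trivial_upper + 1)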

However: This works about as badly as the earlier version if your lower bound isn’t good enough – even with an only reasonably good one I found I was only cutting out maybe 70% of the subsets, which was nice but not actually worth the cost of calculating the lower bound.

So where do we get better lower bounds?

Well it turns out that there’s another formulation of the problem. You can treat this as an instance of integer linear programming. It requires \(O(n^2)\) variables and \(O(n^3)\) constraints, so the integer linear programming problem rapidly becomes too difficult to calculate, but the convex relaxation of it (i.e. the same linear program without the integer constraints) provides a good lower bound.

The way this works is that we introduce 0/1 variables \(D_{ij}\) for each pair \(i \neq j\). The \(D_{ij}\) define a total ordering \(\prec\) of \(N\), with \(D_{ij} = 1\) if \(i \prec j\). This gives us the following ILP:

Minimize: \(\sum\limits_{i \neq j} A_{ji} D_{ij}\) (you pay the cost of the backwards arcs)

Subject to: \(\forall i \neq j,\ D_{ij} + D_{ji} = 1\) (exactly one of \(i \prec j\) and \(j \prec i\) holds, i.e. the order is total and antisymmetric)

And: \(\forall i, j, k, D_{ij} + D_{jk} + D_{ki} \leq 2\) (the order is transitive)

Solving this linear program without the integer constraints then gives us a fairly good lower bound.
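
For illustration, here is how one could set the relaxation up with scipy’s linprog (this is not the post’s code, which talks to lp_solve via its C API; it’s just a sketch of the formulation above):

import itertools

import numpy as np
from scipy.optimize import linprog

def fast_lp_lower_bound(A):
    n = A.shape[0]
    # One variable D[i, j] per ordered pair i != j, flattened into a vector.
    pairs = [(i, j) for i in range(n) for j in range(n) if i != j]
    idx = {p: k for k, p in enumerate(pairs)}

    c = np.zeros(len(pairs))
    for (i, j), k in idx.items():
        c[k] = A[j, i]  # an ordering with i before j pays for the arc j -> i

    # D[i, j] + D[j, i] = 1: exactly one direction is chosen for each pair.
    A_eq, b_eq = [], []
    for i, j in itertools.combinations(range(n), 2):
        row = np.zeros(len(pairs))
        row[idx[i, j]] = row[idx[j, i]] = 1
        A_eq.append(row)
        b_eq.append(1)

    # D[i, j] + D[j, k] + D[k, i] <= 2: no directed 3-cycles (transitivity).
    A_ub, b_ub = [], []
    for i, j, k in itertools.permutations(range(n), 3):
        if i < j and i < k:  # take each directed triangle once
            row = np.zeros(len(pairs))
            row[idx[i, j]] = row[idx[j, k]] = row[idx[k, i]] = 1
            A_ub.append(row)
            b_ub.append(2)

    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq,
                  bounds=(0, 1), method="highs")
    return res.fun, res.x  # the lower bound, and the (possibly fractional) D

(The i < j and i < k filter just avoids generating each triangle constraint three times over.)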

What’s more, it turns out that often the integer constraints happen to be satisfied anyway. This allows us a further shortcut: If the lower bound we found turns out to have all integer variables, we can just immediately return that as the solution.


Using voting systems for combinatorial optimisation problems

Happy new year everyone!

I thought I would cope with my New Year’s hangover through the traditional method: Ibuprofen, coffee and attempting to solve NP hard combinatorial optimisation problems.

(The fact that I am now free to work on these things as my contract with Google has officially expired may have something to do with why I was looking into this today rather than waiting until I was less hungover)

As some of you might be aware I have a perverse interest in the Feedback Arc Set for Tournaments (FAST) problem. This is as follows:

Given an \(n \times n\) matrix \(A\), find an ordering of \(\{1, \ldots, n\}\), say \(\sigma_1, \ldots, \sigma_n\), which minimizes the sum of backwards weights \(w(A, \sigma) = \sum_{i < j} A_{\sigma_j,\sigma_i}\). This sum can be regarded as in a sense the degree to which the ordering gets the pairwise preferences of \(A\) “wrong”.
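
For concreteness, with \(A\) as a numpy matrix and the ordering given as a list of indices, the backwards weight is just this little function (a hypothetical helper for illustration, not taken from the actual code):

def backward_weight(A, sigma):
    # w(A, sigma) = sum over i < j of A[sigma_j, sigma_i]: the total weight
    # of the pairwise preferences that the ordering gets "wrong".
    n = len(sigma)
    return sum(A[sigma[j], sigma[i]] for i in range(n) for j in range(i + 1, n))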

The classic example of why this is interesting is the Kemeny Young voting method. There are others (I think it’s useful for graph drawing for example) but I don’t know much about them.

Anyway, solving this is super NP hard. We’re not talking “it’s about as hard as the travelling salesman”, we’re talking “You want to solve this optimally for a hundred nodes? Ha ha. Good luck”. As a result one tends to use heuristic orderings for it.

I was wondering whether you could relate this back to voting to get a good heuristic order and it turns out you can, and a really rather good one to boot. So this is a report on an experiment I’ve run which uses the Ranked Pairs voting algorithm to give an \(O(n^3)\) heuristic ordering for FAST that appears to produce quite a high quality result.

You can see all the code for this work here.

Evaluation

The metric I am using to evaluate methods is as follows: let \(A\) be a matrix valued random variable and let \(f\) and \(g\) be functions which take an \(n \times n\) matrix and return a random permutation of \(\{1, \ldots, n\}\). Then \(f\) is better than \(g\) with respect to \(A\) if \(E(w(A, f(A))) < E(w(A, g(A)))\).

Note that this notion of betterness depends very much on the distribution of \(A\). \(f\) is better than \(g\) for every choice of distribution only if \(E(w(x, f(x))) < E(w(x, g(x)))\) for every fixed \(x\), which is a much stronger and harder to test condition.

I focus on one particular choice of \(A\): \(100 \times 100\) matrices are generated such that if \(i < j\) we pick \(A_{ij}\) to be a random integer between 0 and 100, and if \(i > j\) we pick it to be a random integer between 0 and 70. We then shuffle the coordinates so as to prevent systems that are biased towards picking the coordinates in order from having an advantage. This gives us a matrix which has a rough “direction” but is really quite noisy in how well it establishes that direction.
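
A sketch of that generation process (my reconstruction from the description above, reading “between 0 and 100” as uniform and inclusive; the experiment’s actual code is in the linked repository):

import numpy as np

def noisy_ordered_matrix(n=100, rng=None):
    rng = np.random.default_rng() if rng is None else rng
    A = np.zeros((n, n), dtype=int)
    for i in range(n):
        for j in range(n):
            if i < j:
                A[i, j] = rng.integers(0, 101)  # "with the grain" arcs: 0..100
            elif i > j:
                A[i, j] = rng.integers(0, 71)   # "against the grain" arcs: 0..70
    # Relabel the coordinates so the underlying direction isn't just 1..n.
    perm = rng.permutation(n)
    return A[np.ix_(perm, perm)]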

I then generate 100 matrices and compare two methods by using a Monte Carlo permutation test for the difference of the means when applied to these matrices. Note: I am aware this is a somewhat dubious decision due to the potential difference in the standard deviations. This step could use improving.

Because I am comparing more than two different methods there is a multiple testing problem. I use the Benjamini-Hochberg procedure to control the false discovery rate at 1%. I think this is valid because of the nature of the correlations between the tests (false discoveries should all be positively correlated) but I’m not really sure.

As a calibration I tested adding duplicates of some methods to see if this would confuse the results and it did not. All the existing results remained statistically significant and no false conclusions were drawn about the relations amongst the duplicates.

Basically: The statistics here are kinda dodgy and I don’t really know enough to know how dodgy they are. There’s a reason working on my statistics is one of my plans for the next few months. However the results are all sufficiently convincing that I’m not too worried about the conclusions being wrong even if the methods are.

Methods compared

I compared six different methods, with varying expectations of some of them being any good (actually I compared more than this, but I threw some of them away because they were not any better and were annoyingly slow). These were:

  1. Generate a random permutation
  2. Generate a random permutation then apply a local kemenization procedure (basically: swapping adjacent elements in the order when this will improve the score). Reverse this and apply the local kemenization procedure again. Pick the best of these two.
  3. Sort by assigning each index \(i\) the score \(\sum_j A_{ji}\) (this is basically the weight of evidence that various things are less than \(i\)) and again apply the local kemenization procedure.
  4. A greedy algorithm that builds up the list one element at a time by iterating over the indices in a random order and inserting each element in the optimal location in the already built list.
  5. Kwik-sort, which is essentially applying quick sort with a random pivot to the order defined by the matrix and ignoring that that order isn’t actually transitive. This is as defined in this paper. I have not tried their proposed LP version of kwik sort because running the LP on this size of problem turns out to be really slow and I got bored of waiting for it.
  6. Ranked pairs. This works by building up a transitive order over the elements: we iterate over the pairs of indices in decreasing order of weight (i.e. largest first), and if the order of a pair has not already been established we “lock it in” in that order and add all of the other pairs that this plus transitivity requires. A sketch follows this list.
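
A rough sketch of that last method, since it’s the one the post is really about (this is my reconstruction of the description rather than the repo’s code, and it makes no attempt to hit the \(O(n^3)\) bound carefully):

import numpy as np

def ranked_pairs_order(A):
    n = A.shape[0]
    before = np.zeros((n, n), dtype=bool)  # before[u, v]: u is placed before v

    # Strongest pairwise preferences first.
    pairs = sorted(((i, j) for i in range(n) for j in range(n) if i != j),
                   key=lambda p: A[p[0], p[1]], reverse=True)

    for i, j in pairs:
        if before[i, j] or before[j, i]:
            continue  # this pair was already decided by transitivity
        # Lock in "i before j", plus everything transitivity now forces:
        # anything at or before i must precede anything at or after j.
        left = [u for u in range(n) if u == i or before[u, i]]
        right = [v for v in range(n) if v == j or before[j, v]]
        for u in left:
            for v in right:
                before[u, v] = True

    # Every pair is now decided, so counting predecessors recovers the order.
    return sorted(range(n), key=lambda v: int(before[:, v].sum()))

Feeding the result into the backwards weight function above gives the kind of scores reported in the results below.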

Results

Here is the program output:

 Simple shuffle: 210202.81 +/- 3381.47. 1%=204032.12, 50%=209934.00, 99%=218417.48
 Improved shuffle: 204489.03 +/- 2225.88. 1%=199087.83, 50%=204575.50, 99%=209383.01
 Smoothed indegree: 178446.51 +/- 1699.39. 1%=174738.72, 50%=178306.50, 99%=183069.17
 Insertion order: 175407.30 +/- 1965.16. 1%=171059.98, 50%=175226.50, 99%=179597.99
 Kwik Sort: 195279.10 +/- 2923.19. 1%=189704.31, 50%=195339.50, 99%=201568.30
 Ranked pairs: 172922.71 +/- 1431.80. 1%=170204.94, 50%=172847.00, 99%=177180.83

 Pairwise comparisons:
 Improved shuffle < Simple shuffle (p=0.0020)
 Insertion order < Improved shuffle (p=0.0020)
 Insertion order < Kwik Sort (p=0.0020)
 Insertion order < Simple shuffle (p=0.0020)
 Insertion order < Smoothed indegree (p=0.0020)
 Kwik Sort < Improved shuffle (p=0.0020)
 Kwik Sort < Simple shuffle (p=0.0020)
 Ranked pairs < Improved shuffle (p=0.0020)
 Ranked pairs < Insertion order (p=0.0020)
 Ranked pairs < Kwik Sort (p=0.0020)
 Ranked pairs < Simple shuffle (p=0.0020)
 Ranked pairs < Smoothed indegree (p=0.0020)
 Smoothed indegree < Improved shuffle (p=0.0020)
 Smoothed indegree < Kwik Sort (p=0.0020)
 Smoothed indegree < Simple shuffle (p=0.0020)

(The reason they all have the same p value is that that’s what you get when you never see a result that extreme in 500 random draws in the permutation test)

Ranked Pairs seems to comfortably beat out all the other heuristics by some margin, and the permutation tests agree that this is statistically significant.

As a side note, I wish I’d done this comparison a while ago, because I’m quite surprised to find out how badly kwik sort does. This may be an artifact of the shape of the data. I’m not sure.

I also ran a smaller experiment to compare Ranked Pairs with Schulze voting. It had to be smaller because Schulze voting was too slow to run on the full set of 100 matrices, so the following is a comparison between Schulze and Ranked Pairs on 10 matrices:

Schulze: 176430.30 +/- 1391.75. 1%=174276.96, 50%=176359.00, 99%=178563.24
Simple shuffle: 212209.30 +/- 3389.81. 1%=208202.43, 50%=211704.00, 99%=219189.76
Ranked pairs: 172918.60 +/- 1701.93. 1%=170901.82, 50%=172503.00, 99%=175619.76

Pairwise comparisons:
Ranked pairs < Schulze (p=0.0040)
Ranked pairs < Simple shuffle (p=0.0020)
Schulze < Simple shuffle (p=0.0020)

As Schulze appears to be both dramatically slower and a fair bit worse for this data set, it doesn’t seem worth pursuing as an alternative.


A genie brokers a deal

Content note: This piece contains some fairly non-graphic discussion of death and serious injury.

This is effectively a trolley problem inspired by my last post. It’s not directly in support of the same point and is perhaps slightly orthogonal to it.


 

You wake up, slightly disoriented. You’re not really sure where you are or how you got there, but you have this weird metal band firmly fastened around the elbow of your non-dominant arm.

A genie appears. You can tell that it’s a genie because it’s bright blue and has smoke instead of legs and that’s what genies look like. Your situation now makes more sense. Or possibly less.

“Hello”, says the genie. “I’m here to offer you a deal. You see that band on your arm? At a thought, I can make that band constrict and cut off your arm. Don’t worry it comes with anaesthetic and a thing for sealing the wound, so it won’t hurt at all and there’s no risk of complications or injury. You’ll just lose the arm.”

This is not entirely reassuring.

“But it’s OK! Once I’ve finished explaining the situation to you you’ll have the option to just leave. I’ll take the band off, no harm will come to you, and I’ll return you home just as you were.”

This is a little more reassuring.

“But I’d like to know: What if I were to offer you a deal? You can have $5 million in exchange for letting me contract the band and cut off your arm. What do you say?”

You have a think about it. $5 million could buy you a lot – you could invest it and live relatively comfortably off the interest alone. It would let you do all sorts of things with your life that you’ve always wanted to and couldn’t. Sure, your arm is useful, but it’s not that useful, and a small fraction of $5 million will buy you some pretty good prosthetics. You say you would take the deal.

“Good to know. And what if I were to offer you $1 million?”

This seems like less of a good deal but you acknowledge that you would probably still take it.

“500k?”

This is small enough that you think on balance it’s probably not worth the loss of the arm.

“How about 250k?”

You indicate that your preferences over quantities of money are monotonic increasing in the numerical value, and that as a good rational being you have transitive preferences over events, so decreasing the amount of money is not going to make this more favourable. You do your best to not sound patronising while explaining this to the all powerful being who is offering to cut off your arm.

“Fine, fine. Just checking. Humans are so irrational, I wanted to be sure. Now, let’s talk about something else”

You’re still quite interested in the subject of $5 million dollars and whether you get to keep your arm, but figure you can go along with the change of subject for now.

The genie waves his hand and an image appears in the air. It is of the genie and another person. The person in question has a band much like yours, but this one is around their neck.

The genie explains. “When I found this person they were about to be hit by a train. Would have squashed them flat. So I rescued them! Wasn’t that nice of me?”

You decide to reserve judgement until you hear more about the band.

“Now, watch this conversation I had with them!”

The image starts to move and sound comes from it. You’d be very impressed by the genie’s powers if you hadn’t previously watched television with much better resolution.

“Hello! I’d like to offer you a deal.”

“Holy shit a genie”

“Yes, yes. Well observed. Now, about this deal. You notice that band you’ve got around your neck? Unless I take it off you it will constrict and cut your head off. Now I’m not that familiar with human physiology, but I believe this will be fatal”

“Aaaaaaaaaaaaah. Get it off! Get it off!”

“But I want something in exchange for removing it. I’m going to take all your money. Does that seem fair?”

“Yes, anything! I just don’t want to die!”

“Are you sure? You’ve got that will and all…”

“I don’t care about the will! Just take off the band! Please!”

“Hmm. I’ll get back to you on that”

The genie in the image disappears and the person remains behind, sobbing pitifully. Back with you, the genie waves his hand and the image freezes again.

“Now”, the genie explains, “By a complete coincidence all of this person’s money is almost exactly $5 million. So here’s the deal I’m offering you: One of these bands is going to close. It’s up to you which one. It is also up to you how much of their money you wish to take. You can either choose to walk away from this and I’ll kill them, or you can choose to lose your arm and take any amount of money from them up to $5 million and I’ll arrange that too. Your call. What will it be?”


Now, questions for the reader:

  1. Are you morally obligated to give up your arm?
  2. If you do give up your arm, how much money is it ethically permissible for you to take?
  3. Regardless of what you should answer, would your answer change if you knew that the details of the deal would be publicly broadcast?
  4. Has the genie acted ethically? Note that every possible outcome regardless of what you choose is no worse than the status quo before the genie’s intervention and may be better.