Coin tossing and adversary-resistant priors

Hello, it’s your old friend the genie who likes to reify implausible thought experiments!

Here’s a giant pile of gold. Want it? Good. Congratulations! It’s yours!

But before you go, let's play a game.

The game works as follows: Pick a coin from the pile and toss it. If it comes up heads, I’ll give you a wish. If it comes up tails, the coin disappears in a puff of smoke.

The wishes are pretty great. There are a few provisos, but it's still a pretty sweet deal. Certainly you could wish for more gold than is in that pile over there, but there are other, way cooler, things you can wish for.

You don't get to resume the game, though. Once you decide to stop playing, you've stopped, and all you have to show for it is however many coins you have left.

Got it? Good. Let’s begin.

You pick a coin off the pile and toss it. It comes up tails. You pick another one. Tails.

Tails, tails, tails, tails, tails.

How many coins do you toss before you stop playing?


The game was obviously always going to be rigged. The genie doesn't care about physics, so it can bias a coin if it wants to.

The only question is how rigged?

Let’s also assume that the utility value of a wish is a lot greater than the utility value of the whole pile of gold. Say 10 times greater, or 100 times. It doesn’t really matter.

The conjugate prior for a coin-tossing model is some proper Beta distribution; say a symmetric one, \(\mathrm{Beta}(\alpha, \alpha)\).

Under this model, unless we're very careful with our choice of \(\alpha\), the genie can choose the coin's bias adversarially and force one of two things to happen:

  1. You walk away from a perfectly reasonable chance of getting a wish.
  2. You spend all your money without ever getting a wish.

Suppose you've tossed \(n\) coins and still not seen heads. Your posterior distribution is now \(\mathrm{Beta}(\alpha, \alpha + n)\). This means that your subjective probability that the next coin toss will come up heads is the expectation of this distribution, which is \(\frac{\alpha}{2\alpha + n}\).

In particular, if \(W\) is the utility value of a wish as a multiple of the utility value of a single coin, you will keep tossing coins as long as \(W \frac{\alpha}{2\alpha + n} > 1\), or equivalently as long as \(n < \alpha (W - 2)\).
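To see the effect, here's a minimal Python sketch of this stopping rule (the function names and the choices \(W = 100\), \(\alpha \in \{0.1, 1, 10\}\) are just for illustration):

```python
# Posterior predictive P(heads | n tails so far) under a Beta(alpha, alpha)
# prior, and how many tails it takes before the expected value of one more
# toss drops below the one coin it costs.

def p_heads_after_tails(alpha, n):
    # Posterior is Beta(alpha, alpha + n), whose mean is alpha / (2*alpha + n).
    return alpha / (2 * alpha + n)

def tosses_before_stopping(alpha, W):
    # Keep tossing while W * P(heads) > 1, i.e. while n < alpha * (W - 2).
    n = 0
    while W * p_heads_after_tails(alpha, n) > 1:
        n += 1
    return n

for alpha in (0.1, 1.0, 10.0):
    print(alpha, tosses_before_stopping(alpha, W=100))
# -> 10, 98 and 980 tosses respectively: the threshold scales linearly
#    with alpha, exactly as n < alpha * (W - 2) predicts.
```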

So say we're only prepared to lose \(m < n\) of our coins when the coin is completely rigged (i.e. always comes up tails). Then we want \(\alpha (W - 2) \leq m\), or \(\alpha \leq \frac{m}{W - 2}\).

If \(W \gg n\) (the wish is worth much more than the whole pile) then \(\frac{n}{W - 2} \ll 1\), so unless \(\alpha\) is very small you will spend all your money when given a completely rigged coin.

Now suppose we are given an unbiased coin that just happens to turn up tails the first time. If \(\alpha\) is so small that \(\alpha (W - 2) \leq 1\), we immediately go "WTF hax this game is rigged" and bail out. It would be nice not to walk away from the wish with 50% probability given a perfectly normal coin.

So suppose we don't want to stop within our first \(k\) tosses. We might choose, say, \(k = 20\), so that the probability of that many tails in a row from an unbiased coin is only about 1 in a million (\(2^{-20} \approx 10^{-6}\)).

From this we get the opposite condition, so our constraints are \(\frac{k}{W - 2} \leq \alpha \leq \frac{m}{W - 2}\).

i.e. our \(\alpha\) is entirely constrained to lie in a range defined by what we’re prepared to waste in adversarial conditions.
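To make the squeeze concrete, here's a tiny sketch (the choices of \(W\), \(k\) and \(m\) are hypothetical):

```python
# The window of admissible alphas given the two adversarial constraints.
W = 100  # a wish is worth 100 coins (illustrative)
k = 20   # don't bail within the first 20 tails of a fair coin
m = 30   # lose at most 30 coins to a completely rigged coin

lo = k / (W - 2)  # alpha at least this, or we give up on fair coins too early
hi = m / (W - 2)  # alpha at most this, or a rigged coin costs more than m
print(f"alpha must lie in [{lo:.3f}, {hi:.3f}]")  # [0.204, 0.306]
# The window is nonempty exactly when m >= k, so the prior just restates
# the number of coins we already decided we could afford to lose.
```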

But this means the Bayesian approach has told us basically nothing! We might as well just pick a number of coins that we’re prepared to sacrifice up front and then just play to that point.

The problem is that we’ve chosen the prior without concern for the fact that the probability might have been adversarially chosen, so as a result it doesn’t help us much with those conditions.

It’s also more than a little bit odd to try to control our confidence in a prior based on the value of the wish and our behaviour under circumstances that the prior doesn’t really account for well at all.

So let's choose a different prior. It won't be conjugate to the problem, but that's OK. It'll just make the calculations slightly fiddlier.

Let's assign some non-zero probability \(q\) to the notion that the genie is just a dick and has given us a rigged game that we never win. Our prior is now a mixture of two models: under the first, the coin never shows heads; under the second, the coin behaves as before, with a probability of heads drawn from a uniform (i.e. \(\mathrm{Beta}(1, 1)\)) prior. The probability of the first is \(q\). Call the first event \(R\) (rigged) and the second \(U\) (unrigged).

Under the uniform prior the probability of seeing \(k\) tails in a row is \(\int_0^1 (1 - p)^k \, dp = \frac{1}{1 + k}\). Under the rigged model it is \(1\).
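This is easy to sanity-check with a quick Monte Carlo sketch (the choice \(k = 5\) is arbitrary):

```python
# Monte Carlo check that P(k tails in a row) = 1/(k+1) when the coin's
# bias is itself drawn uniformly from [0, 1].
import random

def k_tails_in_a_row(k):
    p = random.random()  # coin bias drawn from the uniform prior
    return all(random.random() > p for _ in range(k))  # all k tosses tails

k, trials = 5, 100_000
hits = sum(k_tails_in_a_row(k) for _ in range(trials))
print(hits / trials, 1 / (k + 1))  # both about 0.1667
```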

After observing \(k\) tails in a row, the probability of observing a head on the next throw is \(\frac{1}{k + 2}\) times the probability that the game is not completely rigged (under \(U\) the posterior is \(\mathrm{Beta}(1, 1 + k)\), whose expectation is \(\frac{1}{k + 2}\)).

An application of Bayes' rule gives us that the probability that the game is not rigged is \(\frac{1 - q}{qk + 1}\), so some rearrangement gives us that we keep going for as long as \((qn + 1)(n + 2) < W (1 - q)\).

Some blatant dropping of terms gives a reasonable estimate: since \((qn + 1)(n + 2) > qn^2\), we have certainly stopped once \(qn^2 \geq W(1 - q)\), i.e. we definitely stop before \(n = \sqrt{W \left(\frac{1}{q} - 1\right)}\).
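Solving the exact condition numerically bears this out; a small sketch with illustrative values of \(W\) and \(q\):

```python
# Find the stopping point under the mixture model: keep tossing while the
# expected value of one more toss, W * (1 - q) / ((q*n + 1) * (n + 2)),
# exceeds the one coin the toss costs.

def stop_point(W, q):
    n = 0
    while W * (1 - q) > (q * n + 1) * (n + 2):
        n += 1
    return n

print(stop_point(W=100, q=0.01))  # 60, comfortably under the bound
                                  # sqrt(W * (1/q - 1)) ~= 99.5
```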

So say we assign a 1% probability to the genie being out to screw us. Then we stop by the time we've spent roughly \(10 \sqrt{W}\) coins.

This seems like a much more sensible prescription for dealing with a genie than a pure uniform model. Sometimes you'll still end up spending all your coins, mind you: if \(W > \frac{n^2}{100}\), where \(n\) is the number of coins in the pile (i.e. wishes are very, very valuable or there aren't that many coins), you'll keep playing until you run out of coins, but that doesn't seem like an unreasonable decision. In general though you'll walk away with a significant amount of your wealth and, if there was ever a reasonable chance of you doing so, a wish too.
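Here's a simulation sketch of the whole policy, assuming for concreteness a pile of 1000 coins and that you walk away once you've won a wish (all the numbers are illustrative):

```python
# Play the game with the mixture-prior stopping rule against a fair coin
# and against a completely rigged one.
import random

def play(pile=1000, W=100, q=0.01, p_heads=0.5):
    """Return (coins left, whether we got a wish)."""
    tails = 0
    while tails < pile and W * (1 - q) > (q * tails + 1) * (tails + 2):
        if random.random() < p_heads:
            return pile - tails, True  # heads: wish granted, walk away
        tails += 1                     # tails: that coin is gone
    return pile - tails, False         # stopping rule (or empty pile) kicks in

random.seed(0)
fair = [play(p_heads=0.5) for _ in range(10_000)]
rigged = [play(p_heads=0.0) for _ in range(10_000)]
print("fair coin:   wish rate  =", sum(w for _, w in fair) / 10_000)
print("rigged coin: coins kept =", sum(c for c, _ in rigged) / 10_000)
# Against a fair coin you essentially always get the wish; against a fully
# rigged one you still walk away with 940 of your 1000 coins.
```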

I don't have much of a conclusion, other than that priors should probably take into account a non-zero probability of an adversarial choice when there is a non-zero probability of an adversarial choice. I just thought the model was interesting.

@pozorvlak suggested that this might have interesting implications for the problem of adversarial examples in machine learning. I can imagine that being true – e.g. you could assign some plausibility score of an item under the original model and reject things that look too implausible earlier – but I don’t know much about the problem domain.
