# Coin tossing and adversary resistant priors

Hello, it’s your old friend the genie who likes to reify implausible thought experiments!

Here’s a giant pile of gold. Want it? Good. Congratulations! It’s yours!

But before you go, let’s play a game.

The game works as follows: Pick a coin from the pile and toss it. If it comes up heads, I’ll give you a wish. If it comes up tails, the coin disappears in a puff of smoke.

The wishes are pretty great. There are a few provisos, but it’s still a pretty sweet deal. Certainly you could wish for more gold than is in that pile over there, but there are other, way cooler things you can wish for.

You don’t get to resume the game, though. Once you decide to stop playing, you’ve stopped, and all you have to show for it is however many coins you have left.

Got it? Good. Let’s begin.

You pick a coin off the pile and toss it. It comes up tails. You pick another one. Tails.

Tails, tails, tails, tails, tails.

How many coins do you toss before you stop playing?

The game was obviously always going to be rigged. The genie doesn’t care about physics so it can bias a coin if it wants to.

The only question is how rigged?

Let’s also assume that the utility value of a wish is a lot greater than the utility value of the whole pile of gold. Say 10 times greater, or 100 times. It doesn’t really matter.

The conjugate prior for a coin tossing model is some proper Beta distribution. Say $$\mathrm{Beta}(\alpha, \alpha)$$.

Under this model, unless we’re very careful with our choice of $$\alpha$$, the genie can choose the coin’s bias adversarially to force one of two things to happen:

1. You walk away from a perfectly reasonable chance of getting a wish.
2. You spend all your money without ever getting a wish.

Suppose you’ve tossed $$n$$ coins and still not seen a heads. Your posterior distribution is now $$\mathrm{Beta}(\alpha, \alpha + n)$$. Your subjective probability that your next coin toss will come up heads is the mean of this distribution, $$\frac{\alpha}{2\alpha + n}$$; since the interesting regime is small $$\alpha$$, I’ll use the simpler approximation $$\frac{\alpha}{\alpha + n}$$.

In particular if $$W$$ is the utility value of a wish as a multiple of the utility value of a single coin, you will keep tossing coins as long as $$W \frac{\alpha}{\alpha + n} > 1$$, or equivalently as long as $$n < \alpha (W - 1)$$.

So say we’re only prepared to waste $$m < n$$ of our coins in this case. Then we want $$m \geq \alpha (W - 1)$$, or $$\alpha \leq \frac{m}{W - 1}$$.

If $$W \gg n$$ (the wish is much more valuable than the pile) then $$\frac{n}{W - 1} \ll 1$$, so unless $$\alpha$$ is very small you will spend all your money when given a completely rigged coin.
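This stopping rule is easy to play with numerically. A minimal sketch, using the small-$$\alpha$$ approximation for the posterior mean described above (the specific $$\alpha$$ and $$W$$ values below are illustrative, not from the text):

```python
# Keep tossing while the expected utility of one more toss, W * E[p | n tails],
# exceeds the value of the coin we'd risk (1). Uses the small-alpha
# approximation E[p | n tails] ~= alpha / (alpha + n) from the text.

def tosses_before_stopping(alpha: float, w: float) -> int:
    """Number of consecutive tails after which we give up."""
    n = 0
    while w * alpha / (alpha + n) > 1:
        n += 1
    return n

# A wish worth W = 100 coins: we keep tossing while n < alpha * (W - 1).
print(tosses_before_stopping(1.0, 100))  # 99
print(tosses_before_stopping(0.1, 100))  # 10: a more sceptical prior quits sooner
```

Note how directly the prior’s $$\alpha$$ scales the number of coins we’re willing to burn.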

Now suppose we are given an unbiased coin that just happens to turn up tails the first time. If $$\alpha$$ is too low we immediately go “WTF hax this game is rigged” and bail out. It would be nice to not walk away from the wish with 50% probability given a perfectly normal coin.

So suppose we don’t want to stop within our first $$k$$ tosses. We might choose, say, $$k = 20$$, so that the probability of that happening with an unbiased coin is only about 1 in a million.

From this we want the opposite condition, so our constraints are $$\frac{k}{W - 1} \leq \alpha \leq \frac{m}{W - 1}$$.

i.e. our $$\alpha$$ is entirely constrained to lie in a range defined by what we’re prepared to waste in adversarial conditions.

But this means the Bayesian approach has told us basically nothing! We might as well just pick a number of coins that we’re prepared to sacrifice up front and then just play to that point.

The problem is that we’ve chosen the prior without concern for the fact that the probability might have been adversarially chosen, so as a result it doesn’t help us much with those conditions.

It’s also more than a little bit odd to try to control our confidence in a prior based on the value of the wish and our behaviour under circumstances that the prior doesn’t really account for well at all.

So let’s choose a different prior. It won’t be conjugate to the problem, but that’s OK. It’ll just make the calculations slightly fiddlier.

Let’s assign some non-zero probability $$q$$ to the notion that the genie is just a dick and has given us a rigged game that we never win. Our prior is now a mixture of two models: under the first, the coin never shows heads; under the second, the behaviour is the same as before, with a probability of heads drawn from a beta prior. The probability of the first is $$q$$. Call the first event $$R$$ and the second $$U$$.

Under a uniform prior the probability of seeing $$k$$ tails in a row is $$\frac{1}{1 + k}$$. Under the rigged model it is $$1$$.

After observing $$k$$ tails in a row, the probability of a head on the next throw under the unrigged model is $$\frac{1}{k + 2}$$ (the mean of the $$\mathrm{Beta}(1, k + 1)$$ posterior), which we then multiply by the probability that the game is not completely rigged.

An application of Bayes’ rule gives us that the probability that the game is not rigged is $$\frac{1 - q}{qk + 1}$$, so some rearrangement gives us that we keep going for as long as $$q n^2 + (2q + 1) n + 2 < W (1 - q)$$.

Some blatant dropping of lower-order terms and fiddling to get a reasonable estimate means that we definitely stop before $$n = \sqrt{W \left(\frac{1}{q} - 1\right)}$$.

So say we assign a 1% probability that the genie is out to screw us. Then we want to stop when we’ve spent roughly $$10 \sqrt{W}$$ coins.

This seems like a much more sensible prescription for dealing with a genie than a pure uniform model. Sometimes you’ll still end up spending all your coins, mind you: if $$W > \frac{n^2}{100}$$ (i.e. wishes are very, very valuable or there aren’t that many coins), you’ll keep playing until you run out of coins, but that doesn’t seem like an unreasonable decision. In general though you’ll walk away with a significant amount of your wealth and, if there was ever a reasonable chance of you doing so, a wish too.
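For concreteness, here’s a sketch of the mixture-model stopping rule, with the 1% rigging probability from the example and an illustrative $$W = 100$$ (the helper name and the value of $$W$$ are mine, not from the text):

```python
import math

# Mixture prior: with probability q the coin never shows heads; otherwise its
# bias is uniform on [0, 1]. After n tails, P(not rigged) = (1 - q) / (q n + 1)
# and P(heads on the next toss | not rigged) = 1 / (n + 2).

def stopping_point(q: float, w: float) -> int:
    """First n at which one more toss is no longer worth one coin."""
    n = 0
    while w * (1 - q) / ((q * n + 1) * (n + 2)) > 1:
        n += 1
    return n

q, w = 0.01, 100.0
bound = math.sqrt(w * (1 / q - 1))  # the "definitely stop before" estimate
print(stopping_point(q, w), bound)  # stops at 60, comfortably inside ~99.5
```

The exact stopping point sits below the rough $$10\sqrt{W}$$ bound, as it should: the dropped terms only make us stop sooner.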

I don’t have much of a conclusion beyond that other than that priors should probably take into account a non-zero probability of an adversarial choice when there is a non-zero probability of an adversarial choice. I just thought the model was interesting.

@pozorvlak suggested that this might have interesting implications for the problem of adversarial examples in machine learning. I can imagine that being true – e.g. you could assign some plausibility score of an item under the original model and reject things that look too implausible earlier – but I don’t know much about the problem domain.

This entry was posted in Decision Theory.

# Bayesian reasoners shouldn’t believe logical arguments

Advance warning: If you’re familiar with Bayesian reasoning it is unlikely this post contains anything new unless I’m making novel mistakes, which is totally possible.

Let me present you with a hypothetical, abstracted, argument:

Me: C
You: Not C!
Me: B?
You: *shrug*
Me: A?
You: Yes
Me: A implies B?
You: Yes?
Me: B implies C?
You: … Yes
Me: Therefore C?
You: C. :-(

Does this seem like a likely scenario to you? We have had a disagreement. I have presented a logical argument from shared premises for my side of the disagreement. You have accepted that argument and changed your position.

Yeah, it sounds pretty implausible to me too. A more likely response from you at the end is:

You: Not C!

I will of course find this highly irrational and be irritated by your response.

…unless you’re a Bayesian reasoner, in which case you are behaving entirely correctly, and I’ll give you a free pass.

Wait, what?

Suppose you have propositions $$A$$ and $$B$$, which you believe with probabilities $$a$$ and $$b$$ respectively. You currently believe these two to be independent, so $$P(A \wedge B) = ab$$.

Now, suppose I come along and convince you that $$A \implies B$$ is true (I’ll call this proposition $$I$$). What is your new probability for $$B$$?

Well, by Bayes rule, $$P(B|I) = \frac{P(B \wedge I)}{P(I)} = P(B) \frac{P(I|B)}{P(I)}$$

$$I = A \implies B = \neg\left( A \wedge \neg B\right)$$. So $$P(I) = 1 - a(1 - b)$$.

$$P(I|B) = 1$$ because everything implies a true proposition. Therefore $$P(B|I) = \frac{b}{1 - a(1 - b)}$$.

This is a slightly gross formula. Note however that it does have the obviously desirable property that your belief in $$B$$ goes up, or at least stays the same. Let’s quickly check it with some numbers.

| $$a$$ | $$b$$ | $$P(B \mid I)$$ |
|-------|-------|-----------------|
| 0.100 | 0.100 | 0.110 |
| 0.100 | 0.500 | 0.526 |
| 0.100 | 0.900 | 0.909 |
| 0.500 | 0.100 | 0.182 |
| 0.500 | 0.500 | 0.667 |
| 0.500 | 0.900 | 0.947 |
| 0.900 | 0.100 | 0.526 |
| 0.900 | 0.500 | 0.909 |
| 0.900 | 0.900 | 0.989 |
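These values are easy to sanity-check by brute force: enumerate the four truth assignments of two independent propositions, condition on the implication being true, and compare with the closed form. A quick sketch:

```python
from itertools import product

def p_b_given_i(a: float, b: float) -> float:
    """P(B | A implies B) for independent A, B, by direct enumeration."""
    p_i = 0.0        # P(A implies B)
    p_b_and_i = 0.0  # P(B and (A implies B))
    for A, B in product([True, False], repeat=2):
        weight = (a if A else 1 - a) * (b if B else 1 - b)
        if (not A) or B:  # truth table of A implies B
            p_i += weight
            if B:
                p_b_and_i += weight
    return p_b_and_i / p_i

for a in (0.1, 0.5, 0.9):
    for b in (0.1, 0.5, 0.9):
        assert abs(p_b_given_i(a, b) - b / (1 - a * (1 - b))) < 1e-12
        print(f"{a:.3f} {b:.3f} {p_b_given_i(a, b):.3f}")
```

This reproduces the table above, and the assertion confirms the enumeration agrees with $$\frac{b}{1 - a(1-b)}$$.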

These look pretty plausible. Our beliefs do not seem to change to an unrealistic degree, but we have provided significant evidence in favour of $$B$$.

But as a good Bayesian reasoner, you shouldn’t assign probabilities 0 or 1 to things. Certainty is poisonous to good probability updates. So when I came along and convinced you that $$A \implies B$$, you really shouldn’t have believed me completely. Instead you should have assigned some probability $$r$$ to it. So what happens now?

Well we know what the probability of $$B$$ given $$I$$ is, but what is the probability given $$\neg I$$? Well $$\neg I = \neg (A \implies B) = A \wedge \neg B$$, so $$P(B|\neg I) = 0$$. The implication can only be false if $$B$$ is false (because everything implies a true statement).

This means that your posterior probability for $$B$$ should be $$r P(B|I)$$. So $$r$$ is essentially a factor slowing your update process.

Note that because your posterior belief in $$B$$ is $$b \frac{r}{P(I)}$$, as long as my claim that $$A \implies B$$ is at least as convincing as your prior belief in it (i.e. $$r \geq P(I)$$), my argument will increase your belief in $$B$$.

Now let’s suppose that you are in fact entirely convinced beforehand that $$A$$ and that $$\neg B$$, and my argument entirely convinces you that $$A \implies B$$.

Of course, we don’t believe in certainty. Things you are entirely convinced of may prove to be false. Suppose now that in the past you have noticed that when you’re entirely convinced of something, you’re right with about probability $$p$$. Let’s be over-optimistic and say that $$p$$ is somewhere in the 0.9 range.

What should your posterior probability for $$B$$ now be? We have $$b = 1 - p$$ and $$a = r = p$$. Then your posterior probability for $$B$$ is $$r P(B | I) = p \frac{1 - p}{1 - p \cdot p} = \frac{p(1 - p)}{1 - p^2} = \frac{p}{p + 1} = 1 - \frac{1}{p + 1}$$.

You know what the interesting thing about this is? It’s always less than a half. A perfectly convincing argument that a thing you completely believe in implies a thing you completely disbelieve in should never do more than create a state of complete uncertainty in your mind.

It turns out that reasonable degrees of certainty get pretty close to that too. If you’re right about things you’re certain about with probability 0.9 then your posterior probability for $$B$$ should be 0.47. If you’re only right with probability 0.7 then it should be $$0.41$$. Of course, if you’re only right about that often then $$0.41$$ isn’t all that far from your threshold for certainty in the negative result.

In conclusion: If you believe A and not B, and I convince you that A implies B, you should not now go away and believe B. Instead you should be confused, with a bias towards still assuming not B, until you’ve resolved this.
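The always-less-than-a-half claim is quick to verify numerically; a sketch (the function name is mine):

```python
def posterior_after_convincing_argument(p: float) -> float:
    """r * P(B | I) with a = r = p and b = 1 - p: simplifies to p / (1 + p)."""
    a, b, r = p, 1 - p, p
    return r * b / (1 - a * (1 - b))

for p in (0.5, 0.7, 0.9, 0.99):
    post = posterior_after_convincing_argument(p)
    assert abs(post - p / (1 + p)) < 1e-12  # matches the closed form
    assert post < 0.5                       # never clears complete uncertainty
    print(f"{p:.2f} -> {post:.2f}")
```

Even at $$p = 0.99$$ the posterior only reaches about $$0.497$$: more confidence in your own certainty buys you asymptotically little.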

Now, let’s go one step further, to our original example. We are instead arguing about $$C$$, and my argument proceeds via an intermediary $$B$$. Your prior is that $$A$$, $$B$$ and $$C$$ are all independent. You are certain that $$A$$, certain that $$\neg C$$, and have no opinion on $$B$$ (i.e. you believe it with probability $$\frac{1}{2}$$).

I now provide you with a p-convincing argument that $$A \implies B$$. What is your posterior probability for $$B$$?

Well, plugging it into our previous formula we get $$b' = p \frac{b}{1 - p(1 - b)} = \frac{p}{2 - p}$$. Again, checking against some numbers: if $$p = 0.9$$ then $$b' \approx 0.82$$, which seems reasonable.

Suppose now that I provide you p-convincing evidence that $$B \implies C$$. What’s your posterior for $$C$$?

Well, again with the previous formula, only replacing $$a$$ with $$b'$$ and $$b$$ with $$c = 1 - p$$, we have

\begin{align*} c' &= \frac{p c}{1 - b'(1 - c)} \\ &= \frac{p(1 - p)}{1 - \frac{p^2}{2 - p}} \\ &= \frac{p(1 - p)(2 - p)}{2 - p - p^2} \end{align*}

This isn’t a nice formula, but we can plug numbers in. Suppose your certainties are 0.9. Then your posterior is $$c' \approx 0.34$$. You’re no longer certain that $$C$$ is false, but you’re still pretty convinced, despite the fact that I’ve just presented you with an apparently watertight argument to the contrary. This result is pretty robust with respect to your degree of certainty, too. Since $$2 - p - p^2 = (1 - p)(p + 2)$$, the formula simplifies to $$c' = \frac{p(2 - p)}{p + 2}$$, which tends to $$\frac{1}{3}$$ as $$p \to 1$$; and for $$p = \frac{1}{2}$$ (i.e. you’re wrong half the time when you’re certain!) we get $$c' = 0.3$$.

In conclusion: an apparently watertight logical argument that goes from a premise you believe in to a conclusion you disbelieve in, via something you have no opinion on, should not substantially update your beliefs, even if it casts some doubt on them.
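The two-step argument is just the same update applied twice; a sketch reproducing the numbers (function names are mine):

```python
def update(prior_b: float, a: float, r: float) -> float:
    """Belief in B after an r-convincing argument that A implies B,
    where a is the prior belief in A: r * P(B | I)."""
    return r * prior_b / (1 - a * (1 - prior_b))

def chained_posterior(p: float) -> float:
    a, b, c = p, 0.5, 1 - p       # certain of A, no opinion on B, certain of not-C
    b_new = update(b, a, r=p)     # p-convincing argument that A implies B
    return update(c, b_new, r=p)  # p-convincing argument that B implies C

print(f"{chained_posterior(0.9):.2f}")  # 0.34
print(f"{chained_posterior(0.5):.2f}")  # 0.30
```

Chaining through the no-opinion intermediary dilutes the argument at each step, which is why the final number barely moves with $$p$$.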

Of course, if you’re a Bayesian reasoner, this post is an argument that starts from a premise you believe in, goes via something you have no opinion on, and concludes something you likely don’t believe in. Therefore it shouldn’t change your beliefs very much.

This entry was posted in Decision Theory, Numbers are hard.

# Objections to the VNM utility theorem, part 2

This is another post about reasons why I don’t agree with the VNM theorem. I’m still focusing on the actual axioms of it rather than taking it in the broader context that I object to it because, well, frankly it’s much easier.

Here I would like to object to the idea that preferences over lotteries should be complete. That is, for any pair of lotteries $$A, B$$ you should be able to say that $$A \leq B$$ or $$B \leq A$$. I’m not even going to use lotteries to do it, I’m going to use concrete events.

The point I would like to make is that there are multiple relevant notions of indifference between outcomes. One is “I genuinely don’t care about the difference between these outcomes” and another is “I do not have sufficient information to decide which of these outcomes I prefer”. Although your observed behaviour in these cases may be the same (pick an option at random), they differ in important ways. In particular if you treat them as the same you lose transitivity of preference.

Let me pick a worked example. Suppose you’re running a school, and you’ve got some money to spend, and you want to spend it on improving the quality of education for the kids. You’ve decided you can spend it on two things: hiring more teachers and students’ lunches. Lest the latter sound trivial, imagine you’re in a high-poverty area where a lot of kids can’t afford to eat properly. It’s pretty well established that if you’re starving all day in school then you’re going to perform badly.

We basically have two numbers here completely describing the problem space we care about (we could introduce additional ones, but they wouldn’t change the problem): Number of children who are adequately fed and number of teachers who are employed.

I’ll even make this easier for you and give us a utility function. All we care about is maximizing one number: say, the number of students who get at least one B grade. This should be a poster child for how well subjective expected utility works as a metric.

Increasing either of these numbers while leaving the other one constant will result in a strictly better scenario (assuming the number of teachers is capped to something sensible so we’re not going to find ourselves in a stupid situation where we have twice as many teachers as students). Adding more teachers is good, increasing the number of students who are well fed is good.

What’s complicated is comparing the cases where one number goes up and the other goes down.

It’s not always complicated. Consider two scenarios: Scenario 1 is that we have 0 students well fed and 2 teachers. Scenario 2 is that we have 20 students well fed and 1 teacher. Scenario 2 is obviously better (supposing we have 50 students in our entire student body. If the student body were much larger than one teacher could handle then that might be different).

Now consider Scenario 3: We have 0 students well fed and 1 teacher. This is obviously worse than Scenario 1.

Now consider the transition from scenario 2 to 3: Take lunch away from students, one at a time. At what point does this flip and become no better than scenario 1? Is it really better all the way down to the last student looking forlornly at you as you take their lunch away? Probably not.

But there isn’t really some concrete point at which it flips. There’s one side on which we regard scenario 1 to be obviously worse and one on which we regard it to be obviously better, and a biggish region in between where we just don’t know.

Why don’t we know? We have a utility function! Surely all we need to do is work out the number of utilons a fed student gives us, the number a teacher gives us, add them up and whichever gets the highest score wins! Right?

Obviously I’m going to tell you that this is wrong.

The reason is that our utility function is defined by an outcome, not by the state of the world. Neither teachers nor fed students contribute to it directly. What contributes are results. And those results are hard to predict.

We can make predictions about the extreme values relatively easily, but in the middle there’s honestly no way to tell which one will give better results without trying it and seeing. Sure, we could put on our rationality hats, do all the reading, run complicated simulations etc, but this will take far longer than we have to make the decision and will produce a less accurate decision than just trying it and seeing. In reality we will end up just picking an option semi-arbitrarily.

But this is not the same as being indifferent to the choice.

To see why the difference between ignorance and indifference is important, suppose there are at least two scenarios in the grey area between scenarios 2 and 3. Call them A and B, with A having fewer fed students than B. We do not know whether these are better than scenario 1. We do however know that B is better than A: more students are fed. If we were to treat this ignorance as indifference, then transitivity would force us to conclude that we were indifferent between A and B, but we’re not: we have a clear preference.
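One way to make the distinction concrete is to model preference as a three-valued comparison instead of a total order. A sketch, with made-up numbers chosen to sit in the grey area:

```python
from typing import Optional, Tuple

Scenario = Tuple[int, int]  # (students fed, teachers employed)

def prefer(x: Scenario, y: Scenario) -> Optional[bool]:
    """True if x is at least as good as y, False if worse, None if unknown.

    We only commit to a preference when one scenario dominates the other
    on both coordinates; mixed cases are ignorance, not indifference."""
    if x[0] >= y[0] and x[1] >= y[1]:
        return True
    if x[0] <= y[0] and x[1] <= y[1]:
        return False
    return None

scenario_1 = (0, 2)
A, B = (5, 1), (10, 1)  # two hypothetical grey-area scenarios

print(prefer(B, A))           # True: B dominates A
print(prefer(A, scenario_1))  # None: can't tell
print(prefer(B, scenario_1))  # None: also can't tell, yet B beats A
```

If `None` were collapsed into indifference, transitivity through `scenario_1` would force indifference between A and B, contradicting the clear preference the first comparison reports.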

Note that this failure of totality of the preference relation occurred even though we are living in the desired conclusion of the VNM theorem. The problem in this case was not that we don’t have a utility function, or even that we want to maximize something other than its expected value. The problem is that the lotteries are hidden from us: we do not know what the probabilities involved are, and finding that out would take more time and resources than we have available.

You can argue that a way out of this is to use a Bayesian prior distribution for this value with our states as inputs. This is a valid way to do it, but without more information than we have those numbers will be pretty damn near guesses and are not much less arbitrary than using our own intuition. Moreover, this has simply become a normative claim rather than the descriptive one that the VNM professes to make.

This entry was posted in Decision Theory.