David R. MacIver's Blog: Objections to the VNM utility theorem, part 2

6 May 2013

This is another post about reasons why I don’t agree with the VNM theorem. I’m still focusing on its actual axioms rather than on the broader context in which I object to it because, well, frankly it’s much easier.

Here I would like to object to the idea that preferences over lotteries should be complete. That is, for any pair of lotteries \(A, B\) you should be able to say that \(A \leq B\) or \(B \leq A\). I’m not even going to use lotteries to do it, I’m going to use concrete events.

The point I would like to make is that there are multiple relevant notions of indifference between outcomes. One is “I genuinely don’t care about the difference between these outcomes” and another is “I do not have sufficient information to decide which of these outcomes I prefer”. Although your observed behaviour in these cases may be the same (pick an option at random), they differ in important ways. In particular if you treat them as the same you lose transitivity of preference.

Let me pick a worked example. Suppose you’re running a school, and you’ve got some money to spend, and you want to spend it on improving the quality of education for the kids. You’ve decided you can spend it on two things: hiring more teachers and student lunches. Lest the latter example sound trivial, imagine you’re in some high poverty area where a lot of kids can’t afford to eat properly. It’s pretty well established that if you’re starving all day in school then you’re going to perform badly.

We basically have two numbers here completely describing the problem space we care about (we could introduce additional ones, but they wouldn’t change the problem): Number of children who are adequately fed and number of teachers who are employed.

I’ll even make this easier for you and give us a utility function. All we care about is maximizing one number: Say the number of students who get at least one B grade. This should be a poster child for how well subjective expected utility works as a metric.

Increasing either of these numbers while leaving the other one constant will result in a strictly better scenario (assuming the number of teachers is capped to something sensible so we’re not going to find ourselves in a stupid situation where we have twice as many teachers as students). Adding more teachers is good, increasing the number of students who are well fed is good.

What’s complicated is comparing the cases where one number goes up and the other goes down.

It’s not always complicated. Consider two scenarios: Scenario 1 is that we have 0 students well fed and 2 teachers. Scenario 2 is that we have 20 students well fed and 1 teacher. Scenario 2 is obviously better, supposing we have 50 students in our entire student body. (If the student body were much larger than one teacher could handle then that might be different.)

Now consider Scenario 3: We have 0 students well fed and 1 teacher. This is obviously worse than Scenario 1, since it has the same number of fed students and one fewer teacher.

Now consider the transition from scenario 2 to 3: Take lunch away from students, one at a time. At what point does this flip and become no better than scenario 1? Is it really better all the way down to the last student looking forlornly at you as you take their lunch away? Probably not.

But there isn’t really some concrete point at which it flips. There’s one side on which we regard scenario 1 to be obviously worse and one on which we regard it to be obviously better, and a biggish region in between where we just don’t know.

Why don’t we know? We have a utility function! Surely all we need to do is work out the number of utilons a fed student gives us, the number a teacher gives us, add them up and whichever gets the highest score wins! Right?

Obviously I’m going to tell you that this is wrong.

The reason is that our utility function is defined by an outcome, not by the state of the world. Neither teachers nor hungry students contribute to it. What contributes are results. And those results are hard to predict.

We can make predictions about the extreme values relatively easily, but in the middle there’s honestly no way to tell which one will give better results without trying it and seeing. Sure, we could put on our rationality hats, do all the reading, run complicated simulations etc, but this will take far longer than we have to make the decision and will produce a less accurate decision than just trying it and seeing. In reality we will end up just picking an option semi-arbitrarily.

But this is not the same as being indifferent to the choice.

To see why the difference between ignorance and indifference is important, suppose there are at least two scenarios in the grey area between scenarios 2 and 3. Call them A and B, with A having fewer fed students than B. We do not know whether these are better than scenario 1. We do however know that B is better than A - more students are fed. If we were to treat this ignorance as indifference then transitivity would force us to conclude that we were indifferent between A and B, but we’re not: We have a clear preference.
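Here is a minimal sketch of that argument in Python. The specific numbers for A and B (5 and 8 fed students, 1 teacher each) are made up for illustration, and I'm assuming the only comparisons we are sure of are the ones where one scenario is at least as good on both counts and strictly better on one. Everything else is "don't know", which the sketch then deliberately mislabels as indifference.

```python
from itertools import permutations

# Hypothetical scenarios as (fed_students, teachers).
# S1 is scenario 1 from the post; A and B are invented grey-area points.
scenarios = {"S1": (0, 2), "A": (5, 1), "B": (8, 1)}

def dominates(x, y):
    """True if x is at least as good on both counts and strictly better on one."""
    return x != y and x[0] >= y[0] and x[1] >= y[1]

def strictly_prefers(p, q):
    """The only comparisons we are sure of: p dominates q."""
    return dominates(scenarios[p], scenarios[q])

def treated_as_indifferent(p, q):
    """Collapse 'don't know' into indifference: neither is strictly preferred."""
    return not strictly_prefers(p, q) and not strictly_prefers(q, p)

def transitivity_violations():
    """Triples where p ~ q and q ~ r, yet p and r are strictly ranked."""
    return [
        (p, q, r)
        for p, q, r in permutations(scenarios, 3)
        if treated_as_indifferent(p, q)
        and treated_as_indifferent(q, r)
        and not treated_as_indifferent(p, r)
    ]

violations = transitivity_violations()
print(violations)
```

Any triple this reports is a place where "indifferent to S1" on both sides would force indifference between A and B, even though B is strictly preferred to A.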

Note that this failure of totality of the preference relation occurred even though we are living in the desired conclusion of the VNM theorem. The problem in this case was not that we don’t have a utility function, or even that we want to maximize something other than its expected value. The problem is that the lotteries are hidden from us - we do not know what the probabilities involved are, and finding that out would take more time and resources than we have available.

You can argue that a way out of this is to use a Bayesian prior distribution for this value with our states as inputs. This is a valid way to do it, but without more information than we have those numbers will be pretty damn near guesses and are not much less arbitrary than using our own intuition. Moreover, this has simply become a normative claim rather than the descriptive one that the VNM theorem professes to make.

Comments

Paul Crowley on 2013-05-06 18:39:51:

I think you’re confusing levels here. To subscribe to VNM rationality, you need only to have preferences over actual outcomes, or lotteries over outcomes. So if you can decide whether you’d rather that 20 students get Bs, or a 50/50 chance that either 40 or 10 get Bs, and if your uncertainty about what course of action will produce which outcome is Bayesian rather than Knightian, then you can be VNM rational about the whole thing.

david on 2013-05-06 18:51:33:

Firstly, I don’t think that’s a valid distinction. Everything is an outcome. There’s the shorter term outcome which I have control over (the lunches and teachers) and the longer term outcomes which are the ones I want to optimise for here (the grades), and the yet longer term outcomes that are my true motivation for optimising for that proxy score (the success of the children I’m educating).

Secondly, I think you’ve missed my point: I have deliberately constructed a value function here where I am indeed trying to make decisions based on the expected value of that function. This *still* doesn’t satisfy the completeness axiom, because I am unable to observe the actual lotteries, and I have no sensible way to estimate them. I might be able to come up with some best guesses which would at least draw rough boundaries and narrow the gray area, but that will in a best case scenario still only be a marginal improvement over guessing.

It is certainly the case that you can apply better tools of reasoning here to help you make the decision, and that Bayesian reasoning might even be (likely, is) a valid way to do that, but this is a) At best a modest improvement over naive reasoning strategies without first gathering an awful lot more evidence than we necessarily have and b) Has very much moved us from a descriptive theory to a normative one.

Paul Crowley on 2013-05-06 20:30:03:

I *think* you’re talking about Knightian uncertainty here, am I right? I think it would be hard for me to clear up your confusion in comments since you seem confident about what you’re saying; I’d be delighted to try over a pint sometime :) I think of VNM entirely as normative; we can be pretty sure it’s not descriptive of human behaviour :)

david on 2013-05-06 20:44:56:

What I’m talking about isn’t exactly Knightian uncertainty, but they may be equivalent in practice. The issue is more cases where there is uncertainty you could resolve in principle, but the cost of resolving that uncertainty is comparable to or greater than the cost of just making the decision.

I would be happy to discuss this further over a pint. :-)

Paul Crowley on 2013-05-06 21:56:03:

Excellent, pint it is :)

If you don’t have Knightian uncertainty, then for every T, M and B you can assign P(B grade Bs | T teachers and M meals). Then for every T, M pair you can find E(B | T, M) and whichever delivers the highest on that ranking is preferred.
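Spelled out as a sketch, and assuming \(N\) stands for the total number of students (a symbol not used elsewhere in this thread), the expectation being ranked is:

\[ E(B \mid T, M) = \sum_{b=0}^{N} b \, P(B = b \mid T, M) \]

so that \((T, M)\) is preferred to \((T', M')\) exactly when \(E(B \mid T, M) > E(B \mid T', M')\).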

david on 2013-05-07 08:03:29:

Well, no, the thing is there’s a sliding scale here. On the one end there’s Bayesian “I have a precisely pinned down distribution representing my beliefs about my uncertainty” and on the other there’s Knightian “I cannot resolve my uncertainty even in principle” and in the middle there’s a wide range where you have some rough ideas about what the right models might look like, and you could through a process of reasoning and evidence gathering improve your models, but the effort of doing so is too great to be feasible.

To put it another way, what I’m positing is happening here is that we have a VNM satisfying preference relation, but it’s expensive to compute, so what we actually observe in practice is the subset of the preference relation where we can calculate the answer in some reasonable amount of time. I’m pointing out that in the preference structure constructed this way there is an important structural difference between “I don’t know” (my decision procedure timed out) and “I am indifferent” (my decision procedure returned indifference).
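A rough sketch of that distinction, under loudly stated assumptions: `estimate(option, effort)` is a hypothetical function returning lower and upper bounds on expected utility that tighten as more effort is spent, and `toy_estimate` with its `TRUE_VALUE` table is an invented stand-in for it, not a model of the school example.

```python
from enum import Enum

class Verdict(Enum):
    PREFER_FIRST = "prefer first"
    PREFER_SECOND = "prefer second"
    INDIFFERENT = "indifferent"  # the procedure finished and found no difference
    UNKNOWN = "unknown"          # the procedure ran out of budget

def compare(first, second, estimate, budget):
    """Spend up to `budget` units of effort trying to rank two options."""
    for effort in range(1, budget + 1):
        lo1, hi1 = estimate(first, effort)
        lo2, hi2 = estimate(second, effort)
        if lo1 > hi2:
            return Verdict.PREFER_FIRST
        if lo2 > hi1:
            return Verdict.PREFER_SECOND
        if lo1 == hi1 == lo2 == hi2:
            # Both intervals have collapsed to the same single value.
            return Verdict.INDIFFERENT
    return Verdict.UNKNOWN  # timed out: ignorance, not indifference

# Invented "true" expected utilities, hidden behind the estimator.
TRUE_VALUE = {"x": 10.0, "y": 10.0, "z": 10.5}

def toy_estimate(option, effort):
    """Made-up model: the interval halves with each unit of effort
    and collapses to a point at effort 6."""
    half_width = 0.0 if effort >= 6 else 4.0 / 2 ** effort
    value = TRUE_VALUE[option]
    return (value - half_width, value + half_width)

print(compare("x", "y", toy_estimate, budget=3))
print(compare("x", "y", toy_estimate, budget=10))
print(compare("z", "x", toy_estimate, budget=3))
print(compare("z", "x", toy_estimate, budget=10))
```

With a small budget both pairs come back as `UNKNOWN`, even though one pair really is equal and the other really is ranked. Only with enough budget do the two cases come apart, which is the structural difference an observer who only sees "picked at random" cannot recover.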

Paul Crowley on 2013-05-08 08:36:50:

OK, so it sounds like you’re worried about logical uncertainty. The VNM theorem only applies to logically omniscient agents; decision theory for agents of bounded reasoning capacity is a hard and unsolved problem. However, the right conclusion to draw from this is not “I now discard VNM altogether in favour of what my intuitions told me in the first place”.

david on 2013-05-08 10:30:44:

Firstly, it’s not just bounded rationality that’s the issue here, it’s also bounded evidence. Even if I am a super-intelligent infinitely fast quantum super computer, I may still not have enough evidence about the world to be able to reach a useful conclusion.

Secondly, bounded evidence and rationality are the two single hardest problems in making decisions! A decision theory which assumes I’m infinitely intelligent and in possession of all the facts is about as useful to me as a theory of walking which assumes I have six legs.

Thirdly, I think you’re privileging the hypothesis “VNM theory is correct”. I am not discarding VNM theory, I’m not accepting it in the first place because I find the arguments in favour of it unconvincing and the decision theory it produces unhelpful.

Finally, I am not arguing in favour of “we should just go with our intuitive judgements”. I’m all in favour of ways to improve how we make decisions. I just don’t think overly simplistic mathematical rules which rely on discarding all the things that make the problem actually hard are a good way of doing that.

Paul Crowley on 2013-05-08 15:16:08:

It’s very hard to argue with someone who is both very confused and very confident! Happily it’s much easier in person :)

david on 2013-05-08 15:24:09:

It’s very easy to perceive disagreeing with you as confusion. :-)