This is another post about reasons why I don’t agree with the VNM theorem. I’m still focusing on the actual axioms of it rather than taking it in the broader context that I object to it because, well, frankly it’s much easier.
Here I would like to object to the idea that preferences over lotteries should be complete. That is, for any pair of lotteries \(A, B\), you should be able to say that \(A \leq B\) or \(B \leq A\). I’m not even going to use lotteries to do it; I’m going to use concrete events.
The point I would like to make is that there are multiple relevant notions of indifference between outcomes. One is “I genuinely don’t care about the difference between these outcomes” and another is “I do not have sufficient information to decide which of these outcomes I prefer”. Although your observed behaviour in these cases may be the same (pick an option at random), they differ in important ways. In particular if you treat them as the same you lose transitivity of preference.
Let me pick a worked example. Suppose you’re running a school, you’ve got some money to spend, and you want to spend it on improving the quality of education for the kids. You’ve decided you can spend it on two things: hiring more teachers and student lunches. Lest the latter sound trivial, imagine you’re in a high-poverty area where a lot of kids can’t afford to eat properly. It’s pretty well established that if you’re starving all day in school then you’re going to perform badly.
We basically have two numbers here completely describing the problem space we care about (we could introduce additional ones, but they wouldn’t change the problem): Number of children who are adequately fed and number of teachers who are employed.
I’ll even make this easier for you and give us a utility function. All we care about is maximizing one number: say, the number of students who get at least one B grade. This should be a poster child for how well subjective expected utility works as a metric.
Increasing either of these numbers while leaving the other one constant will result in a strictly better scenario (assuming the number of teachers is capped to something sensible so we’re not going to find ourselves in a stupid situation where we have twice as many teachers as students). Adding more teachers is good, increasing the number of students who are well fed is good.
What’s complicated is comparing the cases where one number goes up and the other goes down.
It’s not always complicated. Consider two scenarios: Scenario 1 is that we have 0 students well fed and 2 teachers. Scenario 2 is that we have 20 students well fed and 1 teacher. Scenario 2 is obviously better (supposing we have 50 students in our entire student body; if the student body were much larger than one teacher could handle, that might be different).
Now consider Scenario 3: we have 0 students well fed and 1 teacher. This is obviously worse than Scenario 1.
Now consider the transition from scenario 2 to 3: Take lunch away from students, one at a time. At what point does this flip and become no better than scenario 1? Is it really better all the way down to the last student looking forlornly at you as you take their lunch away? Probably not.
But there isn’t really some concrete point at which it flips. There’s one side on which we regard Scenario 1 as obviously worse and one on which we regard it as obviously better, and a biggish region in between where we just don’t know.
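One way to make this precise is that our two numbers give us only a *partial* order: a scenario that is at least as good on both counts, and strictly better on one, dominates; anything else is left incomparable. A minimal sketch (the scenario numbers follow the ones above; the comparison function itself is my own illustration, not anything from the theorem):

```python
# A scenario is (students fed, teachers employed). Dominance gives only a
# partial order: if one scenario is at least as good on both dimensions
# (and the scenarios differ), it wins; otherwise the pair is incomparable.

def compare(a, b):
    """Return '>', '<', '=', or '?' (incomparable) for scenarios a and b."""
    if a == b:
        return "="
    if a[0] >= b[0] and a[1] >= b[1]:
        return ">"
    if a[0] <= b[0] and a[1] <= b[1]:
        return "<"
    return "?"  # one number went up, the other went down

scenario_1 = (0, 2)   # 0 students fed, 2 teachers
scenario_2 = (20, 1)  # 20 students fed, 1 teacher
scenario_3 = (0, 1)   # 0 students fed, 1 teacher

print(compare(scenario_3, scenario_1))  # '<'  -- strictly dominated
print(compare((10, 1), scenario_1))     # '?'  -- the grey region
```

Note that even Scenario 2 versus Scenario 1 comes out as `?` under pure dominance; the text calls that pair "obvious" only because of outside judgment about what 20 fed students are worth. The grey region is where even that judgment runs out.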
Why don’t we know? We have a utility function! Surely all we need to do is work out the number of utilons a fed student gives us, the number a teacher gives us, add them up and whichever gets the highest score wins! Right?
Obviously I’m going to tell you that this is wrong.
The reason is that our utility function is defined on outcomes, not on the state of the world. Neither the number of teachers nor the number of fed students contributes to it directly. What contributes are results, and those results are hard to predict.
We can make predictions about the extreme values relatively easily, but in the middle there’s honestly no way to tell which one will give better results without trying it and seeing. Sure, we could put on our rationality hats, do all the reading, run complicated simulations etc, but this will take far longer than we have to make the decision and will produce a less accurate decision than just trying it and seeing. In reality we will end up just picking an option semi-arbitrarily.
But this is not the same as being indifferent to the choice.
To see why the difference between ignorance and indifference is important, suppose there are at least two scenarios in the grey area between scenarios 2 and 3. Call them A and B, with A having fewer fed students than B. We do not know whether either is better than scenario 1. We do however know that B is better than A – more students are fed. If we were to treat this ignorance as indifference then transitivity would force us to conclude that we were indifferent between A and B, but we’re not: we have a clear preference.
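The argument above can be checked mechanically. A sketch, with hypothetical numbers chosen to sit inside the grey region:

```python
# If we collapse "can't compare" into genuine indifference ("~"), then
# transitivity of "~" chains A ~ scenario_1 and scenario_1 ~ B into A ~ B.
# But by dominance B is strictly better than A (more students fed, same
# number of teachers), which contradicts that conclusion.

def prefers(a, b):
    """Strict preference by dominance: a is at least as good on both
    dimensions and the scenarios differ."""
    return a[0] >= b[0] and a[1] >= b[1] and a != b

scenario_1 = (0, 2)   # 0 fed, 2 teachers
A = (8, 1)            # grey region: fewer fed students than B
B = (12, 1)           # grey region: more fed students than A

# Neither A nor B is comparable to scenario 1 by dominance:
assert not prefers(A, scenario_1) and not prefers(scenario_1, A)
assert not prefers(B, scenario_1) and not prefers(scenario_1, B)

# If that incomparability were indifference, transitivity would force
# A ~ B. But we have a clear strict preference:
assert prefers(B, A)
```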
Note that this failure of totality of the preference relation occurred even though we are living in the desired conclusion of the VNM theorem. The problem in this case was not that we don’t have a utility function, or even that we want to maximize something other than its expected value. The problem is that the lotteries are hidden from us – we do not know what the probabilities involved are, and finding that out would take more time and resources than we have available.
You can argue that a way out of this is to use a Bayesian prior distribution for this value with our states as inputs. This is a valid way to do it, but without more information than we have those numbers will be pretty damn near guesses, and are not much less arbitrary than using our own intuition. Moreover, this has simply become a normative claim rather than the descriptive one that the VNM theorem professes to make.
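To see how much work those guesses do, here is a toy version of the Bayesian move. The linear model and both sets of weights are made up by me for illustration, and each set is about as defensible as the other – yet they rank the scenarios in opposite orders:

```python
# Guess the marginal effect of one fed student and of one teacher on the
# number of students earning a B, then compare expected utilities. Two
# equally plausible-sounding guesses flip the ranking.

def expected_bs(scenario, w_fed, w_teacher):
    """Expected number of B students under a toy linear model."""
    fed, teachers = scenario
    return w_fed * fed + w_teacher * teachers

scenario_1 = (0, 2)
A = (8, 1)  # a scenario from the grey region

guess_1 = expected_bs(scenario_1, 0.2, 3.0), expected_bs(A, 0.2, 3.0)
guess_2 = expected_bs(scenario_1, 0.5, 2.0), expected_bs(A, 0.5, 2.0)

print(guess_1)  # (6.0, 4.6)  -- scenario 1 looks better
print(guess_2)  # (4.0, 6.0)  -- A looks better
```

The comparison is entirely driven by parameters we had to invent, which is the sense in which the "prior" is just our intuition wearing a number.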