I was rereading Epistemology and the Psychology of Human Judgment recently.
One of its major points is the existence of SPRs (Statistical Prediction Rules) for a lot of tasks – simple linear rules that compute a weighted sum of various easy-to-measure statistics and give a binary decision based on that score. For many tasks, SPRs not only outperform human experts, they outperform human experts who have access to the result of the SPR. That is, when someone looks at the SPR's output and decides to defect from its decision, they're wrong more often than they're right.
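A minimal sketch of what such a rule looks like. The features, weights, and threshold here are all made up for illustration – nothing comes from a real instrument:

```python
def spr_predict(features, weights, threshold):
    """Weighted sum of easy-to-measure statistics, thresholded to a yes/no decision."""
    score = sum(w * x for w, x in zip(weights, features))
    return score >= threshold

# Hypothetical predictors: prior offenses, age, whether the first offense was violent.
weights = [1.5, -0.1, 2.0]
print(spr_predict([3, 25, 1], weights, threshold=3.0))  # True: predicts re-offense
```

The whole point of the literature is that something this dumb is hard to beat.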
I have no trouble believing this, but I have some trouble believing it remains true even after people spend time specifically trying to learn how to outsmart the SPR. In particular, I also recently read How to Measure Anything, and one of the points it makes is that it's actually relatively easy to train people to give calibrated probability estimates (e.g. "I am 90% sure this value lies within this range"). If you can teach people to calibrate like that, it seems you should also be able to teach them to calibrate their defections from SPRs.
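To make the calibration idea concrete, here is one way such training is typically scored, sketched with invented numbers: a well-trained estimator's 90% intervals should contain the true value about 90% of the time.

```python
def interval_hit_rate(intervals, truths):
    """Fraction of stated (low, high) intervals that contain the true value.
    A well-calibrated 90%-confidence estimator scores close to 0.90."""
    hits = sum(low <= t <= high for (low, high), t in zip(intervals, truths))
    return hits / len(truths)

# Made-up example: five "90% confident" intervals and the values that turned out true.
intervals = [(10, 20), (0, 5), (100, 200), (3, 4), (50, 60)]
truths = [15, 7, 150, 3.5, 55]
print(interval_hit_rate(intervals, truths))  # 0.8 -> somewhat overconfident
```

The analogous score for SPR defection would be: of the cases where you overrode the rule, what fraction were you right about?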
I was thinking about how you might design an experiment to test this and I accidentally designed a computer game instead. It's in some ways pretty similar to Papers, Please, but with different incentive structures and a twist.
The idea is this: You are a psychologist employed by the justice system. Your job is to interview potential parolees and decide whether they are likely to re-offend. You have an SPR which tells you what it thinks, and it's right about 70% of the time. You then get to conduct a short, several-minute interview, at the end of which you have to state whether you believe they are likely to re-offend.
If you say they are unlikely to re-offend they will get released. At some point later – maybe immediately, maybe quite a while down the line – you may discover that they have done something violent and your recommendation was less than stellar.
Your goal is to not get fired. More accurately, your goal is to last long enough without getting fired that this job looks good on your CV and you can get the hell out of there and go do something you don’t hate.
There are three things that will get you fired:
- Releasing too many people who turn out to be violent. The more violent they are the worse this is – someone who gets into a bar fight won’t count against you much, someone who shoots up a crowded public area will basically get you instant-fired. Also, cases where the rule predicted violence and you disagreed will be judged extra harshly.
- Not releasing enough people. Prisons are crowded, y’know, and we need to be shown to be fair and impartial. To start with you’ll be pretty free on this front but as the game goes on you’ll be under pressure to release more people.
- Agreeing with the rule too often. The rule's accuracy and the game's loss thresholds should be calibrated so that you do pretty well on the above conditions if you always follow the rule – you'll release enough people and most of them won't be very violent. You'll probably last quite a while, but at some point your boss will notice that you're adding literally no value over this much cheaper rule they could be using instead and decide they might as well just do that.
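To tune those loss thresholds, you'd want to know how a pure rule-follower fares. A rough simulation sketch – the base rate, accuracy, and case count are all invented parameters, not game design decisions:

```python
import random

def follow_rule_baseline(n_cases=200, spr_accuracy=0.70, base_rate=0.5, seed=0):
    """How a player who always follows the SPR fares: the fraction of cases
    released, and the fraction of those releases who go on to re-offend."""
    rng = random.Random(seed)
    released = bad_releases = 0
    for _ in range(n_cases):
        will_reoffend = rng.random() < base_rate
        spr_correct = rng.random() < spr_accuracy
        spr_says_reoffend = will_reoffend if spr_correct else not will_reoffend
        if not spr_says_reoffend:  # trust the rule and release
            released += 1
            bad_releases += will_reoffend
    return released / n_cases, bad_releases / max(released, 1)

release_rate, reoffend_rate = follow_rule_baseline()
print(release_rate, reoffend_rate)
```

Setting the firing thresholds just above these baseline numbers is what creates the bind: a rule-follower survives the first two conditions but trips the third.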
The game is a nice mix of an educational game about calibration and learning and a game which forces you to confront the fact that its perverse incentives are making you behave like a terrible person. At the end you should probably also get a score which is the number of people who would have been just fine but whom you cruelly kept locked up.
My major concern is that it might turn out that you can't learn to do better than the SPR, but in a game at least this seems relatively easy to solve: introduce a couple of hidden variables that the SPR doesn't have access to but that are relatively easy for you to deduce, or make the decision boundary more complex than a simple linear rule can easily fit.
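A sketch of the hidden-variable fix: the true outcome depends on a variable the SPR never sees, which caps the rule's accuracy while leaving headroom for an observant player. All weights and distributions here are invented for illustration.

```python
import random

def truly_reoffends(visible, hidden):
    """Toy ground truth: linear in the visible features plus a hidden term."""
    return (1.0 * visible[0] - 0.5 * visible[1] + 0.5 * hidden) > 0

def spr_says(visible):
    """The linear rule, fitted only to what it can see."""
    return (1.0 * visible[0] - 0.5 * visible[1]) > 0

rng = random.Random(1)
cases = [((rng.uniform(-1, 1), rng.uniform(-1, 1)), rng.choice([-1, 1]))
         for _ in range(1000)]
spr_acc = sum(spr_says(v) == truly_reoffends(v, h) for v, h in cases) / len(cases)
print(f"SPR accuracy without the hidden variable: {spr_acc:.2f}")
```

A player who learns to read the hidden variable during the interview can, in principle, win exactly the borderline cases the rule loses – which is the skill the game is supposed to train.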