Lets play a game

Edit: I’ve noticed that my optimisation functions, well, don’t actually optimise. They successfully produce a somewhat optimal value of the function they’re optimising and produce valid tournaments, but in order to produce a global optimum they make certain assumptions about the shape of the function they’re optimising that are not valid. So take any claims of optimality in this post with a pinch of salt.

The game we’re going to play isn’t very fun. But that’s OK, because it’s not actually going to be us playing it. This is a very simple game that is designed as a starting point for exploring some of the properties of the tournament rigging library I’m working on.

The game we’re going to play is good-card bad-card. Each player has a deck. This deck has two types of cards in it: Good and bad.

Play proceeds as follows: Each player draws a card. If one of them draws a good card and the other draws a bad card, the player with the good card wins. Else if they both have the same card, they shuffle their cards back into their respective decks and play another round. They do this until someone wins a round.

The interesting feature of the game comes from the fact that different players have different fractions of good cards in their hands (every player has at least one good card and at least one bad card, preventing the possibility that both of them will be sitting there endlessly drawing the same card). This means that in any given situation one player may have a clear advantage.

Suppose we have players $$0, \ldots, n$$, and suppose player $$i$$ has a fraction of good cards $$g_i$$. Then the probability of player $$i$$ winning a round against player $$j$$ is $$g_i – g_i g_j$$ (because this is the probability of $$i$$ drawing a good card minus the probability of them both drawing a good card). This means the probability of $$i$$ winning the whole game is $$\frac{g_i – g_i g_j}{g_i + g_j – 2 g_i g_j}$$ (because the probability of winning the game is proportional to the probability of winning any given round).

Lets do a worked example of what this looks like:

In [1]: import tournament as t In [2]: import numpy as np In [3]: x = x = np.array([ 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]) In [4]: Ax = t.good_card_bad_card(x) In [5]: t.victory_probabilities(Ax, t.even_selector) Out[6]: array([ 0.00128291, 0.00392595, 0.00865896, 0.01682629, 0.03099347, 0.05650568, 0.10608786, 0.21727215, 0.55844674])

This is the probability of victory when each pair of contenders is picked at random from the set of remaining pairs. In this particular model of tournaments (where there’s no state to pair off people evenly) this seems the “fairest” and, unsurprisingly, the better your score the better your chances of winning. To almost a surprising degree really: The deck with 90% good cards is more than twice as likely to beat the deck with 80% good cards, despite the fact that in a head to head fight it only has a 69% chance of winning.

Two other… I won’t say obvious, but at least somewhat natural, choices are to maximize and minimize the entropy of the results. Maximizing the entropy could be regarded as keeping the competition exciting, minimizing could be regarded as trying to highlight the actually best candidate.

In [7]: t.victory_probabilities(Ax, t.max_entropy_selector) Out[7]: array([ 0.10300318, 0.10353773, 0.08934311, 0.07525026, 0.06495158, 0.05973063, 0.06176968, 0.08015912, 0.36225472])   In [8]: t.victory_probabilities(Ax, t.min_entropy_selector) Out[8]: array([ 0.00162647, 0.00362818, 0.0052852 , 0.00680589, 0.00863982, 0.01165419, 0.01830958, 0.08015912, 0.86389154])

The max entropy tournament is… weird. It results in the lower scoring items being more likely to win than the mid scoring items! Why does that happen? Well, basically because even competitions are higher entropy than unbalanced ones (entropy is approximately “how much information does knowing the result of this event give me on average?”. When an event is a foregone conclusion knowing its outcome gives you very little information).

An interesting thing is how these relate to the tournaments which are rigged for or against the top candidate:

In [9]: t.victory_probabilities(Ax, t.boosting_selector(8)) Out[9]: array([ 0.00162647, 0.00362818, 0.0052852 , 0.00680589, 0.00863982, 0.01165419, 0.01830958, 0.08015912, 0.86389154])   In [10]: t.victory_probabilities(Ax, t.killing_selector(8)) Out[10]: array([ 0.00178168, 0.01591364, 0.01091917, 0.03665989, 0.06297012, 0.11254326, 0.13593336, 0.26102417, 0.36225472])

Interesting to note is that the probability of the top candidate winning is the best possible under the minimum entropy tournament and the worst possible under the maximum entropy tournament. This isn’t terribly surprising given what they’re doing: In the minimum entropy tournament it pays to keep the top candidate around. In the max entropy tournament it pays to kill them off.

A thing in there I’m a little suspicious of is that the min entropy tournament appears to correspond exactly to the tournament designed to kill off the top candidate. This is suspicious because when the top candidate has been eliminated this tournament falls back to even selection, which is not the mininum entropy solution in general. I think this is a bug in my entropy minimization/maximization code which I’m going to have to think about how to solve – the problem is likely that while it correctly chooses the single pair that will optimize the entropy of the child, that’s not actually necessarily the best way to optimize the overall entropy because we can choose a distribution over tournaments. For minimization I think we want to always pick a single tournament, and for maximization I think we might want to pick a more complicated distribution over tournaments. So take this section with a grain of salt: These are still valid tournaments with low and high entropies respectively, but they may not be the global minima and maxima.

One thing that does seem to be correct is if we look at the entropies of all the tournaments designed to boost a single candidate, the entropy decreases as the score of the candidate increases:

In [11]: np.array([t.entropy(t.victory_probabilities(Ax, t.boosting_selector(i))) for i in xrange(9)]) Out[11]: array([ 1.96741165, 1.89865646, 1.80417052, 1.69265673, 1.56505675, 1.41703054, 1.23658575, 0.99217258, 0.5873777 ])

Lets take a look at what the best and worst probabilities for each candidate are:

In[12]: np.array([t.victory_probabilities(Ax, t.boosting_selector(i))[i] for i in xrange(9)]) Out[12]: array([ 0.10300318, 0.18244198, 0.24852718, 0.30746763, 0.36347069, 0.4205425 , 0.48460246, 0.57025324, 0.86389154]) In [13]: np.array([t.victory_probabilities(Ax, t.killing_selector(i))[i] for i in xrange(9)]) Out[13]: array([ 9.35042740e-10, 3.05782789e-07, 8.93073705e-06, 1.02246579e-04, 7.25760000e-04, 3.93070198e-03, 1.83095795e-02, 8.01591233e-02, 3.62254715e-01])

So unsurprisingly these are still increasing with score. Further, your worst candidate can never be that likely to win, and your best candidate is never that likely to lose: The worst victory probability for the top candidate is still more than three times better than the best victory probability for the worst candidate.

I’m not really sure where I’m going with this, except to say “yep, for this very simple game tournaments mostly align with our intuition”. For this simple game with a straight linear ordering of the candidates in terms of quality, you can distort the results quite far, but not impossibly far.

Next I’m going to look at some more interesting games where you can have non-transitive victory probabilities. I’m not yet sure, but I suspect you can get much more interesting behaviour there.

This entry was posted in Uncategorized on by .