So in the type of game we’re playing (part 1, part 2), we play a binary tournament.

Formally, we have a set \(\mathcal{C} = \{1, \ldots, n\}\). A *tournament* on these candidates is a function \(f\) on the subsets of \(\mathcal{C}\) of size \(> 2\) such that \(f(A)\) is a probability distribution over two-element subsets of \(A\). Let \(\mathcal{T}(X)\) be the set of tournaments on \(X\) (the set of tournaments on a set of fewer than two elements is defined but empty)

Given tournaments \(t_1, \ldots, t_n\) we can form a convex combination of them by playing tournament \(i\) with probability \(p_i\) at each step. Some inspection shows that every tournament is a convex combination of (potentially rather a lot of) deterministic tournaments – that is, tournaments such that at each step they produce a single pair with probability 1.

How many such deterministic tournaments are there?

Well, there are \(n \choose k\) subsets of size \(k\). Each of these then has \(k \choose 2\) possibilities to map to. So there are

\(\tau(n) = \prod\limits_{k=3}^n {k \choose 2}^{n \choose k}\)

This is O(pretty hardcore). \(\tau(3) = 3\), which is fine, but \(\tau(4) = 486\) and \(\tau(5) = 4591650240\). I’ve not done precise bounds calculations, but it’s at least \(O(2^{2^n}\).

Anyway, the point being that this space is bloody large. Although convex and linear minimizations are a thing that we can do relatively easily, this space is so infeasibly large that this becomes not an option. You probably *can* run the simplex algorithm on a 4591650240 dimensional space, but it’s going to take a long time

So why did I think I could do optimisations on it?

Well for some functions, you can. Unfortunately this turns out to be a subset of the functions I used the relevant algorithm for. Basically the question is whether you can optimise a tournament based on optimising its subtournaments. If you can then this is “easy” in the sense of only having to perform \(O(2^n)\) optimisations on \(O(n)\) dimensional spaces. Compared to our above super-exponential function this isn’t so bad at all.

Let \(t\) be a tournament and let \(A \subseteq C\). Through a slight abuse of notation, define the restriction \(t|_A\) to be the restriction of \(t\) to subsets of \(A\).

Let \(V\) be a victory distribution. That is, \(V : \mathcal{C}^2 \to \mathbb{R}\) with \(0 \leq V(i, j) \leq 1\) and \(V(i, j) = 1 – V(j, i)\). \(V(i, j)\) is the probability that \(i\) beats \(j\).

We define the *collapse* of a tournament \(t\) with respect to \(V\), which we will write \(c(t, V)\), to be the probability distribution on \(\mathcal{C}\) you get by playing the tournament with this victory function and returning the winner: That is, at each stage from the remaining candidates you use the tournament to pick a pair and the victory function to decide which one of that pair drops out.

Theorem: Let \(f : \bigcup\limits_{A \subseteq \mathcal{C}} \mathcal{T}(A) \to \mathbb{R}\). Suppose there exists a victory function \(V\) and a linear function \(g : \mathcal{L}(\mathcal{C}) \to \mathbb{R}\) (where \(\mathcal{L}(A)\) is the set of random variables taking values in \(A\)) such that \(f(t) = g(c(t, V))\). That is, \(f\) is \(g\) applied to the victory distribution. Then \(f|_{\mathcal{T}(\mathcal{C})}\) is maximized by a tournament \(t\) such that for every \(A \subseteq T\), \(t|_\mathcal{A}\) also maximizes \(f_{\mathcal{T}(A)}\). Further, every tournament maximizing \(f_{\mathcal{T}(A)}\) is the restriction of one which maximizes \(f|_{\mathcal{T}(\mathcal{C})}\). Further, these maxima or minima are achieved by a deterministic tournament.

Proof:

This is because \(c(t|_A, V)\) is a non-negative linear combination of \(\{ c(t|_{A \setminus \{x\}}, V) : x \in A\}\) and thus since \(g\) is linear, \(f(t_A)\) is a non-negative linear combination of \(\{f|_{A \setminus \{x\}} : x \in A\}\). Thus any increase or decrease in these corresponds to an increase or decrease in \(f(t_A)\). Further, because \(f(t_A)\) is only a function of the \(f(t_{A \setminus \{x\}})\), any maxima/minima will work.

Why can we pick a deterministic tournament? Well because linear functions take their extreme values on extreme points, and the extreme points are the deterministic tournaments.

QED

*In particular* this works absolutely fine for finding the best and worst victory probabilities for a given candidate, because these are simple linear functions of the victory distribution (indeed they’re just coordinate functions). So those numbers I quoted last time are perfectly fine.

However the entropy numbers I quoted… might be the most extreme due to the extreme simplicity of the example I used, but in general adopting this approach to finding entropy extreme is completely bogus. Because entropy is concave the entropy maximization approach will probably work pretty well (because the entropy is greater than the entropy of the subtournaments). It might even find the maximum entropy *deterministic* tournament, I’m not entirely sure, but it will not in general find the maximum.

Pingback: sobre C | Physics.MVR