Test-Case Selection and Choice Theory

Attention conservation notice: Honestly you probably don’t care about this post unless the title sounds really intriguing to you. I’m not sure this matters at all.


At its most abstract, the test case selection problem is a function \(T : \mathcal{P}(A) \setminus \{\emptyset\} \to A\) such that \(T(U) \in U\). i.e. we have some set \(A\) of possible test cases, a bug that occurs in some non-empty subset of the test cases, and we want to select a test case to represent the bug. Call such a \(T\) a choice function.

(The axiom of choice states that choice functions always exist, but our sets of test cases are actually always finite, so we don’t need that here).

This elides some complexity in that:

  1. Our test case selection may be non-deterministic. In this case we can (by fixing the set of random choices made) treat this as a choice-function valued random variable, so it still reduces to this.
  2. The test-case selection might fail. We can ignore this problem by assuming that it will succeed eventually (assuming it doesn’t do something silly) and just running it until it does. For some methods this may take rather a long time and we might have to worry about implementation details like the heat death of the universe, but ¯\_(ツ)_/¯
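To make the first point a bit more concrete, here is a minimal sketch in Python. The name `random_choice_function` is hypothetical, and picking uniformly at random is only a stand-in for whatever randomised selection you actually use. The point is just that once the seed is fixed, what comes back is an ordinary deterministic choice function.

```python
import random


def random_choice_function(seed):
    """Return a deterministic choice function determined by `seed`.

    Hypothetical illustration: all of the randomness lives in the seed,
    so each seed gives one fixed choice function.
    """

    def T(U):
        if not U:
            raise ValueError("choice functions are only defined on non-empty sets")
        ordered = sorted(U)
        # Seed from both the outer seed and the set itself, so the answer
        # depends only on (seed, U) and not on the order of calls.
        rng = random.Random(f"{seed}:{ordered}")
        return rng.choice(ordered)

    return T
```

Drawing the seed at random then gives a random variable whose values are choice functions.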

This is actually a coincidence of naming for the most part, but choice functions are interestingly related to choice theory: The problem of expressing a preference between alternatives, and describing axioms of “rationality” for doing so.

One interesting axiom of rationality is that of contraction consistency. Contraction consistency is the requirement that if \(T(A) \in B \subseteq A\) then \(T(B) = T(A)\). i.e. if you picked \(T(A)\) as the best element of \(A\), removing elements from \(A\) that aren’t \(T(A)\) shouldn’t change your opinion!
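For a small universe you can check contraction consistency by brute force. The following is a sketch rather than anything you would want to run at scale (it enumerates every subset of every subset), and the names `is_contraction_consistent`, `pick_min` and `pick_median` are made up for illustration.

```python
from itertools import combinations


def nonempty_subsets(universe):
    """Yield every non-empty subset of `universe` as a frozenset."""
    items = sorted(universe)
    for size in range(1, len(items) + 1):
        for combo in combinations(items, size):
            yield frozenset(combo)


def is_contraction_consistent(T, universe):
    """Check that T(B) == T(A) whenever T(A) is in B and B is a subset of A."""
    for A in nonempty_subsets(universe):
        chosen = T(A)
        for B in nonempty_subsets(A):
            if chosen in B and T(B) != chosen:
                return False
    return True


def pick_min(U):
    """Choose the numerically smallest element."""
    return min(U)


def pick_median(U):
    """Choose the lower median of the sorted elements."""
    ordered = sorted(U)
    return ordered[(len(ordered) - 1) // 2]
```

Here `pick_min` passes the check on `range(5)`, while `pick_median` fails it: it picks \(2\) from \(\{1, 2, 3\}\) but \(1\) from \(\{1, 2\}\).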

On the face of it this seems reasonable, but it actually imposes a very strong restriction on what \(T\) can look like.

Theorem: If \(T\) is contraction consistent then \(T(A) = \min\limits_\prec A\) for some total order \(\prec\).

Proof:

Define \(a \prec b\) if \(a = T(\{a, b\})\). This is antisymmetric because the set \(\{a, b\}\) doesn’t depend on the order of \(a\) and \(b\) (so if \(a \prec b\) and \(b \prec a\) then \(a = T(\{a, b\}) = b\)), reflexive because \(T(\{a\}) \in \{a\}\), and total because \(T(\{a, b\}) \in \{a, b\}\), so either \(a \prec b\) or \(b \prec a\).

So to show that it’s a total order we now need to show that it’s transitive. Suppose \(a \prec b\) and \(b \prec c\). Then \(T(\{a, b, c\}) = a\): If it were \(c\) then this would violate contraction consistency when considering \(\{b, c\}\), and if it were \(b\) then it would violate it when considering \(\{a, b\}\). Now, by contraction consistency, \(T(\{a, c\}) = T(\{a, b, c\}) = a\), so \(a \prec c\).

Now suppose \(b \in A\). By contraction consistency, \(T(\{T(A), b\}) = T(A)\). Therefore \(T(A) \prec b\) for every \(b \in A\), and so \(T(A) = \min\limits_\prec A\) as desired.

QED
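The construction in the proof can be mirrored in code: build \(\prec\) from what \(T\) does on pairs, then check that \(T\) always returns the \(\prec\)-minimum. This is only a sketch for small finite universes, and `derived_order_key`, `agrees_with_derived_minimum` and `prefer_even` are hypothetical names.

```python
from functools import cmp_to_key
from itertools import combinations


def derived_order_key(T):
    """Sort key for the relation: a ≺ b  iff  T({a, b}) == a."""

    def compare(a, b):
        if a == b:
            return 0
        return -1 if T({a, b}) == a else 1

    return cmp_to_key(compare)


def agrees_with_derived_minimum(T, universe):
    """Check that T(U) is the ≺-minimum of U for every non-empty subset U."""
    key = derived_order_key(T)
    items = list(universe)
    for size in range(1, len(items) + 1):
        for combo in combinations(items, size):
            U = set(combo)
            if T(U) != min(U, key=key):
                return False
    return True


def prefer_even(U):
    """A contraction-consistent choice that is not the numeric minimum:
    prefer even numbers, and break ties by taking the smaller one."""
    return min(U, key=lambda n: (n % 2, n))
```

For `prefer_even` the derived order puts every even number before every odd one, and the check succeeds on `range(6)`. A choice function that is not contraction consistent can fail the check.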

What this means in practice is that the only contraction consistent test-case selection methods must be equivalent to bounded exhaustive enumeration: You have some total ordering over your test cases, and in order to return a test case \(a \in A\) you must have verified that \(b \not\in A\) for every \(b \prec a\). This is potentially very expensive if all you have is membership queries for \(A\)! If you have more structure on \(A\) (e.g. because of symbolic execution) then you could potentially rule out this happening without actually performing those membership queries.
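As a rough illustration of the cost, here is bounded exhaustive enumeration with nothing but a membership query, counting how many queries it makes. The names `select_by_enumeration`, `ordered_cases` and `is_failing` are assumptions for the sake of the sketch.

```python
def select_by_enumeration(ordered_cases, is_failing):
    """Return the first failing case in order, plus the number of queries made.

    `ordered_cases` lists the candidates from ≺-smallest to ≺-largest and
    `is_failing` is the membership query for the set of failing test cases.
    Both names are hypothetical.
    """
    queries = 0
    for case in ordered_cases:
        queries += 1
        if is_failing(case):
            return case, queries
    raise ValueError("no failing test case among the candidates")
```

For example, `select_by_enumeration(range(1000), lambda n: n >= 700)` returns `(700, 701)`: one query for each of the 700 smaller candidates that had to be ruled out, plus one for the answer itself.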

If you adopt the classic approach of generating a random test case then running test-case reduction on it, you will typically require substantially fewer membership queries for the set, but the \(\prec\) relationship in the above proof may not even be transitive!

Say, for example, that we have \(A = \mathbb{N}\), and our test case reduction algorithm consists of iterating the operations \(n \to n / 2\) and \(n \to n - 1\) to a fixed point as long as they remain in the set. So for example if the bug occurs exactly on the even numbers and we start from \(8\) then we will go \(8 \to 4 \to 2\), having tried \(7, 3, 1\) and concluded they weren’t in the set. If we’d started from \(10\) though we’d have become stuck because both \(10 - 1\) and \(10 / 2\) are odd. This is then an example of intransitivity, because we have \(5 \prec 10\), and \(4 \prec 5\), but \(4 \not\prec 10\) (starting from \(10\) in the set \(\{4, 10\}\), neither \(9\) nor \(5\) is in the set, so we never reach \(4\)), and this intransitivity is in large part responsible for our failure to find a global minimum.
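Here is a minimal sketch of that reducer. The order in which the two operations are tried (\(n - 1\) first, then \(n / 2\)) and the use of integer division are assumptions on my part, chosen so that the rejected candidates match the ones listed above. The function name is hypothetical.

```python
def reduce_test_case(start, in_set):
    """Greedy reducer: try n - 1, then n // 2, and repeat until stuck.

    Returns the final value together with the candidates that were tried
    and rejected along the way. The order of the two operations is an
    assumption, not something the description above pins down.
    """
    current = start
    rejected = []
    while True:
        tried = set()
        for candidate in (current - 1, current // 2):
            # Skip negatives, no-op moves and repeats within this round.
            if candidate < 0 or candidate == current or candidate in tried:
                continue
            tried.add(candidate)
            if in_set(candidate):
                current = candidate
                break
            rejected.append(candidate)
        else:
            # Neither operation produced a member of the set: fixed point.
            return current, rejected
```

With `is_even = lambda n: n % 2 == 0`, `reduce_test_case(8, is_even)` returns `(2, [7, 3, 1])` and `reduce_test_case(10, is_even)` returns `(10, [9, 5])`. The intransitivity shows up on the two-element sets: starting from the larger element, the reducer ends at \(5\) on \(\{5, 10\}\) and at \(4\) on \(\{4, 5\}\), but stays at \(10\) on \(\{4, 10\}\).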

(Note that it would be perfectly possible to have \(\prec\) be transitive and have \(T\) not be contraction-consistent – just make \(T\) do something different whenever \(|A| > 3\). Contraction consistency implies the transitivity of \(\prec\), but we still needed the full consistency to show that \(T(A)\) was the \(\prec\)-minimum).

This is roughly equivalent to the observation that classic choice theory only really works for logically omniscient agents: Test-case reduction is actually a boundedly rational agent that is unable (or unwilling) to run exponential time algorithms, while a logically omniscient agent is perfectly happy to do that and considers it a free action.

There is also an interesting topological perspective that allows us to firm up what we mean by \(T\) being “black box”, but I think I’ll leave that for another time.

