
A proposal for electoral reform

As you might have gathered, I’m a little bit in favour of electoral reform. I hide it well, but you might have got an inkling of it.

I was very keen on the Yes to AV vote, not because I think AV is great – I’m not even 100% sure it would have been an improvement – but because I think the current system is deeply broken, and because a no vote could be, would be, and indeed was spun into a vote saying that the status quo was A-OK and we shouldn’t change anything.

I don’t think there’s a hope in hell of my preferred choice of voting system ever getting enough popular support. It’s so unlikely that, to be honest, I would never even try – no one is going to be willing to burn political capital over the idea.

But there’s an idea that’s been knocking around in the back of my head for a while which I think could actually work. It’s in some ways a little drastic, but I think with the right spin it could actually be appealing to people.

Here it is, at its simplest:

Each constituency gets to choose its own electoral system.

That’s kinda it. The regions who want to use AV (there weren’t very many of them) get to use AV. The regions who want to use random ballot get to use random ballot (you might get one of them, but I doubt it). If you want to sell range voting to a region, go for it.

Why is this a good idea?

It’s partly because the will to change is often very localized. It’s comparatively much easier to get momentum behind a local campaign. As we demonstrated in 2011, it’s very hard to persuade a nation of 63 million people to change its ways. A region of about 70,000 people, in comparison, experiences a much more coherent set of shared problems. A campaign can target specific issues (this MP has been voted in at every election for the last 20 years; isn’t it time for a change?) and appeal to a more common set of shared experiences.

I also think it just might be a good idea independently of the will to change: it gives you more room to experiment. Whenever you change the voting system wholesale there’s the possibility of a whole general election going catastrophically wrong. When you change it for a couple of regions, the worst case scenario is that you get a couple of rubbish MPs. Insert your own political commentary here. If the changed electoral system appears to work well then its popularity can spread. If it doesn’t appear to work well then it will probably die out.

Here’s how I imagine it might work in practice. This is just a sketch and obviously might have some serious flaws in it.

There is a central registry of acceptable voting systems. This starts out with a single entry: The existing FPTP system.

There is a small (initially) committee whose job it is to vet voting systems. They’re probably appointed. They should consist of a mix of people from different backgrounds – I’d want at least one academic who studies voting systems and one person with practical experience of running elections. Ideally more.

Their job is to assess proposals for electoral systems according to three criteria: anonymity, unbiasedness and practicality of implementation. Unbiasedness here means both “no single voter has more power than any other voter” (in the sense of “if you swapped who cast these two votes, would the election result change?”, not in the sense in which FPTP gives more power to majority parties) and “no single party has an advantage over any other” (again in the vote-swapping sense). Practicality of implementation includes things like “is this completely incomprehensible to people?” (which should be settled by consulting actual people and seeing whether it’s comprehensible to them, rather than by rhetoric). Whether the voting system would be a good idea or not is explicitly outside their remit.

Voting systems are proposed privately rather than by the committee. An initial proposal for a voting system consists of:

  • A description of the implementation of the system, including how people would vote and how the votes would be counted to produce a result. This is not required to completely specify all implementation details but should be unambiguous
  • A list of not more than 5 people who would be responsible for seeing the proposal through to completion
  • A list of at least 10,000 signatures of people who are prepared to endorse the proposal

The committee then makes a decision on whether it is worth progressing with this initial proposal or whether it is fatally flawed (fatal flaws include “you have proposed a dictatorship” or “your proposal requires a supercomputer to implement if there are more than 5 candidates”. They do not include “you’re kidding, right?”). If they think it is worthwhile, they will provide a small grant for turning the proposal into a finished one. This grant should be enough to pay one or two people a modest salary, fund any research that is needed, etc. I have no idea what is practical here.

A finished proposal for the system should include:

  • A guide on how the voting system works for normal citizens who have to cast a vote. This should be purely practical in nature and without spin
  • A detailed account of how the ballots are to be counted

This should be produced through a process of iteration: a draft is submitted to the committee, which either accepts it or rejects it with comments. This may continue indefinitely until the committee decides to put its foot down and say it will not accept any form of this proposal, though the grant will dry up at some point. Hopefully, rather than either of these things happening, the committee will at some point vote to accept the proposal, at which point it enters the list of accepted voting systems.

(There needs to be some process for amendments, etc. I don’t really care what it is. Do something sensible here)

At this point we now have a list of voting systems that are approved for use. Individual constituencies may now choose to switch voting systems. They do this as follows:

At any point a proposal may be put forward by a constituency to switch voting systems. Again, this is privately arranged. The proposal consists of one of the voting systems from the approved list, a suitable deposit (about £10k, probably) and signatures from at least 5% of the constituency saying they would like this voting system to be implemented. Upon the successful submission of a proposal, a simple yes/no vote on switching to that system is triggered within the constituency. There should be about a six month window between the proposal being accepted and the vote, to allow people to properly respond to it. If the yes votes meet some threshold (probably somewhere in the 10-20% region) then the deposit is returned. If more than 50% of the voters vote yes then the constituency has adopted that system, and the next election for an MP there will use it (regardless of whether that election is due to a general election, the death or disqualification of an MP, etc).

If the system changes, it may not be changed again until the new system has been used at least once. If the proposal is voted down, there is a cooling-off period of a year during which no one who put their name to it may sign a new one, and during which the rejected system may not be proposed again. Other people are free to propose different systems. This is to stop people proposing systems simply to lock others out.

Would this work in practice? I don’t know. I think it might. Initially at least you would get very few new systems – it seems likely that it would be a year or more before a new system even made it through the committee, and that you’d be unlikely to see more than a dozen non-FPTP constituencies for a decade or more to come – but I think it would open the door to change, and would avoid the visceral reactions that many people have to specific voting systems.


The horror lurking at the heart of the new hypothesis

I’ve just merged a branch with a redesign of the hypothesis core into master. All the tests are passing, and I’m much happier with the new design than I was with the old one. It takes the idea of the search strategy I mentioned in my previous post and makes it the fundamental building block of the library. This seems to work quite well.

But in doing this rewrite I’ve gone and taken something which was always at the core of hypothesis and made it explicit, and it’s pretty… well actually I don’t want to say it’s bad per se. Using it has actually been very pleasant, and it’s made some very nice things possible that would otherwise have been quite awkward to do.

So while it’s not exactly bad, it’s sure as hell unpythonic.

What is it?

Well, it’s essentially a highly dynamic multiple dispatch object model with prototype based inheritance.

Err, yeah.

Why, David? Why do you do this?

Well, I arrived at it through a sequence of individually logical steps.

Here’s what drove me down this path:

The first requirement:

We need to be able to go from a type to a strategy. e.g. we want to say “give me a strategy for ints”. Or “give me a strategy for strings”. Or “give me a strategy for tuples of (int, str)”.

However, we absolutely don’t want to put the logic for this on the types themselves. We’re not going to monkey patch things to add in a strategy method or anything uncouth like that.

There are two reasons we’re not going to do that:

  1. We want to be able to configure specific strategies per run, e.g. biasing for short lists or only producing alphanumeric strings
  2. It would be gross

Note also that we need to be able to pass instances, not just types here. For example we can ask for a strategy for the tuple (int, str) which will give us a strategy returning pairs where the first is an int and the second a string.

So that’s where the multiple dispatch came from. It’s not implemented in any very clever way, and it is currently closer to type-based overriding than true multiple dispatch because it doesn’t support subtyping (if you define a handler for instances of Foo it will not trigger for instances of subclasses of Foo; I may fix this at some point, but I currently have no need to).
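
To make the shape of this concrete, here’s a minimal sketch of the kind of lookup I mean. The method names mirror the real code shown later in this post, but the Mapper class itself is invented for illustration and is not hypothesis’s actual implementation:

class Mapper(object):
    def __init__(self):
        self.handlers = {}

    def define_specification_for(self, descriptor, handler):
        # handler is any callable that builds a strategy when given the
        # mapper and the descriptor it was looked up under.
        self.handlers[descriptor] = handler

    def strategy(self, descriptor):
        if isinstance(descriptor, tuple):
            # Instances like (int, str) dispatch structurally: build a
            # strategy for each component of the tuple.
            return tuple(self.strategy(d) for d in descriptor)
        # Otherwise look up the exact type. There is deliberately no
        # subtype dispatch: a handler for Foo won't fire for subclasses.
        return self.handlers[descriptor](self, descriptor)

So strategy(int) invokes the handler registered for int, while strategy((int, str)) gives you a pair of strategies, one per component.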

Where does the prototype based inheritance come in?

Well, it started at just level one: there was a single default object which all other lookups would delegate to if needed. The point of this was that you could define new strategy handlers wherever you wanted, by defining them on the default object, while still being able to override them on a fresh local copy.

Then I realised that having full prototype based inheritance was no more work and was actually quite useful. So I provided that.
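
Sketching that on top of the lookup above (again with invented names, not the real code):

class PrototypedMapper(Mapper):
    def __init__(self, prototype=None):
        Mapper.__init__(self)
        self.prototype = prototype

    def new_child_mapper(self):
        # A fresh mapper that inherits everything defined so far.
        return PrototypedMapper(prototype=self)

    def strategy(self, descriptor):
        if isinstance(descriptor, tuple):
            return tuple(self.strategy(d) for d in descriptor)
        mapper = self
        while mapper is not None:
            if descriptor in mapper.handlers:
                # Run the handler against self rather than against the
                # mapper that defined it, so inherited handlers still
                # see any local overrides on the child.
                return mapper.handlers[descriptor](self, descriptor)
            mapper = mapper.prototype
        raise KeyError(descriptor)

The important detail is that inherited handlers are invoked against the child, so locally overriding one descriptor changes what the inherited handlers for compound descriptors produce.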

Why? Let me show you an example of its usage in practice. Here’s how we define the strategy for strings:

class StringStrategy(MappedSearchStrategy):
    def __init__(self, strategies, descriptor, **kwargs):
        SearchStrategy.__init__(self, strategies, descriptor, **kwargs)
        self.length_strategy = strategies.strategy(int)
        char_strategy = kwargs.get("char_strategy",
                                   OneCharStringStrategy)

        cs = strategies.new_child_mapper()
        cs.define_specification_for(str, char_strategy)
        self.mapped_strategy = cs.strategy([str])

    def pack(self, ls):
        return ''.join(ls)

    def unpack(self, s):
        return list(s)

What are we doing here?

Well, MappedSearchStrategy lets you define a search strategy in terms of another one. You then just need to define methods for converting to and from instances of that other type. In this case we are defining the strategy for strings in terms of the strategy for lists of strings.
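
As a sketch of what that might look like mechanically (this is my guess at the shape of the class, not its actual code):

class MappedSearchStrategy(object):
    # In the real library this sits in the SearchStrategy hierarchy; the
    # interesting part for the sketch is the delegation to an underlying
    # strategy, converting at the boundary with pack/unpack.
    def produce(self, size):
        return self.pack(self.mapped_strategy.produce(size))

    def simplify(self, value):
        for simpler in self.mapped_strategy.simplify(self.unpack(value)):
            yield self.pack(simpler)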

Defining strings in terms of lists of strings makes it sound like it’s recursive, but it’s not. Let me draw your attention to the following bit:

        char_strategy = kwargs.get("char_strategy",
                                   OneCharStringStrategy)        
        cs = strategies.new_child_mapper()
        cs.define_specification_for(str, char_strategy)
        self.mapped_strategy = cs.strategy([str])

What do we have here?

We have another strategy for strings which only generates one-character strings (Python doesn’t have a character data type). This is not normally installed anywhere.

We now create a new child mapper. This is a fresh SearchStrategies object whose prototype is the SearchStrategies object we started with. We can now define a strategy for strings on it, which overrides strings for that mapper but leaves the original untouched. When we ask it for a strategy for lists of strings, it uses the list strategy it’s inherited and the string strategy that’s just been defined for it to give a strategy for lists of strings that uses the new one character string strategy.
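
In terms of the earlier sketch, the behaviour is this (the handlers here are hypothetical stand-ins, purely for illustration):

parent = PrototypedMapper()
parent.define_specification_for(int, lambda m, d: "int strategy")
parent.define_specification_for(str, lambda m, d: "full string strategy")

child = parent.new_child_mapper()
child.define_specification_for(str, lambda m, d: "one-char string strategy")

assert parent.strategy(str) == "full string strategy"
assert child.strategy(str) == "one-char string strategy"
assert child.strategy(int) == "int strategy"  # inherited from the parent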

This isn’t the only place it’s used. We also use it in stateful testing in a broadly similar way for generating the individual test steps without interfering with the larger strategies namespace.

Is this all a terrible idea? I dunno. It works pretty well, and it’s allowed me to write some interestingly powerful code, but I do have the feeling that it’s probably a bit too clever and that there might be a better solution.


A possible more principled approach to generation and simplification in hypothesis

Currently the way generation and simplification work in hypothesis is very ad hoc and undisciplined. The API is spread across various places, and the actual behaviour is very under-specified.

There are two operations exposed, produce and simplify. produce takes a size parameter and produces values of the specified type of about the provided size, whereas simplify takes a value of the specified type and produces a generator over simplified variants of it.

The meaning of these terms is deliberately undefined: there is no specified meaning for “about that size” or “simplified variants”.

This post is an attempt to sketch out some ideas about how this could become better specified. I’m just going to use maths rather than Python here – some of the beginnings of this have made it into code, but it’s no more than a starting point right now.

Let \(S\) be a set of values we’ll call our state space. We will call a search tactic for \(S\) a triple:

\[\begin{align*}
\mathrm{complexity} & : S \to [0, \infty) \\
\mathrm{simplify} & : S \to [S]^{< \omega} \\
\mathrm{produce} & : [0, \infty) \to \mathrm{RV}(S)
\end{align*}\]

Where \(\mathrm{RV}(S)\) is the set of random variables taking values in \(S\) and \([S]^{<\omega}\) is the set of finite sequences taking values in \(S\). These operations should satisfy the following properties:

  1. \(\mathrm{h}(\mathrm{produce}(x)) = \min(x, \log(|S|))\)
  2. \(x \to \mathbb{E}(\mathrm{complexity}(\mathrm{produce}(x)))\) should be monotone increasing in \(x\)
  3. \(\mathrm{complexity}(y) \leq \mathrm{complexity}(x)\) for \(y \in \mathrm{simplify}(x)\)

Here \(\mathrm{h}\) is the entropy function.

In general it would be nice if the distribution of \(\mathrm{produce}(x)\) minimized the expected complexity, but I think that would be too restrictive.

The idea here is that the entropy of the distribution is a good measure of how spread out the search space is – a low entropy distribution will be very concentrated, whereas a high entropy distribution will be very spread out. This makes it a good “size” function. The requirement that the expected complexity be monotone increasing captures the idea of the search space spreading out, and the requirement that simplification not increase the complexity captures the idea of moving towards the values more like what you generated at low size.

Here are some examples of how you might produce search strategies:

A search strategy for positive real numbers could be:

\[\begin{align*}
\mathrm{complexity}(x) & = x \\
\mathrm{simplify}(x) & = x, \ldots, x - n \mbox{ for } n < x \\
\mathrm{produce}(h) & = \mathrm{Exp}(e^h - 1)
\end{align*}\]

The exponential distribution seems to be a nice choice because it’s a maximum entropy distribution for a given expectation, but I don’t really know if it’s a minimal choice of expectation for a fixed entropy.

Another example. Given search strategies for \(S\) and \(T\) we can produce a search strategy for \(S \times T\) as follows:

\[\begin{align*}
\mathrm{complexity}(x, y) & = \mathrm{complexity}_S(x) + \mathrm{complexity}_T(y) \\
\mathrm{simplify}(x, y) & = [(a, b) : a \in \mathrm{simplify}_S(x), b \in \mathrm{simplify}_T(y)] \\
\mathrm{produce}(h) & = (\mathrm{produce}_S(\tfrac{1}{2}h), \mathrm{produce}_T(\tfrac{1}{2}h))
\end{align*}\]

The first two should be self-explanatory. The produce function works because the entropy of a product of independent variables is the sum of the entropies of its components. It might also be interesting to distribute the entropy less uniformly between the components, but this is probably simplest.

You can also generate a search strategy for sequences: given a search strategy for the natural numbers \(\mathbb{N}\) and a search strategy for \(S\), we can generate a search strategy for \([S]^{< \omega}\). If we define the complexity of a sequence as the sum of the complexities of its components, we can define its simplifications as coordinatewise simplifications of subsequences. produce is the only hard one to define. It goes as follows: we generate the length as a random variable \(N = \mathrm{produce}_{\mathbb{N}}(i)\) for some entropy \(i\), then allocate a total entropy \(j\) to the sequence of length \(N\), so the coordinates are generated as \(x_k = \mathrm{produce}_S(\frac{j}{N})\). Let \(T\) be the random variable of the sequence produced. The value of \(N\) is completely determined by \(T\), so \(h(T) = h(T, N)\). We can then use conditional entropy to calculate this: we have that \(h(T \mid N = n) = j\), because we allocated the entropy equally between the coordinates.

So \(h(T) = h(N) + h(T \mid N) = i + j\), and we can allocate the entropy between coordinates and length as we wish – either an equal split, or biasing in favour of longer sequences with simpler coordinates, or shorter sequences with more complex coordinates.
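
To check that these definitions hang together, here’s a rough Python sketch of the positive-real and product tactics, plus the sequence produce function. The names are mine, and I’m reading \(\mathrm{Exp}(e^h - 1)\) as an exponential with mean \(e^h - 1\), so that expected complexity grows with the entropy budget. Treat this as a sketch under those assumptions, not as code from hypothesis:

import math
import random

class PositiveFloatTactic(object):
    # The positive-real tactic sketched above.
    def complexity(self, x):
        return x

    def simplify(self, x):
        # x, x - 1, ..., stepping down while staying non-negative.
        return [x - n for n in range(int(x) + 1)]

    def produce(self, h):
        # Exponential with mean e^h - 1: maximum entropy for its mean,
        # and the mean (hence expected complexity) increases with h.
        mean = math.exp(h) - 1
        return random.expovariate(1.0 / mean) if mean > 0 else 0.0

class ProductTactic(object):
    # Combine tactics for S and T into a tactic for S x T.
    def __init__(self, left, right):
        self.left, self.right = left, right

    def complexity(self, pair):
        x, y = pair
        return self.left.complexity(x) + self.right.complexity(y)

    def simplify(self, pair):
        x, y = pair
        return [(a, b)
                for a in self.left.simplify(x)
                for b in self.right.simplify(y)]

    def produce(self, h):
        # Entropy is additive over independent components, so give
        # each component half the budget.
        return (self.left.produce(h / 2.0), self.right.produce(h / 2.0))

def produce_sequence(h, length_tactic, element_tactic, split=0.5):
    # Split the budget h between the length (i) and the coordinates (j),
    # per the conditional entropy argument: h(T) = i + j.
    i, j = split * h, (1 - split) * h
    n = int(length_tactic.produce(i))  # assumed to yield a natural number
    return [element_tactic.produce(j / n) for _ in range(n)]

Varying split is exactly the choice described above between longer sequences with simpler coordinates and shorter sequences with more complex ones.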

Anyway, those are some worked examples. It seems to work reasonably well, and is more pleasantly principled than the current ad hoc approach. It remains to be seen how well it works in practice.


Another new hypothesis feature: flags

This has the virtue of being the first hypothesis feature that isn’t blatantly stolen, excuse me, “ported”, from Scalacheck.

Instead it’s blatantly stolen from a post by John Regehr about Csmith.

Update: John Regehr pointed out in the comments that I was misattributing this idea to him, and it was in fact due to Alex Groce.

What’s the idea?

The idea is basically to change the distribution of the search inputs. Rather than having every probe explore the whole space more or less uniformly at each size, each probe into the space is more focused. This produces examples which are in some sense more “interesting”.

How does this work?

The idea is that in any given run of the producers we have a set of flags which may be turned on and off. Let me show an example:

from math import log
from random import random

@produces(int)
def produce_int(self, size):
    # Inverse-CDF sampling of a geometric distribution whose expected
    # magnitude grows with size.
    p = 1.0 / (size + 1)
    n = int(log(random()) / log(1 - p))
    if self.is_flag_enabled("allow_negative_numbers", size):
        if random() <= 0.5:
            n = -n
    return n

So on some runs we will be allowed to produce negative numbers, on some we won’t.
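
I haven’t said how is_flag_enabled decides. One plausible shape for it (an assumption on my part, not the actual implementation) is that each run fixes its set of enabled flags up front, so the answer is consistent across a whole run:

import random

class FlagSet(object):
    # A guess at the shape of the flag machinery, not the real code.
    def __init__(self):
        self.enabled = {}

    def is_flag_enabled(self, name, size):
        # Decide once per run and remember the answer, so a given run
        # either allows negative numbers throughout or not at all. The
        # size parameter could be used to bias the probability; a plain
        # coin flip keeps the sketch simple.
        if name not in self.enabled:
            self.enabled[name] = random.random() <= 0.5
        return self.enabled[name]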

This means that we can, for example, produce very long lists of only positive integers because negative integers were turned off for this run. Previously a list would be all-positive with probability \(2^{-n}\) (where \(n\) is the length of the list), so for example the following would have been unfalsifiable:

from hypothesis import assume, falsify
 
def long_lists_contain_negative_elements(xs):
    assume(len(xs) >= 50)
    assert any(a < 0 for a in xs)
    return True
 
falsify(long_lists_contain_negative_elements, [int])

But with the new flag based code it’s easy to falsify.

The example that actually motivated this was the stateful testing one in the previous post. If we add a “clear” operation to it and allow that as a test step then we become unable to find falsifying examples because the clear step stomps on things too regularly to ever generate long enough examples.

This is also illustrative of why this is useful in general: Bugs will typically be exhibited by the interaction of one or two features. When we’re looking at all the features we may not exercise the interaction of those two enough to actually test them properly, but when we selectively disable features we can study pairwise interactions much more thoroughly.

Currently the way this is implemented internally is quite gross. I wouldn’t look at it if I were you. I’m mostly focused on hypothesis externals right now, and will think about how to better redesign its insides at a later date.

Update: This feature has been temporarily removed due to me being really unhappy with the implementation and it getting in the way of some internals I’m refactoring at the moment. It will be back before the next released version of hypothesis.


Stateful testing with hypothesis

The idea of stateful testing is that it is “quickcheck for mutable data structures”.

The way it works is that rather than trying to produce arguments which falsify a hypothesis, we instead try to produce a sequence of operations which break a data structure.

Let me show you. We’re going to start with the following broken implementation of a set:

class BadSet:
    def __init__(self):
        self.data = []
 
    def add(self, arg):
        self.data.append(arg)
 
    def remove(self, arg):
        for i in xrange(0, len(self.data)):
            if self.data[i] == arg:
                del self.data[i]
                break
 
    def contains(self, arg):
        return arg in self.data

Because it uses a list internally to store its items and doesn’t check whether an item is already present when adding it, if you add an item twice and then remove it, the item will still be there.
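
Concretely (this is just the class above, exercised by hand):

>>> s = BadSet()
>>> s.add(0)
>>> s.add(0)
>>> s.remove(0)
>>> s.contains(0)  # the duplicate survives the removal
True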

(Obviously this is a really stupid example, but it should demonstrate how it works for more complicated examples)

Now let’s write some tests that break this!

The operations we want to test are adding and removing elements. So we do the following:

from hypothesis.statefultesting import StatefulTest, step, requires

class BadSetTester(StatefulTest):
    def __init__(self):
        self.target = BadSet()

    @step
    @requires(int)
    def add(self, i):
        self.target.add(i)
        assert self.target.contains(i)

    @step
    @requires(int)
    def remove(self, i):
        self.target.remove(i)
        assert not self.target.contains(i)

We can now ask this to produce us a breaking example:

>>> print BadSetTester.breaking_example()
[('add', 0), ('add', 0), ('remove', 0)]

Note the nicely minimized example sequence.

At the moment this code is very much a work in progress – what I have works, but should probably be considered a sketch of how it might work rather than a finished product. As such, feedback would very definitely be appreciated.
