Author Archives: david

Looking into doing a PhD

As regular readers of this blog have probably figured out, I’m a researchy sort of person.

A lot of my hobbies – maths, voting theory, weird corners of programming, etc – are research oriented, and most of my work has had some sort of research slant to it.

The last two years I’ve basically been engaged in a research project working on Hypothesis. It’s come quite far in that time, and I feel reasonably comfortable saying that it’s the best open source property based testing library on most metrics you’d care to choose. It has a number of novel features and implementation details that advance the state of the art.

It’s been pretty great working on Hypothesis like this, but it’s also been incredibly frustrating.

The big problem is that I do not have an academic background. I have a masters in mathematics (more technically I have a BA, an MA, and a CASM. Cambridge is weird. It’s entirely equivalent to a masters in mathematics though), but that’s where I stopped. Although it says “DR” in my online handle and the domain of this blog, those are just my initials and not my qualification.

As a result, I have little to no formal training or experience in doing academic research, and a similarly low understanding of who’s who and what’s what within the relevant fields. So I’ve been reading papers and trying to figure out the right people to talk to all on my own, and while it’s gone OK it’s still felt like fumbling around in the dark.

Which leads to the obvious solution that I spoilered in the title: If the problem is that I’m trying to do research outside of an academic context, the solution is to do research in an academic context.

So I’d like to do a PhD that is either about Hypothesis, or about something close enough to Hypothesis that each can benefit from the other.

There’s probably enough novel work in Hypothesis already that I could “just” clean it up, factor it out, and turn it into a PhD thesis as it is, but I’m not really expecting to do that (though I’d like that to be part of it). There are a number of additional directions that I think it would be worth exploring, and I expect most PhD funding will come with a focus subject attached which I would be happy to adapt to (a lot of the most interesting innovations in Hypothesis came because some external factor forced me to think about things in ways I wouldn’t otherwise have!). If you’d like to know more, I’ve written up a fairly long article about Hypothesis and why I think it’s interesting research on the main Hypothesis site.

Which, finally, brings me to the main point of the post: What I want from you.

I’m already looking into and approaching potential universities and interesting researchers there who might be good supervisors or able to recommend people who are. I’ve been in touch with a couple (some of whom might be reading this post. Hi), but I would also massively appreciate suggestions and introductions.

So, if you work in relevant areas or know of people who do and think it would be useful for me to talk to, please drop me an email at [email protected]. Or just leave a comment on this blog post, tweet at me, etc.

This entry was posted in Hypothesis, life, programming, Python on by .

Laziness is better when it’s visible

This is a trick I invented a while ago. I’m morally certain it’s a reinvention rather than an invention, but I’ve not really seen it in use and at the very least it doesn’t seem to be widely known. I recently ran into a situation where a library would benefit from it greatly and doesn’t use it, so I thought I would write it up.

Suppose we have a bunch of strings and we want to concatenate them.

def cat(my_strings):
    result = ''
    for s in my_strings:
       result += s
    return result

That’s how you concatenate strings, right?

Oh no, accidentally quadratic!

You can fix this using a special string buffer type, or in python by just using ”.join(my_strings), but wouldn’t it be nice if you didn’t have to? It’s often very convenient to build things up using expressions. Although it’s no great hardship for strings to do bulk operations, you run into the same problem in e.g. pulp where you have more complex expressions (and no corresponding sum method in the library). It would be great if this all just worked.

One way to do this sort of thing is to switch to an immutable tree based representation like a rope where the concatenation operation has a more reasonable complexity (usually log(n)).

But that then comes with its own costs. Using a tree structure slows down access and iteration – only by an O(log(n)) factor, but with relatively high constants. As a result you end up paying a relatively high performance penalty for what was mostly just wanted as a convenience (ropes do have other advantages, but that’s not what we’re looking at here).

But what if you didn’t have to do any of that? What if you could get O(1) concatenation and all the benefits of the compact representation? It sounds implausible, but I’m here to tell you you can!

Or, rather, that you almost can. Both of the above are true: You do get O(1) concatenation and you do get all the benefits of the compact representation, but you do end up paying some additional cost, because the idea is to use laziness. So although concatenation is O(1) you end up paying an additional O(n) cost the first time you want to use the result. Fortunately this still avoids the problem of sums being quadratic.

The key idea is to use a data structure that can swap out its implementation on demand. It starts by just storing the abstract expression tree that lead to it, and then switches to a more efficient representation as soon as you need it to.

e.g. the version of this for the above is that a string becomes a binary tree where the leaves are buffers and the branches indicate that the string is a concatenation of its left and right parts. Concatenation is then just creating a new split node, which is O(1).

Then, once we want the compact representation (which will tend to be as soon as we start doing interesting operations on the data – because the expression tree is entirely unnormalized there is very little we can usefully do to it that isn’t an O(n) operation!), we calculate that, store the result on the string and throw away the expression data that brought us here.

That is, as soon as we have forced the string, the string’s switches to a new representation using the forced buffer, essentially replacing the split node with a leaf node.

This feels like we’re back where we started – if you’re doing this lazily like that then you’re just summing together two string children so you’re quadratic again – but we’re not, for one very important reason: Because the implementation of the laziness is under our control, we can tell whether a string has already been forced or not. When forcing a node we then don’t force its child nodes, but instead just walk the tree and behave appropriately when we get to the leaves.

This sort of thing can be very useful, because the common cases where this runs into issues is that you have a complex expression graph and only actually care about a very small fraction of the subexpressions (e.g. in the sum case).

This isn’t always a win, in that it does behave suboptimally under some workloads (e.g. when you do care about a lot of the intermediate results but process them in the reverse of the order you created them), but it’s rarely a substantial loss and usually produces dramatic speedups by converting accidentally quadratic cases into the desired linear behaviour.

There are additional tricks you can build on top of this:

  • You can precompute some data so you don’t always have to force the structure. e.g. you can always calculate the length of the string in the above example without forcing it and still have the operations be O(1)
  • you can sometimes have operations that only require partially forcing the data structure (e.g. if you index into a string you might only have to force one half of it (or neither if the index is out of bounds!)
  • If you have more complex operations then you can do a sort of “query optimization” to rewrite the expression tree into a more efficient execution plan. For example, a thing I’ve done in the past is when the operation is intersection you can rewrite it so that intersections are processed in order of increasing size, which often ends up with you being able to terminate early because you’ve discovered that the end result is going to be empty regardless of what happens next.

Depending on circumstances, any of the above can be worth doing, but most of the big wins come from the basic trick which is almost always a valuable one if you’re running into this sort of problem.

Using this technique won’t always be an improvement – e.g. you’ll end up doing some duplicated work if you do something like x + x because it will effectively recompute forcing x twice – but most of the work loads on which it will behave particularly badly are ones that you should probably have avoided anyway with the normal approach. The only real downsides where you do suffer a hit from using this is that the laziness adds an additional check to each operation, which can be anywhere between a small and modest performance hit depending on how expensive the operation normally is. Additionally, if you want the operations to be thread safe then you’ll need a memory barrier of some sort (e.g. making the relevant field volatile) to get the visibility right, which adds another small hit.

So it’s not a universal win, but the cost is light enough and there are enough work loads where it improves behaviour substantially that it is often worth considering.

To finish off, and make this more concrete, here’s some Python code implementing this idea for strings:

(Like this post? Want to see more like it? Why not support my Patreon! You’ll get to see drafts of upcoming posts and also increase the amount I write)

This entry was posted in programming, Python on by .

How likely are Condorcet cycles in supermajorities?

You can use a supermajority rule to eliminate the possibility of Condorcet cycles. e.g. if you require a \(\frac{2}{3}\) supermajority there is no possibility of creating a Condorcet triple with the majority strictly preferring A to B, B to C and C to A.

Unfortunately that’s only the three candidate case. If you consider the \(N\) candidate case you need a supermajority of \(1 – \frac{1}{N}\) to achieve the same result (just consider the \(N\) cyclic permutations of the candidates and you get a generalisation of the Condorcet example where every candidate beats every other this fraction of the time).

So there is no supermajority rule that in general prevents Condorcet cycles.

In “The Probability of Condorcet Cycles and Super Majority Rules”, Yves Balasko and Hervé Crès show the following asymptotic result: If you consider a profile of votes chosen uniformly at random on the set \(\{p \in \mathbb{R}^{n!}: p \geq 0, \sum p = 1\}\), then the probability of there being a Condorcet cycle in that set with a supermajority of \(\tau\) is bounded above by \(n! \left( \frac{1 –  \tau}{0.4714}\right)^{n!}\).

From this they conclude that the critical threshold for a supermajority is around 54%, because this goes very rapidly towards zero.  e.g. with the 54% threshold the probability of a cycle in \(7\) candidates is \(< 10^{-52}\), which we can probably treat as adequately safe.

They also say “Within our setup, this makes the Condorcet cycles a theoretical curiosity without any practical bearing for super majority rules that exceed the threshold value of 53%”.

Unfortunately this conclusion is totally wrong (the bound is probably correct, but I’ll confess to not having fully followed the rather complicated calculations), because it neglects the fact that \(n\) can be small as well as large.

e.g. for \(n=3\), their upper bound isn’t even smaller than \(1\). It’s about 5.18. It then grows before dropping – at 4 it’s 13.34, at 5 it’s 6.36, and then finally at 6 is becomes small and is 1.6e-5.

So in order to find what the appropriate super majority threshold is for making the probability of a cycle adequately small (say under 1 in 1000) we need to manually check the cases of 3, 4 and 5.

Fortunately it’s easy to do that by direct simulation, so I wrote some code, and the 54% threshold is indeed the wrong one and fails to handle the three candidate case well.

About 6% of elections on three candidates have majority cycles, and of those about a quarter would still have a cycle with a 54% supermajority. The 99.9% mark is a 59% supermajority: Above that, fewer than one in a thousand profiles have a cycle. For four candidates, the 99.9% mark is 57%.

If we take the higher requirement of one in a million (which is well into the region where my simulation does not have sufficient fidelity to actually give a good answer) the thresholds become 64% and 61%.

That leaves the n=5 case. My simulation is a bit too inefficient (it has an O(n!) componetn in it) to run in reasonable time for n=5, but fortunately it doesn’t have to. The upper bound for the supermajorities the lower n required is now adequately low: For a 59% supermajority it’s 6e-6, for a 60% supermajority it’s 3e-7, so goes below the one in a million mark too.

So, in conclusion, if you require a 60% supermajority, and not just a 54% one, you are probably free of Condorcet cycles. If you go for the full 2/3 supermajority (which probably makes a lot of sense for serious cases) you’re almost certainly safe – you’re provably safe at n=3 and, provably have vanishingly low probability for n >= 5, and for n=4 a simulation says there’s well under a one in a million chance of getting a cycle.

Unless of course your population is drawn from a model that isn’t well represented by a uniform distribution (i.e. all populations ever) or the candidates are adversarially chosen to manipulate the electorate (i.e. all candidates ever). In those edge cases you might still have some problems.

This entry was posted in voting on by .

Programmer at large: How did people come up with this?

This is the latest chapter in my serial novel, Programmer at Large. If you’ve not been reading so far, the easiest place to read the archives are at the Archive of Our Own mirror.


“OK. Have you heard about gender?”

It didn’t ring a bell, and my HUD wasn’t indicating any sensible translation for the concept to ones I had, so probably not.

“I don’t think so?”

“OK. So you know how grounders often have all these weird appearance based caste systems?”

I nodded.

“Right, the skin colour thing.”

Now I felt bad that the first thing I’d noticed about them was their pale skin.

“Yeah, that’s the big one. Causes us no end of trouble. Anyway- argh, waste it. I’m sorry, can we put this on pause? I’m getting an alarm that I need to do another set before I cool down too much.”

I did a quick check to see if they were just blowing me off, but the alarm was legit, and the social cues system suggested they were perfectly comfortable up until the alarm annoyed them.

“Sure, no problem. I guess I should do another circuit or two too.”

“Great. This is my last set, so how about we meet up at the hot tub?”

I queried my program.

“Sure, I’m good to stop after this set too.”

I heaved myself up and broke into a run again, which gave me time to think things through.

That hadn’t gone… too badly. I fumbled my introduction something awful, but I think I recovered well enough. I then went straight for the personal questions, which wasn’t great but they seemed happy enough to discuss it and were generally a genial sort. All told, not too bad. Still plenty of scope to mess things up, but doing OK for a first meeting.

And the rest of the conversation would be had in a hot tub, which helps. It’s hard to get too stressed out in a hot tub.

Gravity is awful and I hate it, but the way water behaves in gravity is almost enough to redeem it. Water in zero gravity is a menace that you have to keep very well separated from any sort of free space, and swimming requires a breather mask. But in gravity it just… sits there. Or flows downhill. it’s pretty amazing, and it enables hot tubs, which are so much better than steam rooms.

Basically what I’m saying is I like hot tubs.

Which may have contributed to the fact that I only made it about one circuit before I decided I was entirely over running. I sent a message to the hot tub to prepare for occupancy and declared myself done with exercise for now when I next passed it.

The hot tubs we have are pretty great. They’re giant pools raised above the base of the ring off to one side of the track. You could easily fit 20 Crew in one, so Kimiko and I were going to be practically lost in it unless anyone else decided to join us.

I stripped off, threw the dirty clothing into the refresher, and showered. Showers are better in zero gravity, but I’ll admit they’re a lot easier in gravity.

Once I was suitably clean I eased myself into the gently steaming water and stopped thinking for a couple hundred seconds.

Eventually I was roused by Kimiko.

“Mind if I join you?”

They sounded amused.

I cracked open an eye and realised I’d drifted out to the center of the pool and was floating splayed out on my back. There was no way for me to take up the whole tub, but I was doing my level best.

I blushed – though between my skin colour and heat it probably didn’t show much – and scrambled to a slightly more civilized position on the ledge at the side of the pool.

“Sorry.”

They waved a hand dismissively.

“Not a problem.”

I looked them up and down as they got into the pool. I hadn’t really being paying attention earlier so I hadn’t noticed but they were really impressively muscular. Not the sort of grotesquely huge muscles you sometimes saw on grounder shows, but way more athletic than I’d ever seen on a Crew member before. I guess that’s what doing extensive calisthenics in high gravity will do to you, but wow.

I wondered why they bothered. It’s not like strength is much use to us.

They sat down on the shelf next to me, leaned into my shoulder slightly and closed their eyes.

We sat in silence for a while. I figured I should give them, and me, some time to relax before we started talking again.

Eventually they opened their eyes.

“So, I was telling you about gender. Still interested?”

“Sure.”

“OK, so, as well as these skin colour based distinctions, most grounder societies also have another way they split people up, which they call gender. Usually there are two genders, sometimes three or four, but the common genders everyone seems to have are men and women.

“That makes sense so far. So what determines your gender?”

“Well it depends on where you are. But basically men have beards and women have big breasts.”

I turned to look at them. Apparently not joking, but it can’t hurt to check.

“You’re kidding?”

“No, honest. Beards and big breasts.”

“That doesn’t make any sense. I’ve seen grounders. Most of them don’t have either, and some of them have both!”

They shrugged.

“Yeah, it doesn’t really make sense. We’re pretty sure about the beards and big breasts thing, but the rest is all a bit blurry. Supposedly it’s meant to be about genitals – the men have a penis and the women have a vulva – but that doesn’t really explain the other genders, and the grounders I’ve met seemed perfectly happy to accept me as a man without checking if I had a penis or not.”

I resolved not to keep asking if they were kidding, but couldn’t resist the urge to check if they had a penis (they didn’t). I changed tack slightly instead.

“But what is gender for exactly?”

“Well it’s all a bit culturally dependent, but it’s primarily a status marker. Men are usually high status.”

“So you have a beard to show you’re… high status?”

I didn’t entirely succeed at keeping the disgust from my voice.

“Gah. No! Definitely no. I just like the beard.”

I relaxed slightly.

“Anyway, even if I wanted to use it that way it wouldn’t work. Most of the Crew learn about beards from Lesbian Space Pirates, where the men are low status.”

I pinched the bridge of my nose and groaned.

“So men may or may not have penises or beards and may or may not be low status but it’s clear to everyone except us which ones are men and we’re not really sure why?”

“Pretty much.”

“This seems like a complete mess. How did people come up with this?”

They shrugged.

“Archaic reproductive methods. If you don’t have uterine replicators, only people with a natural uterus can bear children.”

I winced and clutched my stomach slightly. No thank you.

They continued.

“Between that and the correlated biological differences you’ve got enough of a split to create one of those arbitrary social divides grounders like, which then mutates and changes over time, and tends to stick around even after they get proper reproductive methods. Like the charter says, social structures never die unless you kill them off.”

“This all sounds terrible.”

“Yeah, it’s pretty bad. Grounders, eh? Still, the beard looks good, don’t you think?”

“I’m still getting used to it to be honest.”

“It’ll grow on you.”

“Maybe, but I don’t think I’ll be getting one myself.”

They looked me up and down.

“Not unless you feel like really confusing some grounders, no. Which is always funny, but they get a bit touchy and violent about this one, so maybe best not if you plan to go ground side.”

“If they’re going to declare me low status just because of my breast size I’m not sure I want to!”

“Eh, you’d mostly be protected by being Crew. Grounders are intimidated enough by us that they tend to give us a high status by default, unless you’re in one of the really bad places.”

“Ugh. More status rules? How do you keep track of all this?”

“Oh, we don’t. We just do it in software and tag people with the relevant information. It’s healthier than trying to internalise whatever ridiculous rules the local grounders feel like playing by.”

“Oh. That’s not so bad I guess.”

In fact, it’s more or less exactly how I navigate Crew social conventions. We don’t have status hierarchies like that, but what we do have is just as complicated, and I think I probably have two or three times as much annotation as normal people do.

“Yeah, it’s annoying at first but you get used to it pretty quickly.”

No signs of that after more than a gigasecond so far, sadly.

We lapsed into silence again. I think they picked up that this was a bit of a touchy subject for me.

“So, if you don’t watch Lesbian Space Pirates, what do you watch?”

“Uh, not much. I’m not big on passive media. I play a lot of Evolve instead.”

“Oh, the graph theory game?”

“Well it’s really more of an abstract strategy game where the core playing field is graph theory. I tend to play it through the topological view.”

“Right. What’s the appeal? I tried it for a while but I couldn’t really get into it. It’s very slow moving.”

“That’s the appeal! I know a lot of people who I’m not shift synchronized with, and it’s really good for long play, so it makes a good way of staying in touch across shifts.”

After that the conversation became more casual – the games we played, the people we knew, etc. Nothing of great consequence, just the sort of social maintenance you use when sounding out someone new.

After another kilosecond or so, Kimiko had to go.

“Arthur, it’s been great meeting you, but I think I’m going to go get some food and sleep. Want to join me?”

I thought about it for a moment. I couldn’t tell if it was a genuine invitation or just politeness. I decided to play it safe.

“No thanks, I think I’m going to float around here for a little longer. See you soon, though?”

“Absolutely. We still have to talk about that bug if nothing else.”

We hugged and cheek kissed goodbye – the beard was scratchier than I expected, point against it.

I watched them as they blow dried themself. They really were very muscular, and I hadn’t quite decided yet whether it looked good or ridiculous. It was still impressive, either way.

They finished dressing and putting their hair back up and we waved a final goodbye.

Which left me alone in the hot tub. Honestly, I’d probably been here long enough – I was starting to prune – but I was going to need some food after this and we’re only running one dining room at the moment, so it would be awkward to leave now.

Oh well, more floating time. I pushed back off to the center of the pool and drifted for a while.

All told, I thought that went quite well. Positive first meeting of a fellow crew member. Possibly even a new friend. My HUD certainly thought so and was giving me all sorts of encouraging feedback.

They were a bit odd, granted, but I’m hardly one to point fingers there.

Rather than mull on all of the things I could have done better, I idly flicked through the news feeds to see what else was going on on the ship and whether we’d picked up any interesting new data from the larger network.

After a while, the heat of the pool and the earlier exercise got the better of me, and I drifted off to sleep.


Next chapter: Who wrote this?

This chapter is also mirrored at Archive of Our Own. Updates are currently every other Friday. Patreon backers get access to early drafts of upcoming chapters.

This entry was posted in Fiction, Programmer at Large on by .

Adding a threshold to random ballot

Content warning: Democracy.

I make no secret of the fact that I like random ballot.

One of the first things almost everyone who doesn’t instantly reject the idea suggests is that they’d be more comfortable with random ballot if it had a threshold to keep the really fringe people out – say anyone who would lose their deposit under the current UK system because  they have fewer than 5% of the vote is disqualified and can’t elected. The idea being that although it’s a low-probability event, you’re going to still get a sizable minority of MPs elected with less than 5% of the vote (about \(\frac{1}{20}\) of them in fact, so in the UK around 32.5), which is pretty unfair to those constituents (in a way that apparently getting elected with only 25% of the vote is not).

My major problems with this proposal are:

  1. It distorts the proportionality properties of random ballot
  2. It disenfranchises anyone who votes for someone who failed to reach threshold – which may be a lot more than that fraction of the population if there are multiple minority candidates!
  3. It takes a tactics free voting system and reintroduces something that looks like a pretty classic spoiler vote

How much I care about each of these points varies:

  1. I don’t really have a problem with this as long as the threshold is low enough (and 5% is) – proportional representation is probably better when you restrict the candidates to the subset of the possible representatives that aren’t universally reviled.
  2. This is bad and to be avoided to the greatest degree possible.
  3. Eh, if I’m going to insist on tactics free voting my options are pretty limited. It would be nice to reduce the amount of tactical voting though.

I think I’ve figured out a patch which fixes the second and mitigates the third. The result is a curious hybrid system that is somewhere between random ballot, approval voting and instant runoff voting, but I actually think it works pretty well and, complexity aside, might be strictly better than random ballot for a lot of use cases.

The key idea is this: We make it a ranked system, where your ranked vote is interpreted to mean that you wish your vote to be cast for the highest ranked candidate on your ballot that has not been disqualified for failing to clear the threshold. Voters neither can nor should rank every candidate – the only candidates that appear on their ballots should be ones they’d be happy to vote for.

We then arrange a set of qualified candidates so that every candidate who qualifies has at least a threshold worth of votes cast for them under the above interpretation.

The system works as follows.

Once the ballots have been counted, the threshold is calculated as a number of ballots. This is now fixed and does not get updated throughout the process.

Each candidate is marked as either “uncertain”, “disqualified”, or “eligible”. Every candidate starts as uncertain. Each ballot is marked as either “available”, “withdrawn” or cast”. Every ballot starts as available.

A ballot approves a candidate if the candidate appears on the vote’s ballot. A ballot prefers a candidate if it approves them and every candidate it ranks more highly is disqualified.

Ballot counting now proceeds in a number of rounds until there are no more available votes. A round proceeds as follows:

  1. Every uncertain candidate which is preferred by more than a threshold worth of ballots is marked as eligible.
  2. Every available ballot whose preferred candidate is eligible is marked as cast. This includes candidates marked eligible in previous rounds.
  3. Every uncertain candidate that is approved by fewer than a threshold worth of available ballots is marked as disqualified
  4. If no uncertain candidate has changed its state this round (either by being marked eligible or disqualified), the candidate with the fewest approvals is marked as disqualified. In the event of a tie, the candidate that is preferred by fewer available votes drops out. If that is still a tie, break it arbitrarily.
  5. Every available ballot whose approved candidates have all been disqualified is marked as withdrawn.

At the end, we have a set of eligible candidates and every vote is either cast or withdrawn after they could not collaborate with enough other people to get a candidate they liked elected. Random ballot is now performed with the cast ballots, with the preferred candidate of the selected ballot being elected.

Example

Let’s see how this might play out in practice. We’ll use an unrealistically high threshold of 40% so that we can see how it works properly without requiring a large number of candidates.

Suppose we’ve got four candidates: Alex, Charlie, Kim and Pat. The votes go as follows:

  • 40% vote Alex
  • 25% vote Charlie
  • 20% vote Kim, Charlie
  • 10% vote Charlie, Kim
  • 4% vote Pat, Alex
  • 1% vote Pat

The vote proceeds as follows:

  1. Alex is marked as elected and the 40% go to them.
  2. Pat does not have enough approvals to meet the threshold, so drops out.
  3. The 4% Pat, Alex ballots are now cast for Alex who now has 44%
  4. The 1% Pat voters are marked as withdrawn
  5. Both Kim and Charlie have enough approvals to make the threshold, but neither meets the threshold, so the one with the lowest approvals (which is Kim at 30% vs Charlie’s 55%) drops out.
  6. All remaining available votes prefer Charlie, so Charlie is marked as eligible with 55% of the vote.

So at the end, Alex has 44% of the vote, Charlie has 55%, and the remaining 1% of the votes are wasted. This means that Charlie wins with probability \(\frac{55}{99} = \frac{5}{9} \approx 56%\).

Reasoning

The core idea is to select a set of candidates that you can just use normal random ballot on and have everyone automatically clear the threshold. The election procedure is essentially a greedy algorithm to achieve that.

The motivation behind using approval voting for the selection stage is that immediately dropping out all candidates who aren’t approved by at least of threshold is always the correct thing to do: They can no longer get elected (because there aren’t enough votes that could be cast for them to clear the threshold), so having them stay in the game just leaves them as potential spoilers.

Once you have that, using approval voting to decide dropouts is also the obvious thing to do: The candidate that drops out is the one most likely to drop out later anyway. Approval voting is also a much more stable metric to use for deciding dropouts than preferences.

A simpler version of this system might be to simply mark everyone who gets at least a threshold worth of approvals as eligible. The major reasons I don’t like that is that you can get candidates through who have fewer than a threshold of votes cast for them (which isn’t awful but feels non-ideal) and that it brings in additional tactical elements about whether or not to approve candidates (because if you don’t their votes might go to you).

Evaluation

(Note: This is an informal evaluation. I haven’t actually done simulations to verify this yet)

I think this works pretty well. It mostly works out like random ballot with quality control on which candidates actually run.

The only people who are disenfranchised entirely by this system are the people who so hate any candidate that at least the critical threshold of the voting population can tolerate that they can’t bring themselves to vote for that candidate. Note that this is a much smaller set of people than those who merely prefer every such candidate to any mainstream one. I… probably should still have a problem with disenfranchising these people, but I can’t really bring myself to right now. So I’m calling that a win for point 2.

 

So what does tactical voting look like under this system?

I think it’s pretty mild.

First off, there is never any point in ranking candidates in an order other than your true preferences. The only time that your rank order matters between two candidates is when both of them are eligible, so the usual argument for random ballot holds. So the only question of tactics is who to include on your ballot.

There’s also absolutely no reason to ever omit your favourite candidate from your ballot.

If you’re comfortable that your favourite candidate will be eligible, you can stop there. In general as soon as you’ve ranked a candidate who you’re sure will make threshold there’s no point in going further.

In particular this means that everyone who would under random ballot cast their vote for a candidate who makes it past the threshold of first choices doesn’t need to worry about the rest of the system and can just comfortably cast a single vote as per normal. If every candidate clears the threshold then this system then just reduces to random ballot.

Tactics then consist of essentially jockeying a bit to try to get minority candidates you like who haven’t quite cleared the threshold to get enough votes so that they can be eligible.

There’s no point in omitting a candidate from your ballot when you’re certain they’re going to make threshold on first choice votes before anything else of import has happened. Once a candidate has been made eligible their presence or absence on your ballot has no impact one way or the other. So generally tactical voters will pick a backstop candidate who they’re sure is threshold clearing as their preferred mainstream option if all else fails.

think the same mostly applies for anyone you’re morally certain is going to clear the threshold before anyone you prefer to them. This isn’t perfectly true – it may be in your interests to keep them in the running for longer so you get to keep counting their approvals for candidates you like to stop said candidates dropping out. This is a sufficiently hard to predict move that I don’t expect it to have much practical impact.

So the only real question of tactics is which other fringe candidates you approve. Basically what you want to do is ensure that your lower ranked preferences don’t end up competing with your higher ranked preferences for who gets disqualified.

So, in other words, the easiest tactical vote consists of casting your ballot as follows:

  1. Always rank your favourite candidate first.
  2. Repeatedly add your favourite candidate of the candidates who are “significantly more mainstream” than the ones you’ve added so far.
  3. Keep adding candidates as per above until you’ve added one you are certain will meet threshold.

I would be surprised if many voters bothered to rank more than three candidates – their actual favourite, their second choice who might make threshold, and the mainstreamish candidate (for values of mainstream that mean “have the support of at least 5% of the population”) who they like most.

So essentially the tactical voting creates a bias towards voting for making your subsequent preferences a bit more mainstream than you otherwise would.

But that’s what the system is intended to do anyway. So if that’s genuinely all the tactical bias there is, if anything it’s a feature.

Assuming I’ve not missed some critical flaw under tactical voting, all told I think this is a pretty neat system which nicely balances the proportionality properties of random ballot with a mild mainstream bias.

Applications

This should be applicable most places where random ballot is, assuming the complexity cost is not prohibitive (e.g. you probably don’t want to use this for deciding where to go for lunch).

Where the additional complexity isn’t a problem (e.g. because vote tallying is computerised anyway) I think I might actually just flat out prefer this system to random ballot – it’s a good way to curb some of its variance while still retaining most of its desirable properties.

This is particularly true for cases where random ballot is not currently entirely suitable (because e.g. the stakes are very high and/or the number of repetitions of the vote are quite low) – if you were to push the threshold up to somewhere more in the region of 40% it might even become suitable for more presidential style single winner elections (or, at least, not obviously less suitable than the current systems).

It might also be an interesting procedure for other allocation problems. There’s nothing intrinsic to random ballot for most of it – it’s just a question of how to divide up numbers so that they add up to 100% (in the event that nobody withdraws their vote) and each meet a threshold. You could just as easily use it for e.g. deciding how to distribute money between a number of projects or charities.

This entry was posted in voting on by .