Ordering and mapping regular languages

I’ve had the following question for a while: How do I create a mapping of keys to values where the keys are regular expressions, and two regular expressions are considered equivalent if they correspond to the same language?

An example of why you might want to do this: when constructing a minimal deterministic finite automaton for a regular language, you end up labelling states by regular expressions that represent the language matched when starting from that state. For the automaton to be minimal, any two equivalent regular expressions must correspond to the same state, so you need a way of taking a new regular expression and finding out which state it should go to.

It’s easy (if potentially expensive) to test regular expression equivalence once you know how, so the naive implementation of this is just to do a linear scan of all the regular expressions you’ve found so far. It’s O(n) lookup but it’s at least an existence proof.
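
In code, the naive version might look something like this (a sketch; the equivalence test is passed in as a parameter, standing in for the algorithms from the previous post):

def naive_lookup(entries, regex, equivalent):
    # entries: a list of (regex, value) pairs accumulated so far.
    # equivalent: a predicate testing language equivalence of two regexes.
    for key, value in entries:
        if equivalent(key, regex):
            return value
    raise KeyError(regex)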

In the past I’ve ended up implementing a crude hash function for regular languages and then just used a hash table. It works, but collisions are common unless you’re quite clever with your hashing, so it doesn’t work well.

But it turns out that there is a better way! Rather than using hashed data structures you can use ordered ones, because there is a natural total ordering over the set of regular languages that is easy to compute (or at least not substantially harder than testing equivalence).

The ordering is this: if you have two regular languages \(L\) and \(M\) that are not equivalent, there is some \(x \in L \triangle M\), their symmetric difference. That is, we can find an \(x\) which is in one but not the other. Let \(x\) be the shortlex minimal such word (i.e. the lexicographically first word amongst those of minimal length). Then \(L < M\) if \(x \in L\), else \(M < L\).
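
To make this concrete, here’s a sketch of computing the ordering when languages are given as complete DFAs. The representation is an assumption of the sketch (transitions as a dict from (state, symbol) to state, a set of accepting states, initial state 0). A breadth-first search over the product automaton, trying symbols in sorted order, visits product states in shortlex order of their minimal access words, so the first state where exactly one side accepts yields the shortlex minimal element of the symmetric difference:

from collections import deque

def shortlex_min_difference(d1, d2, alphabet):
    # Each DFA is a pair (transitions, accepting) with initial state 0.
    # Returns the shortlex minimal word in the symmetric difference of
    # the two languages, or None if they are equivalent.
    queue = deque([((0, 0), ())])
    seen = {(0, 0)}
    while queue:
        (s1, s2), word = queue.popleft()
        if (s1 in d1[1]) != (s2 in d2[1]):
            return word  # accepted by exactly one of the two
        for c in sorted(alphabet):
            t = (d1[0][(s1, c)], d2[0][(s2, c)])
            if t not in seen:
                seen.add(t)
                queue.append((t, word + (c,)))
    return None

def compare_languages(d1, d2, alphabet):
    # -1 if L(d1) < L(d2), 0 if they are equivalent, 1 otherwise.
    x = shortlex_min_difference(d1, d2, alphabet)
    if x is None:
        return 0
    s = 0
    for c in x:  # run x through d1: if d1 accepts it, L(d1) sorts first
        s = d1[0][(s, c)]
    return -1 if s in d1[1] else 1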

The work in the previous post on regular language equivalence is thus enough to calculate the shortlex minimal element of an inequivalent pair of languages (though I still don’t know whether the faster of the two algorithms gives the minimal one; you can, however, use the fast algorithm for equivalence checking and then the slightly slower one to get a minimal refutation), so we can readily compute this ordering between two regular expressions. This, combined with any sort of ordered collection type (e.g. a balanced binary tree of some sort), gives us our desired mapping.
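
Python doesn’t have a balanced binary tree in its standard library, but here’s a minimal sketch of the mapping (assuming compare_languages from above) built on a sorted list with bisection, which keeps the number of comparisons per operation down to O(log n):

import bisect
from functools import cmp_to_key

class LanguageMap(object):
    def __init__(self, alphabet):
        # Wrap DFAs in key objects so that bisect can compare them.
        self.__key = cmp_to_key(
            lambda a, b: compare_languages(a, b, alphabet))
        self.__keys = []    # wrapped DFAs, kept sorted
        self.__values = []  # values, parallel to __keys

    def setdefault(self, dfa, value):
        # Return the value stored against an equivalent language if
        # there is one, else insert and return the new value.
        k = self.__key(dfa)
        i = bisect.bisect_left(self.__keys, k)
        if i < len(self.__keys) and self.__keys[i] == k:
            return self.__values[i]
        self.__keys.insert(i, k)
        self.__values.insert(i, value)
        return value

Insertion still shifts the list in O(n) time, but it’s the comparisons that are expensive here, and those stay logarithmic; a real balanced tree would fix the shifting too.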

But why does this definition provide a total order?

Well, consider the enumeration of all words in increasing shortlex order as \(w_0, \ldots, w_n, \ldots\). Let \(l_n = 1\) if \(w_n \in L\), else \(l_n = 0\). Define \(m_n\) similarly for \(M\).

Then the above definition is equivalent to the reverse of the lexicographical ordering between \(l\) and \(m\)! If \(w_k\) is the smallest word in the symmetric difference then \(k\) is the first index at which \(l\) and \(m\) differ. If \(w_k \in L\) then \(l_k = 1\) and \(m_k = 0\), so \(l > m\) lexicographically, and vice versa. The lexicographical order is a total order, and the reverse of a total order is a total order, so the above definition is also a total order.

This definition has a number of nice properties:

  • Any language containing the empty word sorts before any language that doesn’t
  • The function \(L \to \overline{L}\) is order reversing.
  • \(L \cup M \leq L, M \leq L \cap M\) (supersets sort before subsets: the shortlex minimal difference between a language and a superset of it always lies in the superset).

I originally thought that union was coordinate-wise monotonic, but it’s not. Suppose we have four words \(a < b < c < d\), and consider the languages \(L = \{a, d\}, M = \{b, c\}\). Then \(L < M\) because the deciding value is \(a\). But now consider \(P = \{a, b\}\). Then \(L \cup P > M \cup P\) because the deciding element now ends up being \(c\), which is in \(M\).
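
For finite languages you can check the counterexample directly by taking symmetric differences of sets (a quick sanity check, not from the post):

def shortlex(w):
    return (len(w), w)

def finite_cmp(L, M):
    # The ordering specialised to finite languages given as sets of words.
    diff = L ^ M
    if not diff:
        return 0
    return -1 if min(diff, key=shortlex) in L else 1

L, M, P = {"a", "d"}, {"b", "c"}, {"a", "b"}
assert finite_cmp(L, M) == -1          # L < M, decided by "a"
assert finite_cmp(L | P, M | P) == 1   # but L ∪ P > M ∪ P, decided by "c"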

I’ve yet to try this in practice, so it might turn out that there are some interestingly pathological failure modes for this comparison function, but I don’t think there are likely to be any that aren’t also present in testing regular expression equivalence itself.

Another open question here is which sorted map data structure to use. The comparison is relatively expensive, so it might be worth putting in a bit of extra effort to keep the tree well balanced and minimise the number of comparisons. As such, an AVL tree might be a fairly reasonable choice. I’m not sure.


Want more blog posts like this? Join the 30 others who are supporting my writing on Patreon!


An epistemic vicious circle

Let’s start with an apology: this blog post will not contain any concrete examples of what I want to talk about. Please don’t ask me to give examples. I will also moderate out any concrete examples in the comments. Sorry.

Hopefully the reasons for this will become clear and you can fill in the blanks with examples from your own experience.

There’s a pattern I’ve been noticing for a while, but it happens that three separate examples of it came up recently (only one of which involved me directly).

Suppose there are two groups. Let’s call them the Eagles and the Rattlers. Suppose further that the two groups are roughly evenly split.

Now suppose there is some action, or fact, on which people disagree. Let’s call the two positions blue and orange.

One thing is clear: If you are a Rattler, you prefer orange.

If you are an Eagle however, opinions are somewhat divided. Maybe due to differing values, or different experiences, or differing levels of having thought about the problem. It doesn’t matter. All that matters is that there is a split of opinions, and it doesn’t skew too heavily orange. Let’s say it’s 50/50 to start off with.

Now, suppose you encounter someone you don’t know and they are advocating for orange. What do you assume?

Well, it’s pretty likely that they’re a Rattler, right? 100% of Rattlers like orange, and 50% of Eagles do, so there’s a two thirds chance that a randomly picked orange advocate will be Rattler. Bayes’ theorem in action, but most people are capable of doing this one intuitively.
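
Spelling the arithmetic out, with the two groups equally common:

\(P(\text{Rattler} \mid \text{orange}) = \frac{1 \cdot \frac{1}{2}}{1 \cdot \frac{1}{2} + \frac{1}{2} \cdot \frac{1}{2}} = \frac{2}{3}\)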

And thus if you happen to be an Eagle who likes orange, you have to put in extra effort every time the subject comes up to demonstrate that. It’ll usually work – the evidence against you isn’t that strong – but sometimes you’ll run into someone who feels really strongly about the blue/orange divide and be unable to convince them that you want orange for purely virtuous reasons. Even when it’s not that bad it adds extra friction to the interaction.

And that means that if you don’t care that much about the blue/orange split you’ll just… stop talking about it. It’s not worth the extra effort, so when the subject comes up you’ll just smile and nod or change it.

Which, of course, brings down the percentage of Eagles you hear advocating for orange.

So now if you encounter an orange advocate they’re more likely to be Rattler. Say 70% chance.

Which in turn raises the amount of effort required to demonstrate that you, the virtuous orange advocate, are not in fact Rattler. Which raises the threshold of how much you have to care about the issue, which reduces the fraction of Eagles who talk in favour of orange, which raises the chance that an orange advocate is actually Rattler, etc.

The result is that when the other side is united on an issue and your side is divided, you effectively cede an option to the other side: eventually the evidence that someone advocating for that option is a Rattler is so overwhelming that the only people who continue to argue the cause are weird niche people with some particularly strong reason for advocating orange despite being Eagles.

And they’re weird and niche, so we don’t mind ostracising them and considering them honorary Rattlers (the real Rattlers hate them too of course, because they still look like Eagles by some other criteria).

As you can probably infer from the fact that I’m writing this post, I think this scenario is bad.

It’s bad for a number of reasons, but one very simple reason dominates for me: Sometimes Rattlers are right (usually, but not always, for the wrong reasons).

I think this most often happens when the two groups are split over some value: Eagles care strongly about it, while Rattlers don’t care about it either way (and vice versa). Thus the disagreement between Rattlers and Eagles is of a fundamentally different character: blue is obviously detrimental to the Rattlers’ values, so they’re in favour of orange. Meanwhile the Eagles have a legitimate disagreement, not over whether those values are good, but over the empirical claim of whether blue or orange better serves those values.

Reality is complicated, and complex systems behave in non-obvious ways. Often the obviously virtuous action has unfortunate perverse side effects that you didn’t anticipate. If you have ceded the ground to your opponent before you discover those side effects, you have now bound your hands and are unable to take what now turns out to be the correct path because only a Rattler would suggest that.

I do not have a good suggestion for how to solve this problem, except maybe to spend less time having conversations about controversial subjects with people whose virtue you are unsure of and to treat those you do have with more charity. A secondary implication of this suggestion is to spend less time on Twitter.

But I do think a good start is to be aware of this problem, notice when it is starting to happen, and explicitly call it out and point out that this is an issue that Eagles can disagree on. It won’t solve the problem of ground already ceded, but it will at least help to stop things getting worse.


Like my writing? Why not support me on Patreon! Especially if you want more posts like this one, because I mostly don’t write this sort of thing any more if I can possibly help it, but might start doing so again given a nudge from patrons.

A worked example of designing an unusual data structure

Due to reasons, I found myself in need of a data structure supporting a slightly unusual combination of operations. Developing it involved a fairly straightforward process of refinement, and a number of interesting tricks, so I thought it might be informative to people to walk through (a somewhat stylised version of) the design.

The data structure is a particular type of random sampler, starting from a shared array of values (possibly containing duplicates). Values are hashable and comparable for equality.

It needs to support the following operations:

  1. Initialise from a random number generator and a shared immutable array of values so that it holds all those values.
  2. Sample an element uniformly at random from the remaining values, or raise an error if there are none.
  3. Unconditionally (i.e. without checking whether it’s present) remove all instances of a value from the list.

The actual data structure I want is a bit more complicated than that, but those are enough to demonstrate the basic principles.

What’s surprising is that you can do all of these operations in amortised O(1). This includes the initialisation from a list of n values!

The idea behind designing this is to start with the most natural data structure that doesn’t achieve these bounds and then try to refine it until it does. That data structure is a resizable array. You can sample uniformly by just picking an index into the array. You can delete by doing a scan and removing every element that is equal to the value. This means you have to be able to mutate the array, so initialising it requires copying.

Which means it’s time for some code.

First, let’s write a test suite for this data structure:

from collections import Counter
 
from hypothesis.stateful import RuleBasedStateMachine, rule, precondition
import hypothesis.strategies as st
 
from sampler import Sampler
 
 
class FakeRandom(object):
    def __init__(self, data):
        self.__data = data
 
    def randint(self, m, n):
        return self.__data.draw(st.integers(m, n), label="randint(%d, %d)" % (
            m, n
        ))
 
 
class SamplerRules(RuleBasedStateMachine):
    def __init__(self):
        super(SamplerRules, self).__init__()
        self.__initialized = False
 
    @precondition(lambda self: not self.__initialized)
    @rule(
        values=st.lists(st.integers()).map(tuple),
        data=st.data()
    )
    def initialize(self, values, data):
        self.__initial_values = values
        self.__sampler = Sampler(values, FakeRandom(data))
        self.__counts = Counter(values)
        self.__initialized = True
 
    @precondition(lambda self: self.__initialized)
    @rule()
    def sample(self):
        if sum(self.__counts.values()) != 0:
            v = self.__sampler.sample()
            assert self.__counts[v] != 0
        else:
            try:
                self.__sampler.sample()
                assert False, "Should have raised"
            except IndexError:
                pass
 
    @precondition(lambda self: self.__initialized and self.__initial_values)
    @rule(data=st.data())
    def remove(self, data):
        # sampled_from requires a non-empty sequence, hence the precondition.
        v = data.draw(st.sampled_from(self.__initial_values))
        self.__sampler.remove(v)
        self.__counts[v] = 0
 
TestSampler = SamplerRules.TestCase

This uses Hypothesis’s rule based stateful testing to completely describe the valid range of behaviour of the data structure. There are a number of interesting and possibly non-obvious details in there, but this is a data structures post rather than a Hypothesis post, so I’m just going to gloss over them and invite you to peruse the tests in more detail at your leisure if you’re interested.

Now let’s look at an implementation of this, using the approach described above:

class Sampler(object):
    def __init__(self, values, random):
        self.__values = list(values)
        self.__random = random
 
    def sample(self):
        if not self.__values:
            raise IndexError("Cannot sample from empty list")
        i = self.__random.randint(0, len(self.__values) - 1)
        return self.__values[i]
 
    def remove(self, value):
        self.__values = [v for v in self.__values if v != value]

The test suite passes, so we’ve successfully implemented the operations (or our bugs are too subtle for Hypothesis to find in a couple of seconds). Hurrah.

But we’ve not really achieved our goals: Sampling is O(1), sure, but remove and initialisation are both O(n). How can we fix that?

The idea is to incrementally patch up this data structure by finding the things that make it O(n) and seeing if we can defer the cost for each element until we actually definitely need to incur that cost to get the correct result.

Let’s start by fixing removal.

The first key observation is that if we don’t care about the order of values in a list (and we don’t, because we only access it through random sampling), we can remove the element at an index in O(1): pop the element at the end of the list and overwrite the index with that value (if it wasn’t the last index). This is the approach normally taken if you want to implement random sampling without replacement, but in our use case we’ve separated removal from sampling, so it’s not quite so easy.
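
In code, the swap-and-pop trick for removing at a known index looks like this (an illustrative fragment):

def remove_index(values, i):
    # O(1) removal when we don't care about order: overwrite index i
    # with the last element and shrink the list by one.
    last = values.pop()
    if i < len(values):
        values[i] = last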

The problem is that we don’t know where (or even if) the value we want to delete is in our array, so we still have to do an O(n) scan to find it.

One solution to this problem (which is an entirely valid one) is to have a mapping of values to the indexes they are found at. This is a little tricky to get right in the presence of duplicates, but it’s an entirely workable solution. However, it makes it much harder to do our O(1) initialise later, so we’ll not go down this route.

Instead the idea is to defer the deletion until we know of an index for it, which we can do during our sampling! We keep a set of values that have been deleted and, if we end up sampling one of them, we remove it from the list at that index and sample again.

This means that we potentially pay an additional O(deletions) cost each time we sample, but these costs are “queued up” from the previous delete calls, and once incurred do not repeat, so this doesn’t break our claim of O(1) amortised time – the costs we pay on sampling are just one-off deferred costs from earlier.

Here’s some code:

class Sampler(object):
    def __init__(self, values, random):
        self.__values = list(values)
        self.__random = random
        self.__deletions = set()
 
    def sample(self):
        while True:
            if not self.__values:
                raise IndexError("Cannot sample from empty list")
            i = self.__random.randint(0, len(self.__values) - 1)
            v = self.__values[i]
            if v in self.__deletions:
                # Deferred deletion: we've found an index for a deleted
                # value, so swap-and-pop it out and sample again.
                replacement = self.__values.pop()
                if i != len(self.__values):
                    self.__values[i] = replacement
            else:
                return v
 
    def remove(self, value):
        self.__deletions.add(value)

So now we’re almost done. All we have to do is figure out some way to create a mutable version of our immutable list in O(1).

This sounds impossible but turns out to be surprisingly easy.

The idea is to create a mask in front of our immutable sequence, which tracks a length and a mapping of indices to values. Whenever we write to the mutable “copy” we write to the mask. Whenever we read from the copy, we first check that it’s in bounds and if it is we read from the mask. If the index is not present in the mask we read from the original sequence.

The result is essentially a sort of very fine grained copy on write – we never have to look at the whole sequence, only the bits that we are reading from, so we never have to do O(n) work.

Here’s some code:

class Sampler(object):
    def __init__(self, values, random):
        # The shared immutable sequence. We never copy or modify it.
        self.__values = values

        # Logical length of our mutable "copy", plus a mask of
        # index -> value for the indices we have overwritten.
        self.__length = len(values)
        self.__mask = {}
        self.__random = random
        self.__deletions = set()
 
    def sample(self):
        while True:
            if not self.__length:
                raise IndexError("Cannot sample from empty list")
            i = self.__random.randint(0, self.__length - 1)
            try:
                v = self.__mask[i]
            except KeyError:
                v = self.__values[i]
            if v in self.__deletions:
                # Swap-and-pop again, but through the mask: read the last
                # logical element (masked or original), shrink the length,
                # and record the overwrite in the mask.
                j = self.__length - 1
                try:
                    replacement = self.__mask.pop(j)
                except KeyError:
                    replacement = self.__values[j]
                self.__length = j
                if i != j:
                    self.__mask[i] = replacement
            else:
                return v
 
    def remove(self, value):
        self.__deletions.add(value)

And that’s it, we’re done!
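
For illustration, here’s a hypothetical usage sketch of the finished version:

import random

values = (1, 2, 2, 3)  # the shared immutable array; never copied
sampler = Sampler(values, random.Random(0))
sampler.remove(2)        # O(1): just records the deletion
print(sampler.sample())  # prints 1 or 3; the 2s can never be returned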

There are more optimisations we could do here – e.g. the masking trick is relatively expensive, so it might make sense to switch back to a mutable array once we’ve masked off the entirety of the original one, perhaps using a representation akin to the pypy dict implementation and throwing away the hash table part once the value array is at full length.

But that would only improve the constants (you can’t get better than O(1) asymptotically!), so I’d be loath to take on the increased complexity until I saw a real world workload where this was the bottleneck (which I’m expecting to at some point if this idea bears fruit, but I don’t yet know if it will). We’ve got the asymptotics I wanted, so let’s stop there while the implementation is fairly simple.

I’ve yet to actually use this in practice, but I’m still really pleased with the design of this thing. Starting from a fairly naive initial implementation, we’ve used some fairly generic tricks to patch up what started out as O(n) costs and turn them into O(1). As well as everything dropping out nicely, a lot of these techniques are probably reusable for other things (the masking trick in particular is highly generic).

Update 09/4/2017: An earlier version of this claimed that this solution allowed you to remove a single instance of a value from the list. A comment below correctly pointed out that that approach biases the distribution, so I’ve updated the post and the code to remove all instances of the value from the list instead. Fortunately, in my original use case the values are all distinct anyway, so the distinction doesn’t matter there.


Do you like data structures? Of course you do! Who doesn’t like data structures? Would you like more data structures? Naturally. So why not sign up for my Patreon and tell me so, so you can get more exciting blog posts like this! You’ll get access to drafts of upcoming blog posts, a slightly increased blogging rate from me, and the warm fuzzy feeling of supporting someone whose writing you enjoy.


Programmer at Large: Didn’t you notice?

This is the latest chapter in my web serial, Programmer at Large. The first chapter is here and you can read the whole archives here or on the Archive of Our Own mirror. This chapter is also mirrored at Archive of Our Own.


As usual, I paused at the entry to the common area to scan the room.

It was incredibly quiet. Apparently I’d got out of sync with the normal meal rhythm of the ship again. Oh well, too bad.

Sam was there, along with someone else my HUD told me was named Brian. Apparently I’d met them twice before, but I had no memory of that at all. My notes for them said “Likes to talk about game theory”.

Sam waved a greeting at me as I passed. I waved back, and considered my options. Registered eccentricity or not, ignoring them and eating on my own would just be rude. Sam probably wouldn’t mind – they know about my eccentricity and are generally a forgiving sort – but particularly with someone else involved it would look bad and is best avoided.

I grabbed a meal pouch – extra protein, as per the system’s suggestion – sighed, and bounced back to join the two of them.

Brian was a pretty typical looking crew member. Short – slightly shorter than me, even – somewhat curvy, skin maybe a bit darker than average. They were currently doing a rather impressive fractal braid that I was pretty sure my hair wasn’t thick enough to pull off, but I made a note to see if I could steal some of the geometries. Sam was fully bald as usual.

“Hey Sam, hey Brian.”

“Hey Arthur! Brian was just telling me about why they’ve been hogging all that simulator time.”

“Oh?” I inquired politely while glancing at the contextual cue from our last conversation to remind myself what that was about.

“Yeah, apparently some of the information that we’d got on the broadcast has given them a breakthrough on some really exciting stuff. Brian, do you want to explain it or do you mind if I do to see if I understand it properly?”

Brian waved their hand. “Go on, as long as you don’t mind me jumping in if I think you’re off track.”

“OK, great. Arthur, how’s your game theory?”

I smiled. Looks like my notes were reliable at least.

“Eh, rusty at best. I’ve taken all the standard courses on it of course, but plumbing systems don’t try to outsmart you so I’m pretty out of practice.”

“OK. Do you remember what a trembling hand perfect equilibrium is?”

I blinked and called up the definition to refresh my memory, then nodded.

“Vaguely.”

“OK, so…”

The explanation went on for a long time. I asked some questions, got some explanations, and think in the end I got maybe a third of it.

My very naive summary of it is this: Apparently some pure research that came in on the broadcast when we were in the last star system had applications after all. It established slightly tighter bounds on a classic convergence result in continuous time Markov chains. This in turn led to a new strategy for dealing with boycotts of hostile planets, because in certain circumstances you could force a suboptimal equilibrium into a better state.

If the results of the simulations panned out in practice, this would reduce the expected total length of boycotts by almost 1%. As well as saving a lot of lives, it had the more important effect that, if it didn’t disturb the standard model too heavily, it should bring the best estimated likelihood of success of the coreward expansion from 99.72% to 99.73%. This would be extremely exciting, as it would be the first significant improvement in that number in several tens of gigaseconds of lineage time.

I’m sure the details were very clever, and Sam and Brian seemed very excited about them, but it was too far out of my field for me to really get more than the gist of it and, honestly, I didn’t really care enough to try that hard.

Eventually the explanation wound down and we hit a lull in the conversation. Brian broke it.

“So, uh, Arthur.”

They had a strange look on their face which I couldn’t interpret. I checked my HUD and apparently they were embarrassed. Odd.

“Yes?” I said in my best tone of polite interest.

“I noticed from your status that you’ve become friends with Kimiko recently…”

That sounded ominous.

“I guess so? We’ve talked a bit and they seem nice. Why?”

“Well, um, did you notice anything off about them?”

“What, you mean the uh–” I checked the cue, and vaguely gestured at my face to cover the hesitation – “beard? Sure, it was a bit strange, but it seemed harmless.”

They waved their hand dismissively.

“No, no, beards are fine. There was a big trend for them a while ago before people got bored. But surely you noticed their social centrality markers?”

I shrugged.

I have a lot more status information configured than most people, and I’m not really that interested in the social games given how badly I do in them, so the relevant statistics tend to get crowded out.

I called them up and noted with some surprise that they were even worse than mine. Curious. They seemed friendly enough.

“Oh, huh. That’s weird. No I hadn’t noticed that before.”

Sam rolled their eyes at me.

“Arthur, you really do need to be better at paying attention to these things,” they said.

“Sorry. So, uh, what’s up with the low centrality? I don’t see any markers on their file to explain it.”

Brian looked even more embarrassed.

“Well, uh, you see. They have sex.”

I blinked. I knew people cared about that, but I wasn’t entirely clear on why. Anyway, we didn’t ostracise people for it, did we? That couldn’t be healthy. A lot of people experimented once or twice, so that sounded like a great way to create social division.

“OK? And?”

“What? You don’t care?”

“Well, sure, it’s gross, but it’s not like they’re having sex with me, right? They haven’t asked and I’d just say no if they did. Unless you’re suggesting…”

They looked horrified at the idea.

“No, no. They wouldn’t still be around on the ship if they did that. Just, you know, sex. A lot of it apparently.”

“OK. I don’t see the problem then?”

Was I starting to sound annoyed? I think I was probably starting to sound annoyed.

The look on their face changed. HUD said probable sudden realisation.

“Oh, sorry, do you uh…?”

“What? No! Yuck. I just don’t see the point in worrying about aspects about someone else’s private life that don’t affect me and are explicitly kept behind a privacy screen by charter!”

I was starting to get alerts that this was a bad social interaction and that we were making Sam very uncomfortable, but I had no idea how to deescalate when I didn’t start this mess in the first place.

We glared at each other for a little while while I tried to figure it out, but in the end they were the one who backed down.

“Ugh, fine. Be like that. I’m sorry I brought it up.”

“That’s OK. I’m sure you meant well.”

I’m reasonably sure they didn’t need any sort of HUD notification to notice the lie.

“Anyway, I’d better get back to work. Good to see you, Sam. Uh, nice to meet you again, Arthur.”

And, for once, I didn’t either.

We both cheek kissed Brian goodbye, mine rather more perfunctory than Sam’s, and they pushed off the wall and made a rapid exit.

Sam turned to me with a pained smile.

“I think that went well, don’t you?”


Next chapter will be in two weeks (April 21st).

Like this? Why not support me on Patreon! It’s good for the soul, and will result in you seeing chapters as they’re written (which is currently not very much earlier than they’re published, but hopefully I’ll get back on track soon).


My time: Now available to the highest bidder

This is an experiment.

Would you like me to do some work for you? Specifically, one day worth of work?

Well, that option is of course always open to you and you can email me to talk about it. But maybe you’ve been worried that I’m too busy, too expensive, etc. Or maybe you just haven’t got around to it.

So I’m going to try an experiment: I am auctioning off one work day of my time next month to the highest bidder. It can be for anything you like – Hypothesis, writing, having someone come in to take a look at your business and give you some pointers on your software development, or really anything else at all. This can be more or less whatever you want as long as it’s something I’m in principle willing to do for money (and if it’s paying me to do a concrete thing like write some software or an article rather than general consulting, I’m happy for this to include rights assignment).

Why am I doing this? The prompting thought was that I’m looking into ways to fill precisely one day of my time per month once I start doing my PhD, which will hopefully be in September – PhD stipends are a lot lower than developer salaries or day rates, so a top up would be nice, but I don’t want to do my PhD part time.

But once I thought about it I realised that it just seemed like a really fun experiment to try, so I figured I might as well try it now! Worst case scenario, we all learn something interesting at the cost of a day of my time.

Here are the rules:

  1. You are getting one day of work. If I think what you have asked for cannot be done in one day I will tell you up front, but if it takes longer than expected then I’m not going to continue working on it for free. I recommend picking things that either are intrinsically time bounded or can be stopped at any time and still be useful, but small projects that I’m confident can be done in a day are OK too.
  2. The auction will be a Vickrey auction. That is, the highest bidder wins the auction and pays the second highest bidder’s bid. That means that you will usually pay less than your bid (and also means that the game-theoretically correct amount for you to bid is exactly what my time is worth to you). There’s a small sketch of the mechanics after this list.
  3. If there is only one bidder then that means they get a day of my work for free.
  4. The auction will run until midnight BST (UTC + 1) on April 30th. No bids made after that point will be counted.
  5. If there is a tie for top place then I will pick the one I most want to do among them (or flip a coin or something if I’m torn) and charge the full bid.
  6. If I judge a job to be illegal, unethical, or I just really don’t want to do it for any reason, then I will remove it from the list and not count it for either the top or the second place bid. If you would otherwise have been the first or second place bid, I will let you know I’ve done this, unless I think you’re being deliberately abusive.
  7. I will publish both the winning bid and what they actually paid (and, if it is acceptable to the winner, who won) once the auction completes.
  8. The day of work must be redeemed at some point in May, after which I will send you an invoice. If we can’t come to a mutually acceptable time in May I will rerun the auction without your bid and offer the spot to the new winner and you won’t get charged anything.
  9. If the job cannot be done remotely I will also invoice you for travel (and, if necessary, accommodation) on top of the bid.
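
To make rules 2 and 3 concrete, here’s a small sketch of how the winner and the price get determined (ignoring ties, which rule 5 covers):

def auction_result(bids):
    # bids: a list of (bidder, amount) pairs. The highest bidder wins
    # but pays the second highest amount; a lone bidder pays nothing.
    ranked = sorted(bids, key=lambda bid: bid[1], reverse=True)
    winner = ranked[0][0]
    price = ranked[1][1] if len(ranked) > 1 else 0
    return winner, price

For example, auction_result([("A", 500), ("B", 300), ("C", 100)]) returns ("A", 300).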

Note that yes you can bid ridiculously small amounts, or even zero, and if that’s the winning bid then I will honour it (though I will probably not repeat the experiment for very long if that happens). You can also bid ridiculously high amounts and hope that the next best bid isn’t ridiculous, but I don’t recommend it (and may query your willingness to pay before accepting…). The winning strategy really is to bid what you think my time is worth to you.

Interested? Just fill out this form! I will let you know if you win the bid, and we can then arrange dates accordingly.

If you’re still on the fence, here are some ideas for how a day of my time can be useful to you:

  1. I can write a short article (or story if you prefer!) about something I already know a reasonable amount about. Very few of my blog posts take more than a day of work.
  2. I can read a paper and discuss it with you and/or try to write a short precis and introduction to its ideas (whether this is feasible will vary based on the paper, but a brief skim will give me a pretty good idea of whether it will work).
  3. I can add tests using Hypothesis to a small Python project.
  4. If working with someone who knows that project well, I can add tests using Hypothesis to a larger Python project.
  5. I can come in to your company and answer people’s questions on a subject of your choosing, which will probably be Hypothesis, but could be Integer Linear Programming, Voting Theory, or general software development.
  6. I can come in to your company and take a look at your development practices and processes and offer you tips on how to improve them.
  7. I can come in to your company and help with improving your interviewing process.

Those are just some ideas though! The point is that I have a variety of expertise, a lot of which can be usefully applied in under a day if you pick a reasonably small project or just use me as a general consultant.

Note that if you do have any general desire to bring me in as a consultant you should put in a bid, because you will almost certainly get a very good deal on this auction compared to my normal rates.
