Story concept: Dead man walking

I often sketch out stories in my head, just because it’s something I enjoy doing. Occasionally I turn them into shorts, but I’ve neither the patience nor the talent to really turn them into anything longer so most of the concepts just wither and die.

This seems like a shame, so I thought I might post some of them here as I come up with them. Feel free to take them and run with them if the idea grabs you, though I imagine that’s fairly unlikely as I suspect most people with the will to write are not short of ideas to write about.

Concept:

Protagonist is some sort of research into personal software assistance and/or cybernetics. He has a lot of software which is basically designed to emulate him in mundane tasks – dealing with phone calls, doing drudge research, etc. Over time, both with new software being written and the software developing better models of his behaviour, it gets to the point where the software can fake him to a very convincing degree. Additionally the more he relies on it the more it becomes an integrated part of his personality to the point where it’s difficult to tell exactly where he ends and the software begins (there’s probably no full mind-brain link so much as a very good wearable interface).

At some point he realises that he’s having to deal with a lot of boring face to face stuff while the software is automating a lot of things he’d rather be doing. He takes advantage of latest developments in cybernetic prosthetics for dealing with e.g. spine damage and enables it so that the software is actually fully capable of running his body while his mind wanders. Division blurs further.

Then he dies. Severe brain aneurysm or some such – leaves the body basically intact but brain dead.

…but the software is still happily running and, as per above, is quite capable of controlling the body. Moreover it basically behaves like him in most cases. So despite being brain dead, he is perfectly capable of carrying on as he always did in most circumstances. What does he do now?

Themes:

  • Social issues. Friends and family are going to be severely freaked out – it’s unlikely they had any idea how much of their interactions with protagonist were already just the software, and it will be very hard to figure out how they should treat him now
  • Legal issues. By all medical definitions our protagonist is legally brain dead and unable to make his own decisions. Except that he appears to be walking around, talking and carrying about his daily business
  • Personal issues. The protagonist is having a pretty serious identity crisis. Is he dead? Is he an upload? Is he an AI or merely a really good non-sentient fake? Additionally, he has to cope with the fact that he feels like he’s become stupid. The biological brain was the real creative problem solver – even the software that was designed to cope with doing research and problem solving was really there to do the drudge work while he sorted out the real thing

And that’s about as far as I’ve got with the concept. I like the idea, but I’m not sure how I’d flesh it out into a full story even if I wanted to.

This entry was posted in Fiction on by .

I don’t write Scala

A lot of people follow me on Twitter. I don’t mean Stephen Fry level a lot, but it’s about 8 or 9 times as many people as I follow back.

Based on a (purely visual) random sampling of this, a significant proportion of these are people interested in Scala. That’s fine. I know many such people. I could also be misrepresenting them and they’re just people interested in programming and following me because I’m a programmer (and occasionally even tweet programming related things). I’m not sure, and can’t really be sure without doing a lot more research that I’m actually interested in doing.

So I’m going to assume it’s a Scala thing, because this fits into a general perception issue I’ve noticed people have about me.

You see, I don’t write Scala. I haven’t since late 2009. I didn’t make a big deal about it, because that would have been childish, I just informed a few people in the Scala community I thought should know that I was leaving, along with a few of my reasons why, and then quietly did so. Most of them took it very graciously.

Weirdly, two and some years on, most people seem not to have noticed the complete absence of Scala related content from me. I suspect it’s because I’ve mostly dropped off their radar for one reason or another, so there’s just a vague general impression of me as a Scala person. Hopefully this post should help remove some of that.

To be clear: This is not a normative statement. Just because I don’t write Scala, doesn’t mean you shouldn’t. Scala is pretty neat. I have my reasons to not use it, but don’t wish to explain as I would find the resulting language flamewar extremely tedious. I would greatly appreciate it if you don’t use this post to start one anyway.

This is also not a statement that I hate Scala and will never use it again. I’ve no immediate plans to, but never is a long time. I expect Scala will do quite well, and if it does I expect I will at some point find myself using it again.

To forestall the inevitable question: I am currently mostly writing Ruby at work, and a whole smattering of unrelated things at home (recents include Haskell, Java, Clay, C, a little C++, some Lua…) as the whim takes me. This list is not intended to be prescriptive, and I’m not really interested in the inevitable suggestions for what languages to try next.

TLDR: I write a bunch of languages, mostly Ruby for reasons of circumstance rather than design, but Scala is not numbered amongst them. This is a statement of fact, not a piece of advice.

This entry was posted in Code, life on by .

Really simple spatial indices

It’s something of a truism that when N is small asymptotic complexity doesn’t matter. As a result, for small data sets it’s often the case that doing the dumbest thing that could possibly work is faster than using a really clever data structure or algorithm. For small N insertion sort probably beats quicksort. For small N you may be better off just storing everything in an array of key value pairs and sort on demand rather than using a binary tree. etc.

It is however often the case that you can take this observation and incorporate it into your clever approach by bucketing the base case: You can do insertion sort if your quicksort is over a short range (and indeed this happens recursively), you can make the leaves in your binary tree contain lists and only split them when the leaves get too large, etc.

Similarly when I as playing around with metric search implementations, I found that using a vantage point tree there really wasn’t much point in letting your leaf nodes be of size 1. At some point in the recursion the vantage tree stops being a win and you’re really better off just storing a list of points and doing a distance calculation on each one.

But then I started thinking: Distance calculations are potentially quite expensive. In my test cases they were only expensive-ish (standard euclidean space, albeit high dimensional). In principle if by storing a little bit of extra information along with the list we could save even a few distance calculations on average we’re probably doing quite well.

This article is about some ideas I have in this space. I’ve only tried some of them, and I don’t have sensible benchmarks of any of this, but anecdotally they do seem helpful.

Tracking diameters

By storing a single float, you potentially save yourself a lot of distance computations. The caveat is that this is only helpful in the case where your result is going to be one of two trivial values.

Suppose I have a set of elements A and I know that \(diam(A) = t\). I want to query A for all elements a such that \(d(x, a) \leq e\). We’ll be doing the simple thing of querying every point for its distance.

But we can shortcut. Suppose we calculate \(d(a, x)\). Because we know the diameter, we may be able to use this to do one of two things:

  1. If \(d(a, x) > t + e\) then the triangle inequality gives us that the query completely misses A and the result is empty
  2. If \(d(a, x) + t \leq e\) then every point in A is within e of x and we can just return all of A

Neither of these are terribly likely cases, but they’re very easy to check for and potentially save you a goodly number of queries in the cases where you hit them.

It’s also worth noting that this idea can potentially also be applied further up the tree. If you track the diameter of the current subtree while you’re querying you can potentially eliminate entire subtrees very quickly. I’m unsure however as to whether that is worth the maintenance overhead (it’s a little difficult to figure out where you can backtrack to).

Ring searching

This one can be slightly regarded as a vantage point tree lite (it’s not, but there are similarities).

The idea is this: We pick a distinguished center point, c. I’m not entirely sure what the best way to do this is. Picking at random may be good enough, picking to maximize some spread score such as \( S(c) = Var(\{ d(a, c) : a \in A \}) \) might be better, or to minimize the radius around c. I think experimentation is needed. We now store A in order of increasing \(d(c, a)\), along with those distance values.

Why is this useful?

Well, we can change our query pattern as follows:

First we calculate \(d(x, c)\). We need to do this anyway in order to determine whether c is in the query, so no loss.

Now, we know the radius around c (it’s just the value of the first distance), let’s call it r. So we may be able to shortcut in the same way we used the diameter: We return A if \(d(x, c) + r \leq \epsilon \) or \(\emptyset\) if \(e + r > d(x, c)\).

But that’s a fairly trivial case. It might be useful to just store a center and smallest radius to do that (I don’t know if it is), but it’s certainly not necessary to store all the distances.

So, why store all the distances?

Because you know by the triangle inequality that all hits in a must satisfy

$$d(x, c) – e \leq d(a, c) \leq d(x, c) + e$$

Which gives you a filter that may potentially cut out a lot of the points immediately if e is small compared to r. Further, because we’ve stored the points in sorted order, rather than scanning the whole list and using this as a first pass we can use two binary searches to find the exact interval of interest and just search that.

Conclusion

Like I said, I don’t know if either of these are helpful. I’ve implemented the ring searching as a metric search structure in its own right, and it seems pretty competitive with the vantage point tree on highish dimensional spaces – for a lot of queries you can use the ring to cut down the search space to about 10% or less of the whole thing immediately, then just do a scan on those points. The lower constant overheads seem to be a win. I haven’t tried the diameter based pruning yet, so I don’t know how much it helps, but it would be surprising if it’s not at least a bit of a win.

This entry was posted in programming on by .

And now for something completely different

I write about a lot of things here. Maths, Programming, Voting Theory, Cooking, Fiction and anything else that amuses me. This post however is about something I’ve never written about before (and will probably never write about again). Shaving.

I really hate shaving. I hate shaving with a passion I normally reserve for homeopaths, inconsiderate pedestrians and J2EE.

Unfortunately I also hate having a beard.

By way of compromise I tend to shave only every 4 or 5 days. After a week I’m verging on beard territory, so that’s really too long. But this only accommodates my hatred of shaving, it doesn’t reduce it.

Part of why I hate shaving is how ridiculously gimmicky it is. The shaving companies will go to increasingly elaborate lengths to get you to buy increasingly expensive products. “This razor has five blades! And they’re really small! And it has a built in vibrator, which will totally make your skin happier really!” (Perhaps there needs to be a realrazororparodyrazor site. I sure can’t be bothered to build it though, it would just depress me).

It’s insulting and it’s expensive and, what’s worse, it gives you a really shitty shave.

But there is a way out of this. A secret that Big Shaving has kept from you.

What if I were to tell you that there was a better way? That it is in fact possible to shave without a vibrator. Perhaps even with only one blade? A way to spend 25p instead of £2.25 every time you have to replace one of those damn razor blades. And that you would even get a better shave out of it. (Spoiler: I’m about to tell you that).

There is! Behold the safety razor:

Low budget take on the safety razor

It’s a classic design, one step up from the cutthroat: Most of the quality of shaving, none of the slicing your throat open and sending you to the emergency room. The cheap ones (I use Boots own brand, which I think is the one in the picture above) are typically as good as the expensive ones (because really all it is is a small semi-disposable blade for running across your face. It’s not rocket science: You make it out of decent metal and you make it sharp. That’s it). It gives you a really close shave for a really low price, the blades clean easily and last well, and it probably won’t insult your intelligence when you’re using it (though Gillette do one, so I can’t guarantee it).

It’s not better in every way of course. TANSTAAFL applies, and anyone who tells you otherwise is trying to sell you something (like the idea that you should give them 10x as much money per razor blade). The cost is that you have to be more careful whilst shaving or you’ll get shaving cuts. Not huge gaping wounds, just normal shaving cuts, and they’re completely avoidable once you’ve got the hang of it, but even then shaving is going to be a more methodical process (though not necessarily slower – I find that with a modern razor I have to make more passes in order to get a viable shave).

I wouldn’t go as far as to say to switch to a safety razor has made me not hate shaving, but it’s definitely gone a long way towards it.

Postscripts

  • I also strongly prefer shaving oil to shaving foam or gel. However I think this is less an unambiguous win and more of a personal choice, so my advocacy for it isn’t as strong.
  • It’s worth looking up some guides on shaving (though I can’t recommend any good ones as I did this a while ago and don’t remember what I used). Chances are you’re doing it wrong
  • Sorry, I have no idea whether this is good advice for shaving legs or other parts of your body. It might be. It works well for armpits. Try it and see if you want, but apologies if it turns out to be awful.
This entry was posted in life on by .

Keeping track of the Kth smallest element

This post is brought to you by the school of bloody obvious, but the details seemed just odd enough compared to normal selection that I thought it might be worth mentioning.

For the Java vantage tree thing I found myself needing to do a select smallest k elements when doing a nearest neighbour search. selection algorithms are pretty well covered, but this wasn’t quite that straightforward.

The key thing that made this different is that I wanted to keep track of the largest of the k values I’d found so far, because this meant I could prune entire subtrees from the search immediately rather than descending into them (if it’s obvious there’s nothing closer than the largest value found so far in the subtree then there’s no point in searching it).

So what I wanted was a data structure that supported the following operations:

  • Give me the largest current element in the structure
  • Add an element if there are currently fewer than k elements or if it’s smaller than one of the current elements
  • Give me a list of all the elements in the structure

What turns out to work well here is to just use a binary heap, but rather than using the min-heap you’d expect to use for selecting the smallest elements you use a max-heap. The operations are then implemented as follows:

  • The largest element is just the head of the heap (i.e. the zero index in a heap array).
  • Adding an element is a normal heap add if the heap has fewer than k elements. If the heap has k elements, you check if the element is < the head and if it is replace the head with it and rebalance the heap
  • Getting a list is just a matter of taking everything in the heap

Some Java code implementing this is here, though it doesn’t exactly follow what I’ve described (rather than comparing elements directly it has double scores, it returns infinity as the max element if there are fewer than k elements and it doesn’t heapify until the heap is full).

Not rocket science, but it took me a few minutes to get the details right (I was initially trying to use a min heap and it wasn’t working for all the obvious reasons), so hopefully this post will be useful or interesting to someone.

This entry was posted in programming on by .