David R. MacIver's Blog: (Runtime) Cost-free abstractions in clay

(Runtime) Cost-free abstractions in clay

6 May 2011

My own-time programming habits have been way down recently, which is a bit sad. Much of what I have been doing has been in clay though - partly a project of my own, partly contributing to the standard library (there’s been a cycle of “own project, ooh, this bit could possibly be in the standard library, let’s flesh it out...” resulting in probably more time spent on the latter than on my own).

One of my recent contributions has to been to custom sorting and ordering. My starting point was simple: I wanted to be able to sort by things which were not the natural order (in my particular use case I had a scoring function and wanted to sort things from “best” to “worst”).

I started out in the obvious way, and just let it take an argument that defined a custom < for the type - essentially just a function comparator argument. This was fine, and works perfectly naturally in clay (yes, clay is a systems language, but it’s got a perfectly reasonable lambda system. The capturing rules are a little odd, but for things where you don’t have to capture state it Just Works).

I then took a leaf out of Haskell’s book and defined a function comparing. In Haskell speak, comparing has the signature comparing :: (Ord b) => (a -> b) -> (a -> a -> Bool). (I’m not 100% sure this is the signature in Haskell, but it’s the equivalent signature to what I defined).

This all worked fine. But I found it a little annoyingly specific. What if I wanted to use <= in the sort? Ok, sure, you can do x <= y as not(x > y), but that’s a little awkward. And then what if I wanted to do ==? Again, you can define it in terms of <=, but that’s a bit rubbish.

Thus was born the comparators API. It’s very straightforward. Essentially the idea is that all of the standard functions that correspond to the comparison operators get overloaded to take a three-argument version where the first argument is a “Comparator” type (in clay this currently just means that Comparator?(T) returns true. There’s no true type-class system, only compile-time predicates which test types).

Having the comparator types, it was natural to define combinators for them. So far we’ve only got the basic ones (comparing, natural, and reversed), but it would be highly reasonable to define more.

But, I asked, you really want these operations to be fast. e.g. a sort lives or dies on how fast its comparisons are. How much overhead does using the comparators API introduce?

Err, none, it turns out. At least if you use the standard comparators with procedures for arguments (and most of the time when you use lambdas too). I’m afraid I’ve lost the examples, but I did some experimenting, and it turns out that with things like reversed(comparing(size)) and similar, clay actually just gives this a unique type which requires no data to represent, and type specialisation and inlining take over and the comparator literally completely disappears at runtime: I compared assembly output from using the comparator and writing the same code by hand, and they were literally byte identical.

Most languages (yes guys, I know you can do this in C++, or with a whole program optimising compiler like mlton) when you introduce an abstraction like this, the result would have a performance hit. Not a lot in many cases, but maybe you’re passing a dictionary at runtime, or introduce boxing overhead, or even just have to pass a token around which makes function calls slightly more expensive. Here you can introduce the abstraction, which can clean your code up a lot, without any runtime difference from the specialised code you would have written by hand. That’s pretty cool.