Author Archives: david

The world has gone mad

Here is the proof:

vv.getRenderContext().setVertexLabelTransformer(MapTransformer.<Number,String>getInstance(
LazyMap.<Number,String>decorate(new HashMap<Number,String>(), new ToStringLabeller<Number>())));

Update: I’ve found a better example.

This entry was posted in programming and tagged , on by .

Data structures in Java

I’ve been playing with implementing some simple data structures in Java, for later use in the lazy strings project. The results are… a bit depressing, frankly. Clearly I need to learn to optimise better, as right now I’m creating specialised data structures for certain types which turn out to perform worse than the general purpose implementations in the standard library. :-)

In particular, I threw together a simple implementation of a Trie, and it performs significantly worse than just using a TreeMap. As the number of collisions increases the relative performance of the Trie improves somewhat, eventually beating the TreeMap, but still in most situations the performance is pretty poor.

It also doesn’t help that the Trie ends up bloating the keys significantly, as you end up having to create far too many objects per character. I can definitely cut back on the object creation, but even one object per character may prove to be too much to get an efficient implementation in Java.

This entry was posted in programming and tagged , , on by .

Playing with Arrows

Apologies in advance for the quality of the html blogger seems to be generating. I probably need to find a better solution (like using lhs and attaching pdfs or something).

In the recent discussion about run length encoding, one thing people focused on was the use of the &&& operator and how noone understands arrows so &&& is scary and arcane.

I feel perfectly placed to comment on this, because I don’t understand arrows either. :-) I cheerfully abuse the arrow operators for their function instance without really worrying about that, and it seems to work fine.

So, this is just a rundown of what the arrow operations do for the function instance, in order to make them seem less scary. I might follow up with looking at Kleisli arrows later (which are arrows that arise from monads in a natural way).

The arrow class is defined as follows:

class Arrow a where
arr :: (b -> c) -> a b c
pure :: (b -> c) -> a b c
(>>>) :: a b c -> a c d -> a b d
first :: a b c -> a (b, d) (c, d)
second :: a b c -> a (d, b) (d, c)
(***) :: a b c -> a b' c' -> a (b, b') (c, c')
(&&&) :: a b c -> a b c' -> a b (c, c')

The first two are totally uninteresting for functions (they’re just the identity). So if we cut those out and specialise the signatures to functions we get the following:

(>>>) :: (b -> c) -> (c -> d) -> (b -> d)
first :: (b -> c) -> (b, d) -> (c, d)
second :: (b -> c) -> (d, b) -> (d, c)
(***) :: (b -> c) -> (b' -> c') -> (b, b') -> (c, c')
(&&&) :: (b -> c) -> (b -> c') -> b -> (c, c')

i.e. these are basically all operators for manipulating combinations of pairs and functions. Most of these you can probably infer the purpose of from their types and names, but I’ll go through them anyway. We’ve already encountered &&&, and it’s the one I use the most, so I’ll go through these in reverse order.

The &&& operator is in fact very simple, but it’s the arrow operator I use the most. Consider the following:

> (+1) &&& (+2) $ 3
(4, 5)

i.e. f &&& g when applied to x just evaluates both and returns them as a pairs of both results.

We could define this as:

(&&&) :: (a -> b) -> (a -> c) -> (a -> (b, c))
(f &&& g) x = (f x, g x)

For example, in the run length encoding article we had the function head &&& length:

> head &&& length $ [1, 3]
(1,2)

In a similar usage to the RLE article, I’ve often found the following sort of trick useful:

map (head &&& length) . group . sort

This counts frequencies of elements in a list.

> map (head &&& length) . group . sort $ ["foo", "bar", "foo", "baz"]
[("bar",1),("baz",1),("foo",2)]

Now ***.

> (+1) *** (+2) $ (3, 4)
(4, 6)

i.e. f *** g applied to (x, y) returns (f x, g y). Or, in a more readable form:

(***) :: (b -> c) -> (b' -> c') -> (b, b') -> (c, c')
(f *** g) (x, y) = (f x, g y)

I don’t really have an obvious use case for this one. I’m sure they exist though.

I’m going to pick up the speed now, as I’m getting bored and so you probably are too. :-)

(>>>) :: (b -> c) -> (c -> d) -> (b -> d)
(f >>> g) x = g (f x)

Or in other words, >>> is just reverse function composition. So we could have written this as

(>>>) :: (b -> c) -> (c -> d) -> (b -> d)
(f >>> g) x = g . f

Or

(>>>) :: (b -> c) -> (c -> d) -> (b -> d)
(>>>) = flip (.)

Now for first and second:

first :: (b -> c) -> (b, d) -> (c, d)
first f (x, y) = (f x, y)

second :: (b -> c) -> (d, b) -> (d, c)
second f (x, y) = (x, f y)

i.e. These just take a function and apply it to the first or second entry of a tuple, leaving the other unchanged.

Those are all the core arrow operations. There are a few derived operations, but the only one which is at all interesting for functions is <<<, which is just function composition. Another Arrow related class which functions are an instance of is ArrowChoice. ArrowChoice does for Either what Arrow does for pairs (blah blah, category theory, blah, (,) and Either are dual, blah). Here's the instance declaration:

class Arrow a => ArrowChoice a where
left :: a b c -> a (Either b d) (Either c d)
right :: a b c -> a (Either d b) (Either d c)
(+++) :: a b c -> a b’ c’ -> a (Either b b’) (Either c c’)
(|||) :: a b d -> a c d -> a (Either b c) d

Specialised to functions:

left :: (b -> c) -> (Either b d) -> (Either c d)
right :: (b -> c) -> (Either d b) -> (Either d c)
(+++) :: (b -> c) -> (b' -> c') -> (Either b b') -> (Either c c')
(|||) :: (b -> d) -> (c -> d) -> (Either b c) -> d

Lets jump straight to writing out some definitions:

left :: (b -> c) -> (Either b d) -> (Either c d)
left f (Left x) = Left $ f x
left _ (Right y) = Right y

right :: (b -> c) -> (Either d b) -> (Either d c)
right f (Right x) = Right $ f x
right _ (Left y) = Left y

(+++) :: (b -> c) -> (b' -> c') -> (Either b b') -> (Either c c')
(f +++ g) (Left x)  = Left $ f x
(f +++ g) (Right x) = Right $ g x

(|||) :: (b -> d) -> (c -> d) -> (Either b c) -> d
(f ||| g) (Left x)  = f x
(f ||| g) (Right x) = g x

i.e. left f applies f to the left option of an Either and leaves the right option alone.

> left (+1) $ Left 2
Left 3

> left (+1) $ Right 2
Right 2

right does the reverse.

f +++ g applies f to the left option and g to the right:

> (+1) +++ (+2) $ Left 2
Left 3

> (+1) +++ (+2) $ Right 2
Right 4

So we could have implemented left and right (and indeed, this is how they’re actually implemented in the real instance declaration) as follows:

left :: (b -> c) -> (Either b d) -> (Either c d)
left f = f +++ id

right :: (b -> c) -> (Either d b) -> (Either d c)
right f = id +++ f 

Or even as (+++id) and (id+++).

Finally, (f ||| g) x is basically just a combinator for case matching, applying f if we have a Left value, g if we have a Right. It’s just the standard ‘either’ function but it can be nice to have it available as an operator.

> (+1) ||| (+2) $ Left 2
3

> (+1) ||| (+2) $ Right 2
4

So, there you have it. Arrow operations. Not too scary, and now you have a bunch of new combinators to play with. I doubt it will revolutionise your Haskell code, but every now and then they allow for a really neat solution you wouldn’t otherwise have thought of. Have fun.

This entry was posted in programming and tagged , , on by .

Perversity

When you’re programatically creating an abstract syntax tree in order to generate a text representation of a regular expression in order to parse it into an abstract syntax stree in order to compile it to a matcher object which reads your sequential list in a random access manner (causing you to have to fake random access behaviour with a mutable index into your list) in order to iterate over it sequentially…

…at this point the idea that someone, somewhere, might be doing something wrong starts to sink in.

This entry was posted in programming and tagged , on by .

Where lies the problem?

The intersection between my facebook friends and people who read this blog is probably (relatively) large, but for completeness and historical record I’ll post my current facebook status:

“David is considering at what point he should start to wonder whether the problem might not be with the rest of the world.”

I get into a lot of arguments, or at least heated disagreements. Sometimes it’s because I’m wrong, but most of the time it feels like the world has reflexively thought “Oh, David has an opinion. Let’s disagree with it!”

This is very frustrating. Especially when I’m clearly right. Or, at the very least, when those I am arguing with are mouthing gibbering nonsense and ignoring my refutations of it.

This isn’t new either. It happens in whatever area I’m working. More so in programming than it did in maths, but I think that’s a function of which side of the discipline I interact with more than it is of the discipline itself.

Example points on which I’ve had arguments in ##java.

  • First class functions are a good thing.
  • Java’s type system is crap.
  • Random access strings are a poor design choice for an immutable data structure.
  • The non-trivial import features Java provides (* and static imports) are actually a good thing.
  • JEE is bloated nonsense (ok, only about half of ##java disagrees with me on this point).
  • Large scale ORM is a stupid idea, and causes the most amazing amount of grief.
  • XML may be standard, but it’s still a rubbish format and should be avoided at all costs (*waves to the RSS readers*).
  • Most recently (and inspired by this) the minor suggestion that being allowed to override void methods with non-void ones would be useful (response: Garbage arguments as to why this would be horrificially wrong and unsafe. It’s not.).
  • Endless stylistic arguments that basically boil down to “Why the Java standard approach of writing procedural code and calling it object oriented because it lives in a class is a bad one”.
  • Endless philosophical arguments that basically boil down to “Why ‘That’s not OO’ is the most useless criticism of an approach in the history of programming”.

As per the earlier quote, eventually I have to start wondering whether the reason I disagree with everyone might just not be because everyone else is wrong.

But then I come to my senses.

Firstly, these are mostly Java programmers. There are certainly competent Java programmers, but the average Java programmer is the very definition of blub. “Oh noes. It’s an anonymous function? What’s a function? How do I write that with a for loop???”. Even the good ones tend to be very set in their ways. Admittedly some of these arguments are with people who I’d otherwise have believed to be competent. However I’m going to perpetrate a no true scotsman fallacy^H make a highly reasoned distinction: If they were competent, they wouldn’t be spouting nonsense and then ignoring a detailed explanation of why they were wrong.

Secondly, the state of the the world in general and of technology in particular gives me irrefutable empirical evidence that the vast majority of people are idiots. It therefore stands to reason that a sizable proportion of the people I interact with are idiots. Not coincidentally, I get into arguments with a sizable proportion of the people I interact with.

So, in conclusion, I am correct. The problem is not with me, it is with the rest of the world. I should stop letting the opinions of idiots bother me.

By the way, this is not a deliberately over the top rant. You may be tempted to think that I’m being tongue in cheek or making some metaphorical or otherwise not entirely serious point with this post. I’m not. Deal with it.

This entry was posted in programming and tagged , , on by .