Tag Archives: haskell programming

Playing with Arrows

Apologies in advance for the quality of the HTML that Blogger seems to be generating. I probably need to find a better solution (like using lhs and attaching PDFs or something).

In the recent discussion about run length encoding, one thing people focused on was the use of the &&& operator, and how no one understands arrows, so &&& is scary and arcane.

I feel perfectly placed to comment on this, because I don’t understand arrows either. :-) I cheerfully abuse the arrow operators for their function instance without really worrying about that, and it seems to work fine.

So, this is just a rundown of what the arrow operations do for the function instance, in order to make them seem less scary. I might follow up with looking at Kleisli arrows later (which are arrows that arise from monads in a natural way).

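The arrow class is defined as follows (this is the definition from the time of writing; current versions of base no longer have pure in the class, and define >>> in Control.Category instead):

class Arrow a where
    arr :: (b -> c) -> a b c
    pure :: (b -> c) -> a b c
    (>>>) :: a b c -> a c d -> a b d
    first :: a b c -> a (b, d) (c, d)
    second :: a b c -> a (d, b) (d, c)
    (***) :: a b c -> a b' c' -> a (b, b') (c, c')
    (&&&) :: a b c -> a b c' -> a b (c, c')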

The first two are totally uninteresting for functions (they’re just the identity). So if we cut those out and specialise the signatures to functions we get the following:

(>>>) :: (b -> c) -> (c -> d) -> (b -> d)
first :: (b -> c) -> (b, d) -> (c, d)
second :: (b -> c) -> (d, b) -> (d, c)
(***) :: (b -> c) -> (b' -> c') -> (b, b') -> (c, c')
(&&&) :: (b -> c) -> (b -> c') -> b -> (c, c')

i.e. these are basically all operators for manipulating combinations of pairs and functions. Most of these you can probably infer the purpose of from their types and names, but I’ll go through them anyway. We’ve already encountered &&&, and it’s the one I use the most, so I’ll go through these in reverse order.

The &&& operator is in fact very simple, but it’s the arrow operator I use the most. Consider the following:

> (+1) &&& (+2) $ 3
(4, 5)

i.e. f &&& g when applied to x just evaluates both f x and g x and returns the two results as a pair.

We could define this as:

(&&&) :: (a -> b) -> (a -> c) -> (a -> (b, c))
(f &&& g) x = (f x, g x)

For example, in the run length encoding article we had the function head &&& length:

> head &&& length $ [1, 3]
(1, 2)

In a similar usage to the RLE article, I’ve often found the following sort of trick useful:

map (head &&& length) . group . sort

This counts frequencies of elements in a list.

> map (head &&& length) . group . sort $ ["foo", "bar", "foo", "baz"]
[("bar", 1), ("baz", 1), ("foo", 2)]
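As a minimal sketch, the frequency-counting trick can be packaged as a named function. The name frequencies is my own choice for illustration, not something from the RLE article:

```haskell
import Control.Arrow ((&&&))
import Data.List (group, sort)

-- | Count how often each element occurs (hypothetical helper name).
-- sort brings equal elements together, group splits them into runs,
-- and head &&& length turns each run into (element, run length).
-- head is safe here because group never produces an empty run.
frequencies :: Ord a => [a] -> [(a, Int)]
frequencies = map (head &&& length) . group . sort
```

Loaded into GHCi, frequencies "mississippi" should give one (letter, count) pair per distinct letter, in sorted order.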

Now ***.

> (+1) *** (+2) $ (3, 4)
(4, 6)

i.e. f *** g applied to (x, y) returns (f x, g y). Or, in a more readable form:

(***) :: (b -> c) -> (b' -> c') -> (b, b') -> (c, c')
(f *** g) (x, y) = (f x, g y)

I don’t really have an obvious use case for this one. I’m sure they exist though.
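One place where *** might earn its keep is post-processing the result of a function that already returns a pair, such as partition. The following is only a sketch of that idea, and countEvenOdd is a made-up name:

```haskell
import Control.Arrow ((***))
import Data.List (partition)

-- | Count the even and the odd numbers in a list (hypothetical helper).
-- partition even splits the list into (evens, odds);
-- length *** length then measures each half independently.
countEvenOdd :: [Int] -> (Int, Int)
countEvenOdd = (length *** length) . partition even
```

For example, countEvenOdd [1, 2, 3, 4, 5] evaluates to (2, 3).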

I’m going to pick up the speed now, as I’m getting bored and so you probably are too. :-)

(>>>) :: (b -> c) -> (c -> d) -> (b -> d)
(f >>> g) x = g (f x)

Or in other words, >>> is just reverse function composition. So we could have written this as

(>>>) :: (b -> c) -> (c -> d) -> (b -> d)
f >>> g = g . f

or even as

(>>>) :: (b -> c) -> (c -> d) -> (b -> d)
(>>>) = flip (.)
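To see the "reverse composition" reading in practice, here is a small sketch that writes the same pipeline both ways. The names shoutLtr and shoutRtl are invented for this example:

```haskell
import Control.Arrow ((>>>))
import Data.Char (toUpper)

-- | Upper-case every word, written left to right with >>>:
-- first split into words, then upper-case each, then join again.
shoutLtr :: String -> String
shoutLtr = words >>> map (map toUpper) >>> unwords

-- | The same pipeline with ordinary composition, read right to left.
shoutRtl :: String -> String
shoutRtl = unwords . map (map toUpper) . words
```

Both functions should agree on every input; which one reads better is mostly a matter of taste.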

Now for first and second:

first :: (b -> c) -> (b, d) -> (c, d)
first f (x, y) = (f x, y)

second :: (b -> c) -> (d, b) -> (d, c)
second f (x, y) = (x, f y)

i.e. these just take a function and apply it to the first or second entry of a tuple, leaving the other unchanged.
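A typical use is mapping over a list of pairs while touching only one side. This is a sketch with invented example data (scores, shoutNames and doubleScores are all hypothetical names):

```haskell
import Control.Arrow (first, second)
import Data.Char (toUpper)

-- | Hypothetical example data: (name, score) pairs.
scores :: [(String, Int)]
scores = [("alice", 3), ("bob", 5)]

-- | Upper-case the names, leaving the scores untouched.
shoutNames :: [(String, Int)] -> [(String, Int)]
shoutNames = map (first (map toUpper))

-- | Double the scores, leaving the names untouched.
doubleScores :: [(String, Int)] -> [(String, Int)]
doubleScores = map (second (* 2))
```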

Those are all the core arrow operations. There are a few derived operations, but the only one which is at all interesting for functions is <<<, which is just function composition.

Another Arrow-related class which functions are an instance of is ArrowChoice. ArrowChoice does for Either what Arrow does for pairs (blah blah, category theory, blah, (,) and Either are dual, blah). Here's the class declaration:

class Arrow a => ArrowChoice a where
    left :: a b c -> a (Either b d) (Either c d)
    right :: a b c -> a (Either d b) (Either d c)
    (+++) :: a b c -> a b' c' -> a (Either b b') (Either c c')
    (|||) :: a b d -> a c d -> a (Either b c) d

Specialised to functions:

left :: (b -> c) -> (Either b d) -> (Either c d)
right :: (b -> c) -> (Either d b) -> (Either d c)
(+++) :: (b -> c) -> (b' -> c') -> (Either b b') -> (Either c c')
(|||) :: (b -> d) -> (c -> d) -> (Either b c) -> d

Let's jump straight to writing out some definitions:

left :: (b -> c) -> (Either b d) -> (Either c d)
left f (Left x) = Left $ f x
left _ (Right y) = Right y

right :: (b -> c) -> (Either d b) -> (Either d c)
right f (Right x) = Right $ f x
right _ (Left y) = Left y

(+++) :: (b -> c) -> (b' -> c') -> (Either b b') -> (Either c c')
(f +++ g) (Left x)  = Left $ f x
(f +++ g) (Right x) = Right $ g x

(|||) :: (b -> d) -> (c -> d) -> (Either b c) -> d
(f ||| g) (Left x)  = f x
(f ||| g) (Right x) = g x

i.e. left f applies f to the left option of an Either and leaves the right option alone.

> left (+1) $ Left 2
Left 3

> left (+1) $ Right 2
Right 2

right does the reverse.
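For completeness, here is a small sketch of left and right side by side, using the common convention that Left carries an error message and Right a result. The helper names bumpRight and tagLeft are made up:

```haskell
import Control.Arrow (left, right)

-- | Hypothetical helper: add one to a successful result,
-- passing error messages through unchanged.
bumpRight :: Either String Int -> Either String Int
bumpRight = right (+ 1)

-- | Hypothetical helper: prefix an error message,
-- passing successful results through unchanged.
tagLeft :: Either String Int -> Either String Int
tagLeft = left ("error: " ++)
```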

f +++ g applies f to the left option and g to the right:

> (+1) +++ (+2) $ Left 2
Left 3

> (+1) +++ (+2) $ Right 2
Right 4

So we could have implemented left and right (and indeed, this is how they’re actually implemented in the real instance declaration) as follows:

left :: (b -> c) -> (Either b d) -> (Either c d)
left f = f +++ id

right :: (b -> c) -> (Either d b) -> (Either d c)
right f = id +++ f 

Or even as (+++id) and (id+++).

Finally, (f ||| g) x is basically just a combinator for case matching, applying f if we have a Left value, g if we have a Right. It’s just the standard ‘either’ function but it can be nice to have it available as an operator.

> (+1) ||| (+2) $ Left 2
3

> (+1) ||| (+2) $ Right 2
4
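The difference between +++ and ||| is easiest to see when the two branches hold different types. In this sketch (render and annotate are hypothetical names, and the error-message convention is my own assumption), ||| collapses an Either into a plain String while +++ stays inside Either:

```haskell
import Control.Arrow ((+++), (|||))

-- | Collapse both branches into one String:
-- errors get a prefix, numbers are shown.
render :: Either String Int -> String
render = ("error: " ++) ||| show

-- | Transform both branches but keep the Either wrapper:
-- errors get a prefix, numbers are multiplied by ten.
annotate :: Either String Int -> Either String Int
annotate = ("error: " ++) +++ (* 10)
```

So render (Left "boom") is "error: boom", while annotate (Left "boom") is Left "error: boom".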

So, there you have it. Arrow operations. Not too scary, and now you have a bunch of new combinators to play with. I doubt it will revolutionise your Haskell code, but every now and then they allow for a really neat solution you wouldn’t otherwise have thought of. Have fun.


What’s a monad?

It’s almost traditional that people learning Haskell should write their own version of ‘An introduction to monads’. I think it serves to teach the writer more than the reader, but that’s fine. I’ve understood them in a way which I’ve not seen covered in the existing introductions, so I thought I’d get it down on ‘paper’. Note that the point of this post is to demystify more than it is to enlighten – if you don’t already, you probably won’t fully understand monads by the end, but you hopefully will be much closer to the point where you can.

Java has the foreach loop:

for(Foo foo : foos) doStuff(foo);

Fundamentally unexciting, but a nice bit of syntactic sugar.

A common argument for adding ‘closures’ (really first class functions) to Java is that if they had been added in 1.5, there would have been no need to add the foreach loop, because it could have been defined as a method. Here’s an example in Scala:

package Demo;

object ControlFlow {
    def forEach[T] (iter : Iterator[T], action : T => Unit) : Unit =
        while(iter.hasNext) action(iter.next);

    // elements is the pre-2.8 name for what is now called iterator.
    def forEach[T] (iter : Iterable[T], action : T => Unit) : Unit = forEach (iter.elements, action);
}

Still fundamentally unexciting, right? It’s just yet another loop. Except… it’s not quite, is it? I’m going to give this its own line just to make sure the point is clear:

When we introduced first class functions to the language, we gained the ability to define our own control flow mechanisms.

This is Exciting.

Scala introduces another neat concept, sequence comprehensions:

package Demo;

object ComprehensionTest {
    def main (args : Array[String])={
        val bar = for {
            val arg <- args;
            val arg2 <- args;
            !arg.equals(arg2) }
            yield (arg, arg2);
        // Print the pairs as a List so that there is some visible output.
        println(bar.toList)
    }
}

What does this do? Well, it constructs an iterable object consisting of all pairs of command line arguments, omitting the pairs whose two elements are equal. So

> scala Demo.ComprehensionTest foo bar

List((foo,bar), (bar,foo))

We could do much the same thing with nested for loops, but it wouldn't be as nice. For very involved collection manipulation, comprehensions simplify life a lot, so their addition to Scala is a great boon.

But we could have defined something very similar ourselves.

Here's a rewrite that doesn't use comprehensions:

object NoComprehension {
    // Returns a one-element list if the test passes, an empty list otherwise.
    def guard[T](test : Boolean, value : T) =
        if (test)
            new ::(value, Nil)
        else
            Nil;
    def main (args : Array[String])={
        val bar = args.flatMap(
            (arg : String) =>
                args.flatMap(
                    (arg2 : String) =>
                        guard(!arg.equals(arg2), (arg, arg2))))
    }
}


In fact, these compile to very similar things. Scala would use filter where I defined and used guard. I used the guard because a) I think it's clearer and b) It supports my point. :-)

So, what's going on here?

The following is the definition of flatMap in the Iterable[A] class definition (See here).

def flatMap [B](f : (A) => Iterable[B]) : Collection[B]
Applies the given function f to each element of this iterable, then concatenates the results.

So, let's look at the inner example first.

args.flatMap ((arg2 : String) => guard(!arg.equals(arg2), (arg, arg2)))

The anonymous function takes arg2 and returns a List, which is either [(arg, arg2)] or []. flatMap then concatenates these lists together. So this has the effect of simultaneously pairing up the (arg, arg2) values and filtering out all pairs for which the two are equal.

So for each value of arg we have a list of the right (arg, arg2) pairs we want. We now flatMap this over all of args, and get the full list we want.

Easy, right?

The higher order functions approach is much more flexible, but the comprehension syntax is a lot more readable (and concise), especially if you come from an imperative background. How do we make the two meet?

Now let's rewrite these examples in Haskell. First the higher order function one:

import Control.Monad
import System.Environment

main = do{
    args <- getArgs;
    print $ distinctPairs args;
}

distinctPairs :: [String] -> [(String, String)]
distinctPairs args =
    args >>=
        \arg -> 
            args >>=
                \arg2 -> 
                    guard (arg /= arg2) >>
                    return (arg, arg2)

This looks almost identical to the Scala one, once you get over superficial differences in syntax.

In particular we replace the method foo.flatMap(bar) with the operator foo >>= bar. It does exactly the same thing (well, the type signatures are a bit different, but in this instance it does exactly the same thing).
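One way to convince yourself of this is to write the function twice, once with >>= and once with concatMap, which is the closest Prelude analogue of flatMap for lists. This is only a sketch: the two function names are invented, and a plain if expression stands in for guard so that >>= is the only new ingredient:

```haskell
-- | Distinct pairs using the list instance of >>=.
pairsBind :: [String] -> [(String, String)]
pairsBind args =
  args >>= \arg ->
    args >>= \arg2 ->
      if arg /= arg2 then [(arg, arg2)] else []

-- | The same thing using concatMap: apply the function to every
-- element, then concatenate the resulting lists.
pairsConcatMap :: [String] -> [(String, String)]
pairsConcatMap args =
  concatMap
    (\arg -> concatMap
      (\arg2 -> if arg /= arg2 then [(arg, arg2)] else [])
      args)
    args
```

On lists the two should always agree, since xs >>= f behaves like concatMap f xs.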

The 'guard' method is a little different, as we're using Haskell's built in function of that name.

This is basically what it does:

guard :: Bool -> [()]
guard True = [()]
guard False = []

(This is again not really correct. It's a correct definition in this instance, but the real definition is more general).

What >> does is:

(>>) :: [a] -> [b] -> [b]
foo >> bar = foo >>= (\_ -> bar)

You may find this a bit confusing, so I'll unpack the definition with a quick reminder.

\_ -> bar is an anonymous function which takes anything and returns bar. So foo >> bar concatenates together one copy of bar for each element of foo, i.e. it's length foo copies of bar joined together.

In particular guard test >> bar is either [] if test is False or bar if test is True (as guard test has length 0 in the first case and 1 in the second).

return is very simple. return x = [x]

So, guard test >> return x is the same as guard(test, x) in our Scala method.
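Putting the pieces together, here is a sketch of list-only versions of these operations. They are deliberately renamed (listGuard, listBind, listThen, listReturn and guardValue are all made-up names) so that they do not clash with the real, more general ones:

```haskell
-- | List-only guard: one unit value if the test passes, none otherwise.
listGuard :: Bool -> [()]
listGuard True  = [()]
listGuard False = []

-- | List-only >>=: map the function over the list, then concatenate.
listBind :: [a] -> (a -> [b]) -> [b]
listBind xs f = concat (map f xs)

-- | List-only >>: one copy of the second list per element of the first.
listThen :: [a] -> [b] -> [b]
listThen xs ys = xs `listBind` \_ -> ys

-- | List-only return: wrap a value in a one-element list.
listReturn :: a -> [a]
listReturn x = [x]

-- | The Scala guard(test, x), rebuilt from the pieces above.
guardValue :: Bool -> a -> [a]
guardValue test x = listGuard test `listThen` listReturn x
```

For instance, listThen [1, 2, 3] "ab" is "ababab", and guardValue False 'x' is the empty list.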

Still with me?

Now, how do we write this so it looks like a Scala comprehension?

import Control.Monad
import System.Environment

main :: IO ()
main = do{
    args <- getArgs;
    print $ distinctPairs args;
}

distinctPairs :: [String] -> [(String, String)]
distinctPairs args = do{
    arg  <- args;
    arg2 <- args;
    guard(arg /= arg2);
    return (arg, arg2)
}

Looks almost exactly like the Scala version, doesn't it?

At this point you might feel cheated if you've not seen 'do' notation before. "So... the point of this article is that Scala has these cool things called sequence comprehensions, and Haskell has them too? Who cares??".

Now, look up there a little bit. For convenience, I'll repeat it here:

main :: IO ()
main = do{
    args <- getArgs;
    print $ distinctPairs args;
}

What's that got to do with sequence comprehensions?

Well, nothing.

It turns out that this set of operations '>>=', '>>' and 'return' is so useful that Haskell has bundled them into their own type class, called Monad. So these apply to any type in this type class, including both List and IO, as well as many others. You can then apply do notation to any monad, and it just gets converted into a use of these operations in more or less the same way that we went from the list comprehension to the higher order functions. It works like this:

do { foo } = foo, for foo a value in the monad.
do { foo; bar; baz } = foo >> do { bar; baz }
do { myFoo <- foo; bar; baz } = foo >>= \myFoo -> do { bar; baz }

(note that the last one puts myFoo in scope for bar and baz, so we can use it in their definitions exactly like we'd expect).
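As a sketch of these rules in action, here is the pairs function in do notation next to a hand-translated version, plus a tiny Maybe example to show that nothing about the translation is specific to lists. The names pairsDo, pairsDesugared and safeDivide are invented for illustration:

```haskell
import Control.Monad (guard)

-- | Written with do notation.
pairsDo :: [String] -> [(String, String)]
pairsDo args = do
  arg  <- args
  arg2 <- args
  guard (arg /= arg2)
  return (arg, arg2)

-- | The same function after applying the three rules by hand.
pairsDesugared :: [String] -> [(String, String)]
pairsDesugared args =
  args >>= \arg ->
    args >>= \arg2 ->
      guard (arg /= arg2) >>
        return (arg, arg2)

-- | The same notation in a different monad: Maybe.
-- guard produces Nothing when the divisor is zero.
safeDivide :: Int -> Int -> Maybe Int
safeDivide x y = do
  guard (y /= 0)
  return (x `div` y)
```

Here safeDivide 10 2 is Just 5 and safeDivide 1 0 is Nothing, with guard playing the same role it played for lists.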

Why bother giving this special treatment to monads? Well, it's the same reason as the foreach loop was introduced - they crop up *everywhere*. It turns out that (for reasons I won't go into here) you can realise the most outrageous range of programming idioms as instances of Monad. But doing so gives you somewhat clunky syntax, so the do notation exists to make that nicer. That's all it is.

So, what's a monad? Nothing special. It's a type class which gets some preferential treatment from the language because of its ubiquity. It contains some standard operations which map quite well onto common forms of control flow, so it tends to crop up quite a lot. That's all.
