Category Archives: programming

So I accidentally wrote a monad tutorial

Put this up in a gist the other day. Recording it here for posterity. It’s probably not actually very helpful for understanding monads:

> module MonadTutorial where
>import Prelude hiding (negate)

(We hide the Prelude’s negate because we’ll be defining our own version of it later.)

Oh god it’s a monad tutorial.

(Why am I even writing this? It’s not like I use Haskell much at all these days)

I’m going to start by not talking about monads at all. Instead I’m going to give you a quick introduction to how to read Haskell and how Haskell type classes work. If you already understand that bit, feel free to skim until you start to
feel confused.

First off: This is literate Haskell. Every line starting with > is code. Everything else is plain text. You can run it through a Haskell compiler if you want.

Now. Here’s a type signature:

>blah :: Int -> Int -> Int

It says “There is a function called blah. It takes two Ints and returns an Int”.

Well, almost. What it actually says is “There is a function called blah. It takes an Int and returns a function which takes an Int and returns an Int”. Haskell tends to adopt a convention of using partial application for multiple
arguments. You mostly won’t need to worry about the difference here.

Here’s a definition for our function blah:

>blah x y = x + y

(not very exciting)
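To make the partial application point concrete, here’s a small sketch (addTwo is my name, not part of the tutorial):

```haskell
blah :: Int -> Int -> Int
blah x y = x + y

-- Applying blah to just one argument yields a new function Int -> Int:
addTwo :: Int -> Int
addTwo = blah 2
-- addTwo 5 == 7, exactly as blah 2 5 == 7
```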

Type signatures (and data types) can also be polymorphic, that is they can work for a range of input types:

>reverse' :: [a] -> [a]
>reverse' [] = []
>reverse' (x : xs) = reverse' xs ++ [x]

(this is a stupid implementation. Don’t worry about that)

In Haskell type signatures, any lower case identifier is in fact a type variable, which may be bound to any type. So you can have reverse' applied to [Int] and it will produce an [Int], or [String] and it will produce a [String], etc.

The second thing to note in this code is pattern matching. If you don’t understand it, don’t worry about it too much, but this is basically how we define a function over lists.

So, we have polymorphic functions, which is all very well, but really the only thing you can do with a thing of an unknown type is pass it around. Sometimes you want to be able to say “I have a type and I don’t know what it is, but I can do these things to it”. That’s where type classes come in.

A type class is a way of writing an interface that a type must satisfy. Individual types can then be instances of this type class (note: Instances are types, not values) by implementing that interface.

Another important thing to note is that this is the only way Haskell supports overloading. There is no ad hoc overloading based on type or number of arguments – it’s type classes or bust.

Here’s a very simple type class:

>class Zeroed a where
> zero :: a

>instance Zeroed Int where
> zero = 0

This basically says “In order to be Zeroed a type ‘a’ must have a distinguished element called zero”. We’ve then made Int implement this in the obvious way.

There’s not much interesting we can do with this type class, but here’s an example using it:

>andZero :: (Zeroed a) => a -> (a, a)
>andZero x = (x, zero)

This is a polymorphic function where we’ve constrained the type variable a to implement the Zeroed class. All it does is return a pair of elements, the first being the element passed in, the second being the zero of the appropriate type.

One thing to note is that we’ve overloaded on the type of zero here. Because we’ve specified that the function returns an (a, a), it knows that the zero must belong to the same type as the argument, so we get the right value of zero as defined in the type class.

We could also define a Zeroed implementation for lists. e.g.

>instance Zeroed [a] where
> zero = []

So we would have andZero 10 == (10, 0) and andZero "kittens" == ("kittens", "") (a string in Haskell is just a list of characters).

You can also extend type classes, and you can define functions in them as well as values:

>class (Zeroed a) => Group a where
> add :: a -> a -> a
> negate :: a -> a

>instance Group Int where
> add x y = x + y
> negate x = -x

So a Group (don’t worry about why it’s called that) is something with a zero in which you can add pairs of elements and negate individual elements.

It’s common to require implementations to satisfy additional invariants which we don’t enforce in code. These are just documentation things which you have to
prove and/or test for yourself, but your implementation is “wrong” if you don’t satisfy them.

Example rules for group:

add x (negate x) = zero
add x (add y z) = add (add x y) z

It’s simply convention that these must hold, but we will often assume it for writing functions that depend on the class.

For example we could define the following generic function depending on the class:

>sum' :: (Group a) => [a] -> a
>sum' [] = zero
>sum' (x : xs) = add x (sum' xs)

(Yes, I know about fold. I don’t care to explain it here)

And somewhere later we might find ourselves wanting to assume that sum' (x ++ y) == add (sum' x) (sum' y). We can do that because of the second of the rules we’ve imposed. (If this isn’t obvious, don’t worry about why).
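That splitting-a-sum assumption can be checked on a small, self-contained sketch (the definitions mirror the tutorial’s; check is my name):

```haskell
import Prelude hiding (negate)

class Zeroed a where
  zero :: a

class (Zeroed a) => Group a where
  add :: a -> a -> a
  negate :: a -> a

instance Zeroed Int where
  zero = 0

instance Group Int where
  add x y = x + y
  negate x = 0 - x

sum' :: (Group a) => [a] -> a
sum' [] = zero
sum' (x : xs) = add x (sum' xs)

-- Associativity is what lets us split a sum at any point:
check :: Bool
check = sum' (xs ++ ys) == add (sum' xs) (sum' ys)
  where
    xs = [1, 2] :: [Int]
    ys = [3]
-- check == True
```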

There are two more things we need to talk about before we get on to monads.

Types in Haskell can depend on other types. For example you have lists. Lists of integers are one type ([Int]), lists of strings are another type ([String]. Which is [[Char]], but you don’t need to know that). List is a type constructor rather than a type – you feed it types, you get new types out.

You can have type classes for type constructors too.

>class Stacky s where
> push :: a -> s a -> s a
> pop :: s a -> (Maybe a, s a)

>instance Stacky [] where
> push x s = x : s
> pop [] = (Nothing, [])
> pop (y:ys) = (Just y, ys)

So a type constructor s is stacky if you can push an a to an s a and pop an a off an s a. So lists are stacky, but so are vectors, and so might other types be.
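For instance (the example is mine, using the class above), pushing and then popping returns the most recently pushed element:

```haskell
class Stacky s where
  push :: a -> s a -> s a
  pop :: s a -> (Maybe a, s a)

instance Stacky [] where
  push x s = x : s
  pop [] = (Nothing, [])
  pop (y : ys) = (Just y, ys)

-- pop undoes the most recent push:
example :: (Maybe Int, [Int])
example = pop (push 1 (push 2 []))
-- example == (Just 1, [2])
```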

One final thing. This should be obvious, but just as a reminder… Functions are values in Haskell. So you can include higher order functions in type classes.

Here’s an example:

>class Mappable s where
> map' :: (a -> b) -> s a -> s b

(Note: In Haskell this is normally called Functor and map' is normally called fmap. Don’t worry about that for now)

What does this mean? It means that given a function which takes things of one type, you can lift it to some “container” type constructor. e.g.

>instance Mappable [] where
> map' f [] = []
> map' f (x : xs) = f x : map' f xs

But there are also other examples where you can do this. The following one will require you to stop and think for a moment, so don’t worry if it’s not obvious:

>instance Mappable ((->) input) where
> map' f g = \x -> f (g x)

What on earth is going on here?

The slightly obtuse type constructor notation might confuse you. (->) is a type constructor that produces the function type. It takes two arguments. What we’re doing is partially applying the constructor to some fixed type “input”. This gives us a type constructor which takes one argument – call it output – and produces functions input -> output.

We’re effectively treating functions as containers containing their output values. So in this case “mapping” is just composition – you map the values by applying another function to them. If this still doesn’t make sense, imagine how this works if you treat a vector as a function from its indices to its elements.

Do try to understand this example. It will help later.
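Here’s that instance as a small runnable sketch (the name doubleThenIncrement is mine):

```haskell
class Mappable s where
  map' :: (a -> b) -> s a -> s b

instance Mappable ((->) input) where
  map' f g = \x -> f (g x)

-- Mapping over a function is just composition: first (* 2), then (+ 1).
doubleThenIncrement :: Int -> Int
doubleThenIncrement = map' (+ 1) (* 2)
-- doubleThenIncrement 3 == 7
```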

Ok. Take a deep breath, maybe go get a cup of tea. Basically, relax. I’m about to give you a type class definition for monads, and you need to believe that it’s not going to scare you that much.

>class (Mappable m) => Monad' m where
> unit :: a -> m a
> join :: m (m a) -> m a

Yep, really. That’s it. It isn’t quite the one you may have seen, but it really is equivalent.

So every Monad is something Mappable. We can construct a monad over some type from values of that type. And we’ve got that slightly mysterious operation join. What’s it do?

Well, at the moment we’ve not constrained anything about how it behaves, so it could do whatever you want!

But let’s pick an example instance that might be helpful.

>instance Monad' [] where
> unit x = [x]
> join = concat

So we build lists from elements by just taking the singleton list, and we join a list of lists to a list by concatenating it.
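As a self-contained sketch of those two operations (definitions repeated from above so it stands alone):

```haskell
class Mappable m where
  map' :: (a -> b) -> m a -> m b

class (Mappable m) => Monad' m where
  unit :: a -> m a
  join :: m (m a) -> m a

instance Mappable [] where
  map' = map

instance Monad' [] where
  unit x = [x]
  join = concat

-- Building a monadic value from an element:
single :: [Int]
single = unit 5
-- single == [5]

-- Collapsing a nested monadic value:
flattened :: [Int]
flattened = join [[1, 2], [3]]
-- flattened == [1, 2, 3]
```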

What are some reasonable ways for this to behave?

Well, here’s one:

join (unit x) == x

This is obvious with lists: concat [x] == x

But it’s also reasonable in the sense that it means that our monad building and collapsing rules are somehow inverses.

I’m not going to go into the other monad axioms right now (mainly because I can’t translate them off the top of my head to this notation, but I think they do all have sensible translations). It’s not that important to understand them just now.

What I am going to go into is how this relates to IO.

You remember how we constructed the Mappable instance for ((->) input)? We can do something pretty similar to construct a Monad' instance for it.

>instance Monad' ((->) key) where
> unit x = \y -> x
> join f t = (f t) t

Ok. I’ll admit it. The join implementation looks pretty obtuse. What are we doing there?

What we’re actually doing is the most obvious thing that could possibly work.

We’ve got our value f which is an m (m b). That is, it is a key -> key -> b. Given an instance of key, which we’ve called t in the implementation, we want to produce a b. We have f, and all we can do is apply it to things of type key. So we do that, and we get a key -> b. All we can do with that is apply it to things of type key and, oh look, we have one right there. So we do that, we have a b, and the job is done.
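A concrete use of that instance (my example): (+) on Int has type Int -> Int -> Int, which is an m (m Int) with m = ((->) Int), so join collapses it by feeding the argument in twice.

```haskell
class Mappable s where
  map' :: (a -> b) -> s a -> s b

class (Mappable m) => Monad' m where
  unit :: a -> m a
  join :: m (m a) -> m a

instance Mappable ((->) key) where
  map' f g = \x -> f (g x)

instance Monad' ((->) key) where
  unit x = \y -> x
  join f t = (f t) t

-- join f t = (f t) t, so join (+) applies its argument to both slots:
example :: Int
example = join (+) 3
-- example == 6, because join (+) 3 = 3 + 3
```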

Ok. So there’s a variety of monads.

Now, we’ll consider the IO monad. How is an IO like a list I hear you ask?

Well, it’s not very like a list. But it is quite a lot like ((->) ()).

“What on earth is that type?”, you ask? It’s the type constructor for functions which take an argument of type () (a type with only one value, also written (). I’m not sure what it’s called in Haskell, but I’ve seen it called Unit elsewhere).

So we’re talking about functions () -> a, for a variety of types a. In Haskell this isn’t very interesting, because all the functions are pure and all the values are lazy, so really there’s very little difference between a value of type a and a value of type () -> a.

But imagine we had some magic special annotation which said “And this function is allowed to perform side effects”. That is essentially what an IO a is. It is
a block of code that if run would maybe produce side effects and would then emit
a value a at the end.

We can’t define this type in Haskell, but if we could it would look something like this:

>data IO' a

>instance Mappable IO' where
> map' f x = undefined

Run x with its side effects, then apply f to the result

>instance Monad' IO' where
> -- Just return the value x without performing any side effects:
> unit x = undefined
> -- Run f. Get a new side-effecting block which returns an a.
> -- Run that and return the result:
> join f = undefined

When you are writing a Haskell program which does IO, has side effects, etc you are not “really” writing a program which has effects. Instead you are writing a purely functional program which evaluates to a result that when run by the Haskell runtime produces some effects. That value has type IO. (Yes, this sounds suspect to me too, but let’s go with it anyway). Those values thus obey the rules for a monad (I haven’t told you what they are yet, but lets forget that for now).

Ok. So I’ve talked you through some examples of a monad, and the operations it obeys. Why do we care again?

Partly it’s because we can write generic functions which take any monad and do useful things with it (this does happen, but I’m somewhat skeptical of how big a deal it is). Mostly though it’s because monads capture an important notion, that of sequencing.

Suppose we want to capture the notion of “a side-effecting function u -> v”. You would write this as u -> IO v. So it’s a function from a u which produces a side-effecting computation that would produce a v. Suppose we can also produce a u with side effects. How would we produce a v from these two things?

In fact. Let’s make this concrete. Suppose we had:

>getsLine' :: IO' String
>getsLine' = undefined

>putsLine' :: String -> IO' ()
>putsLine' = undefined

i.e. a function for reading a line from STDIN and a function for writing a line to STDOUT.

How do we combine these to write

>catLine :: IO' ()

which reads one line from STDIN and writes it to STDOUT?

Really the types tell us the only thing we could possibly do. We start with an IO' String. We then map putsLine' across it to get an IO' (IO' ()). We then join to get an IO' ().

>catLine = join (map' putsLine' getsLine')

And because the types basically forced our hand here, this is good evidence of a more general sequencing operator:

>bind :: (Monad' m) => m a -> (a -> m b) -> m b
>bind m f = join (map' f m)

bind is more commonly introduced as one of the primitive monad operators, and it’s usually written >>= for improved readability (straight face).

In our IO interpretation what this means is “Run m. Get a value. Feed that value into f. Run the result. Return the value from that”

What does bind look like in our list interpretation?
Well, first of all we’re mapping each element to get a list. Then we’re concatenating it. This is sometimes called flatMap.

For example, bind [1, 2, 3] (\x -> [x, x+1]) == [1,2,2,3,3,4]
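Putting the pieces together as a runnable sketch (definitions repeated from above so it stands alone):

```haskell
class Mappable m where
  map' :: (a -> b) -> m a -> m b

class (Mappable m) => Monad' m where
  unit :: a -> m a
  join :: m (m a) -> m a

instance Mappable [] where
  map' = map

instance Monad' [] where
  unit x = [x]
  join = concat

-- bind, derived from map' and join as in the text:
bind :: (Monad' m) => m a -> (a -> m b) -> m b
bind m f = join (map' f m)

-- The flatMap behaviour on lists:
example :: [Int]
example = bind [1, 2, 3] (\x -> [x, x + 1])
-- example == [1, 2, 2, 3, 3, 4]
```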

These behaviours don’t look like they have much in common. I have an intuitive way of looking at it that shows why they’re the same thing, but do bear in mind that it’s only a hand wavy intuition. This isn’t what a monad *is*. What a monad is is a set of operations and axioms they must obey. This is just a hand-wavy way of looking at it that might help:

Monads are things that may emit some number of values (which may be 0) when you “do something” to them. What that “do something” means is entirely dependent on the specific monad. In IO, it’s running the block of code. In a list it’s iteration. Further, they have enough structure to give you ways of sequencing monads together to create new monads of the same type.

So in this interpretation, the operations mean:

unit: Just emit this value.

map: Take the emissions from this monad, feed it through this function, emit the result of that.

bind: Take the emissions from this monad, each of those are themselves monads, take the emissions from those.

If you now think through the list and IO examples, hopefully you’ll see that these informal definitions line up with what we’re doing there.

There’s more stuff I could write about this, but I’m not going to. Instead I’d recommend digesting this, making sure you’ve understood it, and then head over to the wiki entry on the actual monad laws for your next dose of axiomatic monads and how this relates to the do notation you’ve probably seen.

This entry was posted in programming.

Saturday night madness

This weekend seems to be my last weekend without a party this year, so I’ve been taking advantage of it to write code.

(21:49:32) David: I decided to write C instead of being social. It’s almost the same thing
(22:01:05) m.: haha
(22:01:12) m.: you mean eye-gougingly awful?
(22:04:20) David: I *like* writing C.
(22:04:34) m.: sick cunt
(22:05:17) David: It’s one of those things where it’s nice as a break from high level ruby bullshit
(22:06:53) David: Ruby: “Yes I know this looks like you’re setting a field but actually you’re invoking method missing which mixes in a module that accesses a database so it’s really doing a network round trip except when you’re using the caching layer of course”
C: “Yes I know this looks like you’re setting a field but YOU’RE NOT BECAUSE YOU’RE A MORON WHO DIDN’T INITIALIZE YOUR MEMORY AND OH LOOK A SEGFAULT ARE YOU HAPPY NOW MOTHERFUCKER?”
(22:07:01) David: A change is as good as a rest and all that

Behold my rock and roll lifestyle.

This entry was posted in life, programming.

Sieving out prime factorizations

For a project Euler problem I had an algorithm the first step of which was to find the prime factorization of the number you fed it. This was by far the slowest part of the operation (mostly because the rest of it was pretty heavily cached to a much smaller number of solutions).

I was running this function on the numbers from 1 to 10^8, and it occurred to me that there must be a simpler way to precompute the prime factorizations of all of these without doing the full factorization for each. After some thought I came up with the following solution. It’s in some sense the most obvious thing you can possibly do (at least given a basic knowledge of some of the algorithms around this), but I’d not seen it before and knowing it exists might save you some time in similar scenarios, so I thought I’d post it here.

It’s essentially a modified Sieve of Eratosthenes. The difference is what you store in the sieve and what you do in the crossing off operation. There’s also an additional step.

The idea is this: We maintain a list of prime factors found so far for each number.

We then loop over the numbers in order from 2 upwards. If when we reach a number its list of prime factors is empty then it’s a prime, and we start marking off as follows:

Call the current number p. First we mark off all multiples of p and add p to their list. Then we mark off all multiples of p^2 and add p again. Then p^3, etc. until p^k > n.

Then we’re done, and move on to the next number.

That’s really all there is to it.

Here’s some python code that implements it:

def prime_factorizations(n):
  sieve = [[] for x in xrange(0, n)]
  for i in xrange(2, n):
    if not sieve[i]:
      q = i
      while q < n:
        for r in xrange(q, n, q):
          sieve[r].append(i)
        q *= i
  return sieve

Here it is in action:

>>> prime_factorizations(15)
[[], [], [2], [3], [2, 2], [5], [2, 3], [7], [2, 2, 2], [3, 3], [2, 5], [11], [2, 2, 3], [13], [2, 7]]

Edit: In the comments, Rich points out that you can make this more space efficient by instead storing a single prime for each integer: Specifically, the smallest prime that divides it. You can then reconstruct the desired list by iterated division – if the integer stored is equal to the current index, it’s a prime. If not, it contains one of its prime factors. Then divide by that factor and recurse. This might be a bit slower but is a lot more space efficient. Here’s some more python demonstrating this:

def prime_factorizations(n):
  sieve = [0 for x in xrange(0, n)]
  for i in xrange(2, n):
    if not sieve[i]:
      sieve[i] = i
      for r in xrange(2 * i, n, i):
        sieve[r] = sieve[r] or i
  return sieve
 
def factor_list_from_sieve(sieve, n):
  results = []
  while n > 1:
    p = sieve[n]
    results.append(p)
    if p == n: break
    n = n / p
  return results

The first method returns the described array of primes, the second reconstructs the prime factor list from it.

Example:

>>> sieve = prime_factorizations(1000)
>>> factor_list_from_sieve(sieve, 992)
[2, 2, 2, 2, 2, 31]
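As a sanity check (my addition, ported to Python 3, where xrange is spelled range and integer division is //), the factor list should always multiply back to the original number:

```python
def prime_factorizations(n):
    # Store the smallest prime factor of each index (0 means "not marked yet").
    sieve = [0 for x in range(0, n)]
    for i in range(2, n):
        if not sieve[i]:
            sieve[i] = i
            for r in range(2 * i, n, i):
                sieve[r] = sieve[r] or i
    return sieve

def factor_list_from_sieve(sieve, n):
    # Repeatedly divide out the smallest prime factor until nothing is left.
    results = []
    while n > 1:
        p = sieve[n]
        results.append(p)
        n = n // p
    return results

sieve = prime_factorizations(1000)
for m in range(2, 1000):
    product = 1
    for p in factor_list_from_sieve(sieve, m):
        product *= p
    assert product == m
print("all factorizations check out")
```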
This entry was posted in Numbers are hard, programming.

Randomly exploring github

I made a thing!

If you go to githubexplorer.com you will get a little toy for visiting random Github repos.

It doesn’t just select them at random. It picks a user, then picks a random repo they like. As a result, popular and interesting repos should occur more often than uninteresting ones.

Right now it’s very basic. I have some additional plans for using it to test bed some ideas I have for UIs for randomly exploring things.

(Also right now it’s a little crashy. Having a problem with the way it’s deployed which I’ve yet to get to the bottom of. Just refresh the page and it should be fine)

This entry was posted in programming.

Against Human Readability, Part 2 of Many: The toolchain argument

One of the biggest promoted arguments for using text based formats (which is not the same as human readable of course, but is a prerequisite) is that you can use your existing unix toolchain on them.

I am skeptical.

You can of course use your text editor on them. Rather by definition. But being able to use the other tools limits you to very specific sorts of text formats – in particular ones which are quite heavily line oriented. The thing is, the unix tool chain is actually a very specific tool. It works very well for things which are Unix shaped and quite terribly for anything else. You thus end up needing tools for converting to and from unix shaped formats if you want to be able to use your tool chain on your data. Witness jsonpipe as an example of this sort of shenanigans. And if you’re going to be converting it into and out of a different textual format, why do you need your source format to be text based at all?

This still leaves you with your text editor.

I think if you’re going to be designing a binary format, you really need some sort of pretty printer for it. In our discussion in #againstreadability, Andy and I had the following conversation:

21:19 < andyjpb> if you have a pretty printer you have to do most of the hardwork anyway ;-)
21:19 <@DRMacIver> Not so! 
21:19 <@DRMacIver> Generators are much easier than parsers
21:19 <@DRMacIver> Because you don't have to worry about the ambiguity introduced 
21:19 < andyjpb> well, I suppose it depends on whether you have a reader or not.. but once you've got a printer, a reader is not so far away
21:19 <@DRMacIver> Yeah, but don't do that :)
21:20 < andyjpb> whynot? modifying bits in a blob is a solid usecase

I think I’ve changed my mind. If you really find you need to be editing the format directly in a text editor, it’s not completely unreasonable to have a reader as well as a pretty printer. This does require you to have a semi-defined textual format in parallel to your binary one, but because it’s not a primary method of interchange the requirements and costs of it are lower (in particular it doesn’t have nearly as serious performance and security concerns).

But I think this is the wrong way to do it. I realised on further thought about this that the tool chain argument misses out something really quite crucial. It’s not surprising – I think it goes back to before this was actually a valid point – but once I realised it I was kicking myself for how blindingly obvious it is.

UNIX is not your only toolchain

There are programming languages other than shell, many of them with very good interactive shells. As soon as you’ve got bindings to python, or ruby, or erlang, or even freaking javascript, you have an interactive environment and a rich set of libraries with which to operate on your data format.

By manipulating your data format in these languages instead of in shell, you gain a much finer degree of control and a greater degree of expression than you can manage in the primitive language of shell. Further, you do it without any requirements on representation – the bits that are human readable you can manipulate easily as strings. The bits that aren’t you can either ignore or process using whatever libraries for operating on that binary data your language offers you.

This entry was posted in programming.