Tag Archives: USD

Living on the edge of academia

I am not an academic.

In fact, as far as computer science goes I have never been an academic. My background is in mathematics.

Nevertheless, I try to read a lot of papers. I’m interested (albeit currently only in an armchair way) in programming language implementation and theory. I occasionally feel the need to implement a cool data structure for fun or profit. And, most importantly, I need to for my job.

I work on SONAR. As such a great deal of what I do is natural language processing and data mining. It’s a fairly heavily studied field in academia and it would be madness for me to not draw on that fact. Eventually I’d like to feed back into it too – we do a bunch of interesting things in SONAR and I’m reasonably confident that we can isolate enough of it into things we can share without giving the entire game away (we need to sell the product after all, so giving away the entire workings of it is non ideal).

I’ve got to say. It’s a pretty frustrating process.

Part of the problem is what I’ve come to refer to as the “Academic firewall of doom”. Everything lives behind paid subscriptions. Moreover, it lives behind exhorbitantly priced paid subscriptions. Even worse: None of the ludicrous quantities of money I’d be paying for these subscriptions if I choose to pay for them would go to the authors of the papers. The problems of academic publishing are well known. So assuming I don’t want to pay the cripplingly high fees for multiple journals, I have to either rely on the kindness of authors in publishing their papers on their own websites (a practice I’m not certain of the legal status of but am exceedingly grateful for), or if they have not done so pay close to $100 for a copy of the paper. Before I know if the paper actually contains the information I need.

Another unrelated problem which is currently being discussed on reddit in response to an older article by Bill Mill is that very few of the papers I depend on for ideas come with reference implementations of the algorithms they describe. I understand the reasons for this. But nevertheless, it’s very problematic for me. The thing is: Most of these papers don’t work nearly as universally as one might like to believe. They frequently contain algorithms which have been… let’s say, empirically derived. Take for example the punkt algorithm which I’m currently implementing (which has an implementation in NLTK, which at least gives me confidence that it works. I’m trying not to look at it too much though due to its GPLed nature). It starts out with a completely legitimate and theoretically justified statistic to do with colocations. That’s fine. Except it turns out that it doesn’t work that well. So they introduce a bunch of additional weightings to fuzz the numbers into something that does work and then through experimentation arrive at a test statistic which does a pretty good job (or so they claim).

That’s fine. It’s how I often work too. I’m certainly not judging them for that. But it does mean that I’m potentially in for a lot of work to reproduce the results of the paper (hopefully only a couple of days of work. But that’s a lot compared to firing up the reference implementation) for very uncertain benefits. And additionally it means that if it doesn’t work I don’t know whose fault it is – mine or the algorithm’s. So then if it doesn’t work I have to make a decision whether to spend time debugging or to cut my losses and run.

It should also be noted that negative results are, for whatever reason, the overwhelming majority. I’ve written enough similar things that actually work (at least some of which are based on papers) that I choose to believe that at least not all of these are due to my incompetence.

So, to recap, I’m in a situation where in order to find out if any given paper is useful to me I need to a) Spend near to $100 if I’m unlucky and b) In the event that the paper actually does contain what I want, take several days of work to implement its results and c) In the common case get negative results and even then not be certain about the validity of them so potentially spend more time debugging.

So the common case is designed in such a way as to cost me a significant amount of money and effort and is unlikely to yield positive results. You can understand why I might find this setup a bit problematic.

This entry was posted in programming, Science and tagged , on by .

Playing with Arrows

Apologies in advance for the quality of the html blogger seems to be generating. I probably need to find a better solution (like using lhs and attaching pdfs or something).

In the recent discussion about run length encoding, one thing people focused on was the use of the &&& operator and how noone understands arrows so &&& is scary and arcane.

I feel perfectly placed to comment on this, because I don’t understand arrows either. :-) I cheerfully abuse the arrow operators for their function instance without really worrying about that, and it seems to work fine.

So, this is just a rundown of what the arrow operations do for the function instance, in order to make them seem less scary. I might follow up with looking at Kleisli arrows later (which are arrows that arise from monads in a natural way).

The arrow class is defined as follows:

class Arrow a where
arr :: (b -> c) -> a b c
pure :: (b -> c) -> a b c
(>>>) :: a b c -> a c d -> a b d
first :: a b c -> a (b, d) (c, d)
second :: a b c -> a (d, b) (d, c)
(***) :: a b c -> a b' c' -> a (b, b') (c, c')
(&&&) :: a b c -> a b c' -> a b (c, c')

The first two are totally uninteresting for functions (they’re just the identity). So if we cut those out and specialise the signatures to functions we get the following:

(>>>) :: (b -> c) -> (c -> d) -> (b -> d)
first :: (b -> c) -> (b, d) -> (c, d)
second :: (b -> c) -> (d, b) -> (d, c)
(***) :: (b -> c) -> (b' -> c') -> (b, b') -> (c, c')
(&&&) :: (b -> c) -> (b -> c') -> b -> (c, c')

i.e. these are basically all operators for manipulating combinations of pairs and functions. Most of these you can probably infer the purpose of from their types and names, but I’ll go through them anyway. We’ve already encountered &&&, and it’s the one I use the most, so I’ll go through these in reverse order.

The &&& operator is in fact very simple, but it’s the arrow operator I use the most. Consider the following:

> (+1) &&& (+2) $ 3
(4, 5)

i.e. f &&& g when applied to x just evaluates both and returns them as a pairs of both results.

We could define this as:

(&&&) :: (a -> b) -> (a -> c) -> (a -> (b, c))
(f &&& g) x = (f x, g x)

For example, in the run length encoding article we had the function head &&& length:

> head &&& length $ [1, 3]
(1,2)

In a similar usage to the RLE article, I’ve often found the following sort of trick useful:

map (head &&& length) . group . sort

This counts frequencies of elements in a list.

> map (head &&& length) . group . sort $ ["foo", "bar", "foo", "baz"]
[("bar",1),("baz",1),("foo",2)]

Now ***.

> (+1) *** (+2) $ (3, 4)
(4, 6)

i.e. f *** g applied to (x, y) returns (f x, g y). Or, in a more readable form:

(***) :: (b -> c) -> (b' -> c') -> (b, b') -> (c, c')
(f *** g) (x, y) = (f x, g y)

I don’t really have an obvious use case for this one. I’m sure they exist though.

I’m going to pick up the speed now, as I’m getting bored and so you probably are too. :-)

(>>>) :: (b -> c) -> (c -> d) -> (b -> d)
(f >>> g) x = g (f x)

Or in other words, >>> is just reverse function composition. So we could have written this as

(>>>) :: (b -> c) -> (c -> d) -> (b -> d)
(f >>> g) x = g . f

Or

(>>>) :: (b -> c) -> (c -> d) -> (b -> d)
(>>>) = flip (.)

Now for first and second:

first :: (b -> c) -> (b, d) -> (c, d)
first f (x, y) = (f x, y)

second :: (b -> c) -> (d, b) -> (d, c)
second f (x, y) = (x, f y)

i.e. These just take a function and apply it to the first or second entry of a tuple, leaving the other unchanged.

Those are all the core arrow operations. There are a few derived operations, but the only one which is at all interesting for functions is <<<, which is just function composition. Another Arrow related class which functions are an instance of is ArrowChoice. ArrowChoice does for Either what Arrow does for pairs (blah blah, category theory, blah, (,) and Either are dual, blah). Here's the instance declaration:

class Arrow a => ArrowChoice a where
left :: a b c -> a (Either b d) (Either c d)
right :: a b c -> a (Either d b) (Either d c)
(+++) :: a b c -> a b’ c’ -> a (Either b b’) (Either c c’)
(|||) :: a b d -> a c d -> a (Either b c) d

Specialised to functions:

left :: (b -> c) -> (Either b d) -> (Either c d)
right :: (b -> c) -> (Either d b) -> (Either d c)
(+++) :: (b -> c) -> (b' -> c') -> (Either b b') -> (Either c c')
(|||) :: (b -> d) -> (c -> d) -> (Either b c) -> d

Lets jump straight to writing out some definitions:

left :: (b -> c) -> (Either b d) -> (Either c d)
left f (Left x) = Left $ f x
left _ (Right y) = Right y

right :: (b -> c) -> (Either d b) -> (Either d c)
right f (Right x) = Right $ f x
right _ (Left y) = Left y

(+++) :: (b -> c) -> (b' -> c') -> (Either b b') -> (Either c c')
(f +++ g) (Left x)  = Left $ f x
(f +++ g) (Right x) = Right $ g x

(|||) :: (b -> d) -> (c -> d) -> (Either b c) -> d
(f ||| g) (Left x)  = f x
(f ||| g) (Right x) = g x

i.e. left f applies f to the left option of an Either and leaves the right option alone.

> left (+1) $ Left 2
Left 3

> left (+1) $ Right 2
Right 2

right does the reverse.

f +++ g applies f to the left option and g to the right:

> (+1) +++ (+2) $ Left 2
Left 3

> (+1) +++ (+2) $ Right 2
Right 4

So we could have implemented left and right (and indeed, this is how they’re actually implemented in the real instance declaration) as follows:

left :: (b -> c) -> (Either b d) -> (Either c d)
left f = f +++ id

right :: (b -> c) -> (Either d b) -> (Either d c)
right f = id +++ f 

Or even as (+++id) and (id+++).

Finally, (f ||| g) x is basically just a combinator for case matching, applying f if we have a Left value, g if we have a Right. It’s just the standard ‘either’ function but it can be nice to have it available as an operator.

> (+1) ||| (+2) $ Left 2
3

> (+1) ||| (+2) $ Right 2
4

So, there you have it. Arrow operations. Not too scary, and now you have a bunch of new combinators to play with. I doubt it will revolutionise your Haskell code, but every now and then they allow for a really neat solution you wouldn’t otherwise have thought of. Have fun.

This entry was posted in programming and tagged , , on by .