Archive for May, 2009

Many eyes make heavy work

Thursday, May 28th, 2009

We were talking in the office the other day about a fun little project for Twitter: basically just looking at which pairs of hash tags get used together. After getting one and a half hours’ sleep last night, then waking up and being unable to get back to sleep, I had some time on my hands, so I thought I’d throw it together.

Getting and munging the data into a form that gave tweet similarity was easy enough. But what to do with it then? The obvious thing to do is to visualise the resulting graph.
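As a rough illustration of the munging step, here is a minimal sketch of counting which hash tags appear together in the same tweet. It is not the code actually used for this project: the function name `hashtag_pairs`, the regular expression and the sample tweets are all made up for the example, and it assumes the tweets are already available as plain strings.

```python
import re
from collections import Counter
from itertools import combinations

# Assumption: a hash tag is "#" followed by word characters.
HASHTAG = re.compile(r"#(\w+)")

def hashtag_pairs(tweets):
    """Count how often each unordered pair of hash tags shares a tweet."""
    counts = Counter()
    for text in tweets:
        # Lower-case and de-duplicate so "#Scala #scala" is one tag, not a pair.
        tags = sorted({tag.lower() for tag in HASHTAG.findall(text)})
        # Every unordered pair of distinct tags in a tweet is one co-occurrence.
        counts.update(combinations(tags, 2))
    return counts

# Invented sample data, purely for illustration.
sample_tweets = [
    "Trying out #scala with #lift today",
    "More #Scala and #lift hacking",
    "#scala vs #ruby, round 37",
    "no tags here",
]
pairs = hashtag_pairs(sample_tweets)
```

Each key in `pairs` is an edge of the graph and its count is a natural edge weight, which is roughly the shape of data a network visualisation wants.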

We have our own visualisation software at trampoline (which I did try on this data; it does fine), but I wanted something smaller and more standalone. I’d heard people saying good things about IBM’s Many Eyes (in retrospect I may have to challenge them to duels for the sheer effrontery of this claim), so I thought I’d give it a go.

Let me start by saying that there is one single feature that would change my all-encompassing burning loathing for Many Eyes into a mild dislike. It claims to let you edit data you have uploaded. Except that the button is disabled, with the helpful tooltip “Editing of data is currently disabled”.

This renders the entire fucking site useless, because it turns what should be a trivial operation (editing the data you’ve uploaded to see how it changes the visualisation) into a massive ordeal. You need to create an entirely new dataset, label it, add it, recreate the visualisation…

Fortunately recreating the visualisation isn’t that hard. After all, Many Eyes doesn’t actually give you any functionality with which to customise your visualisation (maybe it does for some of the others. It sure doesn’t for the network viewer).

So why did I need to tinker with the data? Isn’t it just a question of upload a data set, feed it into IBM’s magic happy wonderground of visualisation and go “Oooooh” at the pretty pictures?

Well, it sort of is.

Actually what it is is upload the data set, feed it into IBM’s magic happy wonderground of visualisation and go “Aaaaaaaargh” as my browser grinds to a halt and then crashes.

It’s understandable really. I did just try to feed their crapplet an entire one point six MEGA bytes of data (omgz).

Wait, no. That’s not understandable at all. In fact it’s completely shit. That corresponds to about 12K nodes and about 60K edges. This is *not* a particularly large number (metascope happily lays it out in a few tens of seconds). This is a goddamn data visualisation tool. The whole point of it is that you’re supposed to be able to feed it large amounts of data. At the very least it should tell you “Sorry, I was written by monkeys so probably can’t handle the not particularly large amount of data you have just fed me”.

So, I spent some time trying to prune the data down to a size that Many Eyes could manage not to fail dismally at, but where the graph was still large enough to be interesting. This was a hilarious process. Consider:

  • The only way to edit the data is to create an entirely new data set and recreate the visualisation.
  • The only way to determine that I’ve got too much is to observe that my browser crashes.

After about half a dozen iterations of this I decided enough was enough and declared Many Eyes to be an unusable piece of shite that was not worth my time. Life’s too short for bad software.
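For what it’s worth, one plausible way to do that sort of pruning (not necessarily what was done here) is to keep only the most frequent pairs and throw away everything else. The sketch below assumes the data is a mapping from tag pairs to counts; `prune_edges`, `max_edges` and the toy counts are hypothetical names and values chosen for the example.

```python
def prune_edges(pair_counts, max_edges):
    """Keep the max_edges most frequent pairs and the tags they touch."""
    # Sort by descending count; ties are broken by the pair itself so the
    # result is deterministic.
    ranked = sorted(pair_counts.items(), key=lambda item: (-item[1], item[0]))
    kept = dict(ranked[:max_edges])
    # The surviving nodes are exactly the tags that still have an edge.
    nodes = {tag for pair in kept for tag in pair}
    return kept, nodes

# Invented toy data, purely for illustration.
toy_counts = {
    ("lift", "scala"): 5,
    ("ruby", "scala"): 2,
    ("java", "scala"): 1,
    ("java", "ruby"): 1,
}
edges, nodes = prune_edges(toy_counts, max_edges=2)
```

The edge budget still has to be found by trial and error, since the only signal that it is too high is the browser falling over.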

Food discovery of the day

Monday, May 25th, 2009

Fried polenta goes really well in greek salad.

Cashew and butterbean non-mous

Sunday, May 24th, 2009

I’ve never had much success with making my own hummous. Whenever I try I end up with something which is ok, but generally a bit stodgy and unexciting.

Earlier I remembered this amazing dish from vegan yum yum and it occurred to me that the cashews made it much lighter than the beans on their own would have. And thus was born the following recipe:

Ingredients

  • A can of butterbeans
  • 1 cup of raw cashews (I buy these broken and loose from a health food store, so they’re actually quite affordable)
  • 1.5 lemons (whole, but with the peel removed)
  • 1.5 tsp sea salt
  • 1 tbsp olive oil
  • 2 tbsp tahini

Instructions

The exact process I used to make it involves 73 distinct and precise steps and requires attaining nirvana before its completion. As a fractionally simpler alternative, I recommend just sticking the entire lot in a food processor and blending until it’s smooth.

Result

This is very distinctly not hummous, but it is nevertheless delicious. It has a fairly mild flavour – you can clearly taste the lemon and cashews, but they don’t dominate – and has a nice creamy texture. It probably works better as a pâté than as a dip, but it’s definitely great for either.

Interest bandwidth

Friday, May 15th, 2009

Highlight from the #scala IRC channel:

13:21 < DRMacIver> I have a somewhat unfortunate theory. I suspect the amount one cares about
                   programming issues is inversely proportional to how inherently interesting one's
                   work is.
13:22 < DRMacIver> Or at least negatively correlated
13:22 < ijuma> DRMacIver: is this based on personal experience? ;)
13:22 < DRMacIver> Yes
13:22  * DRMacIver finds it much harder to get worked up about language issues these days
13:23 < DRMacIver> Which is in large part because I'm doing a lot more interesting borderline computer
                   sciencey work
13:24 < dgreensp> yeah, I do a lot of interesting work and only rarely stop to think about languages
13:24 < ijuma> yeah, I noticed. I think it makes sense. People have limited bandwidth and if work is
               interesting, it's likely to take quite a bit of it
13:25 < DRMacIver> I think the other issue is that people want to find what they do interesting. And
                   so if *what* they do isn't interesting they have to become interested in *how* they
                   do it.
13:25 < DRMacIver> But yes, the bandwidth thing is also a big part of it
13:26 < dgreensp> bandwidth minimum, bandwidth maximum :)
13:26 < DRMacIver> ?
13:27 < ijuma> DRMacIver: agreed
13:27 < dgreensp> er, the two points seemed related -- people need to be interested in something, but
                  they can't be interested in too many things at once
13:28 < DRMacIver> Oh, right.
13:28 < DRMacIver> Yes. That's a good way of looking at it.

A problem of language

Friday, May 15th, 2009

I try to stay out of language wars these days. I find the whole endeavour incredibly tedious. I don’t really feel like arguing whether OCaml is better than Ruby is better than Scala is better than brainfuck is better than C. I like them all (ok maybe not brainfuck), and there are valid arguments for and against each of them.

But one thing I have a lot of trouble with is bad arguments. Not “arguments I disagree with”, but arguments which are simply outright bad.

Robert Fischer has a post on his blog: Scala is not a functional language. It’s not the first time this idea has come up. It’s not an idea entirely without merit. Scala’s functional features are certainly not as seamless to use as one might hope.

The problem is that while it is not an idea without merit, its presentation is certainly one without content. As usual, every time this comes up, no one is actually willing to say what on earth they mean by “functional programming language”. This problem is ubiquitous. Consider the following Wikipedia entry:

In computer science, functional programming is a programming paradigm that treats computation as the evaluation of mathematical functions and avoids state and mutable data. It emphasizes the application of functions, in contrast to the imperative programming style, which emphasizes changes in state. Functional programming has its roots in the lambda calculus, a formal system developed in the 1930s to investigate function definition, function application, and recursion. Many functional programming languages can be viewed as embellishments to the lambda calculus.

Notice the bait and switch there? It defines “functional programming” and then talks about “functional programming languages”. Everyone does this. The only unambiguous definition you find which includes the term “functional programming language” is that of a purely functional language. This is much less ambiguous, but it’s also the case that the vast majority of people arguing about whether Scala (or other language of choice) is functional are comparing it to a decidedly impure language. Certainly neither Clojure nor OCaml is purely functional.

On the other hand, if you choose “functional programming language” to simply mean “supports functional programming” then suddenly you have to acknowledge various languages like Ruby as functional and for some reason this makes people uncomfortable. So instead we end up with all sorts of hand waving and mumbling and no one is able to have a useful discussion because everyone is too busy making assertions which can’t be argued with because they don’t mean anything.
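To make the “supports functional programming” reading concrete, here is a small sketch in Python, another language that rarely gets called functional (Python is swapped in for Ruby purely for illustration, and the helper names `compose` and `make_adder` are made up). It uses functions as arguments and return values, anonymous functions, closures and list processing, which is most of what people tend to point at.

```python
from functools import reduce

def compose(f, g):
    """Return a new function that applies g and then f."""
    return lambda x: f(g(x))

def make_adder(n):
    """Return a closure that remembers n and adds it to its argument."""
    def add(x):
        return x + n
    return add

# Functions passed as arguments and returned as values.
add_then_double = compose(lambda x: x * 2, make_adder(3))

# List processing with higher-order functions.
doubled = list(map(add_then_double, [1, 2, 3]))
total = reduce(lambda acc, x: acc + x, doubled, 0)
```

None of this settles whether Python or Ruby “is functional”; it only shows that under the permissive definition there is no obvious reason to exclude them.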

So no one defines the term, but everyone seems to “know what it means”. And what it inevitably means is “shares features with this language which I use and like and consider to be a functional language”.

For a particularly extreme example, consider the following conversation on reddit from the last time this subject came up:

vagif:
Well, here’s my understanding of this disagreement.
I think many people (me included) do not feel that Scala is functional enough. It is not because of the mutable variables. It all boils down to simple syntax. That’s why completely imperative and old fashioned language CL (i use sbcl) feels to me much more functional than Scala will ever be with its modern functional features like pattern matching, type inference and list comprehensions.
“Functional” for me means simple small core of orthogonal features.
“Functional” for me means – List processing.
“Functional” for me means no premature optimization (deciding to use arrays instead of lists because they are faster etc.)
Obviously, trying to accommodate java OO model, makes Scala way to complicated, arcane, baroque in its syntax, to feel functional enough.

DRMacIver:
I enjoy the fact that the word “function” does not appear in your definition of functional…

vagif:
And for a good reason. Functions are part of any imperative language. That does not make them more functional. It is all that infrastructure, that allows for easy juggling those functions, that makes a language functional. Passing them as parameters and return values, anonymous functions, closures, polymorphic functions (be it a result of type inference or dynamic nature of language). It is this small core of orthogonal features, all geared up for list processing, that truly creates a functional feeling in a programmer.

This sort of woolly thinking is endemic in these arguments. Please try to avoid it. If you’re not willing to define “functional programming language” in a way that is at least moderately unambiguous and doesn’t involve arguments about “feelings” or simply feature comparisons with some language which you state as an example of functional programming, don’t bother making arguments about whether a language is functional or not.

Of course, once you start defining the term people will start arguing about the definitions. This is pretty tedious, I know. But as tedious as arguing about definitions is, it can’t hold a candle to arguing without definitions.