David R. MacIver's Blog: "Agda is now mainstream"

"Agda is now mainstream"

13 February 2012

Various people on the internets have been pointing out that the results for this is a mainstream language on hammer principle are a bit weird. In particular Agda was showing up as number 10 for it which if you know of Agda you’ll realise how ridiculous it is and if you don’t, well exactly.

It’s not there any more, as I tweaked the algorithm slightly to prune statement/item pairs with very little data, but the basic problem of the confusingness of this particular statement remains. The problem is alluded to on the text of the page, but I expect no one reads that text anyway. Oh well. So, quoted here:

For items where we’ve got a lot of responses the numbers should be pretty good. For the rest you can probably just expect them to hang around the middle of the rankings wondering what they’re doing here.

This is because the way the ranking algorithm works is that it sorts by an initial score and then looks for evidence to move items from that point. If there’s little data that score will be about in the middle and there’ll be little evidence to move it.

This is true on all the pages, but the problem with this particular one is that in this case lack of data is correlated with the correct position of the language in the ranking. This means that particularly non mainstream languages will lack data and thus will be sitting there in the middle, while as only mildly non mainstream languages will be parked down at the bottom because there was a lot of evidence that they weren’t mainstream.

The main problem here is not really the ranking algorithm (although it could be improved. I have an improvement in the works, but it’ll take a while to plumb in), it’s that the site doesn’t make it clear when an item’s position in a list is really more of a rough guess than a reality. I don’t currently have any good ideas as to how to indicate that, so the status quo will probably remain for now.

It’s also worth noting that the aggregate rankings are really more an amusement than they are a reliable source of decision making (especially now that the comparison pages provide you with more hard data). They should inspire, and they should lead your line of inquiry, but certainly don’t take them as rock solid fact. Even if they captured and presented the data perfectly you’re still just looking at a bunch of answers to subjective question from a bunch of randoms on the internet. Fun and interesting, but not necessarily representative of The Truth.

Comments

Edward Kmett on 2012-02-14 03:28:45:

My usual answer for this sort of thing is to turn to the technique described in http://www.evanmiller.org/how-not-to-sort-by-average-rating.html

If you use the lower and upper bounds respectively for the top and bottom ratings then results with few supporting data points will likely not make it into either hit list, and you’ll get more information about the confidence interval that can be displayed if you want to.

This has the benefit that you don’t need to throw out small results up to a threshold and then magically start trusting them, even though you may have 10x as much data for one language than for another, but instead you get a gradual transition to confidence as you get more data.

david on 2012-02-14 23:13:45:

For posterity, we’ve had a discussion about this on the haskell subreddit: http://www.reddit.com/r/haskell/comments/pnupf/agda_is_now_mainstream/c3qz042