It might be worth learning an ML-family language

It’s long been a popular opinion that learning Haskell or another ML-family language will make you a better programmer. I think this is probably true, but I think it’s an overly specific claim because learning almost anything will make you a better programmer, and I’ve not been convinced that Haskell is a much better choice than many other things in terms of reshaping your thinking. I’ve never thought that you shouldn’t learn Haskell of course, I’ve just not been convinced that learning Haskell purely for the sake of learning Haskell was the best use of time.

But… I’ve been noticing something recently when teaching Python programmers to use Hypothesis that has made me reconsider somewhat. Not so much a fundamental reshaping of the way you think as a highly useful microskill that people seem to struggle to learn in dynamically typed languages.

That skill is this: Keeping track of what the type of the value in a variable is.

That may not seem like an important skill in a dynamic language, but it really is: Although functions will tend to be more lenient about what type of value they accept (is it a list or a tuple? Who cares!), they will tend to go wrong in interesting and confusing ways when you get it too wrong, and you then waste valuable debugging time trying to figure out what you did wrong. A good development workflow will typically let you find the problem, but it will still take significantly longer than just not making the mistake in the first place.

In particular this seems to come up when the types are related but distinct. Hypothesis has a notion of a “strategy”, which is essentially a data generator, and people routinely seem to get confused as to whether something is a value of a type, a strategy for producing values of that type, or a function that returns a strategy for producing the type.
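To make the three-way distinction concrete, here is a minimal sketch using a toy `Strategy` class. This is a stand-in written for illustration, not Hypothesis's real implementation: the names `Strategy`, `draw`, `integers` and `lists` below are assumptions that only loosely mirror the shape of the real API.

```python
import random
from typing import Callable, Generic, List, TypeVar

T = TypeVar("T")


class Strategy(Generic[T]):
    """Toy data generator (hypothetical; not Hypothesis's real class)."""

    def __init__(self, draw: Callable[[random.Random], T]) -> None:
        self._draw = draw

    def draw(self, rng: random.Random) -> T:
        # Produce one value of type T.
        return self._draw(rng)


def integers(lo: int = 0, hi: int = 100) -> Strategy[int]:
    """A function that RETURNS a strategy for producing ints."""
    return Strategy(lambda rng: rng.randint(lo, hi))


def lists(elements: Strategy[T], max_size: int = 5) -> Strategy[List[T]]:
    """Takes a strategy for elements and returns a strategy for lists."""
    if not isinstance(elements, Strategy):
        # Fail early, at the place where the wrong kind of thing was passed.
        raise TypeError(f"expected a Strategy, got {type(elements).__name__}")
    return Strategy(
        lambda rng: [elements.draw(rng) for _ in range(rng.randint(0, max_size))]
    )


rng = random.Random(0)

make_ints = integers                # a function that returns a strategy
int_strategy = integers()           # a strategy for producing ints
int_value = int_strategy.draw(rng)  # a value of type int

list_value = lists(int_strategy).draw(rng)  # a list of ints
```

Passing `make_ints` or `int_value` to `lists` is exactly the kind of mix-up described above. In this sketch it raises a `TypeError` straight away, but only because `lists` checks its argument explicitly.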

It might be that I’ve just created a really confusing API, but I don’t think that’s it – people generally seem to really like the API and this is by far the second most common basic usage error people make with it (the first most common is confusing the functions one_of and sampled_from, which do similar but distinct things. I’m still trying to figure out better names for them).

It took me a while to notice this because I just don’t think of it as a difficult thing to keep track of, but it’s definitely a common pattern. It also appears to be almost entirely absent from people who have used Haskell (and presumably other ML-family languages, or really any statically typed language with type inference and a bias towards functional programming, but I don’t know of anyone who has tried to use Hypothesis knowing an ML-family language without also knowing Haskell).

I think the reason for this is that in an ML-family language, where the types are static but inferred, you are constantly playing a learning game with the compiler as your teacher: Whenever you get this wrong, the compiler tells you immediately that you’ve done so and localises it to the place where you’ve made the mistake. The error messages aren’t always that easy to understand, but it’s a lot easier to figure out where you’ve made the mistake than when the error message is instead “AttributeError: ‘int’ object has no attribute ‘kittens’” in some function unrelated to where you made the actual error. In the dynamically typed context, there’s a larger separation between the observed problem and its cause, which makes it harder to learn from the experience.
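As a small illustration of that separation, here is a sketch in which the mistake and the error are in different functions. The `Cat` class and the function names are hypothetical, invented only to reproduce the error message quoted above.

```python
class Cat:
    def __init__(self, kittens):
        self.kittens = kittens


def adopt():
    # The actual mistake: this returns a count where a Cat was intended.
    return 2


def register(pet, registry):
    # Accepts any value without complaint, so the mistake goes unnoticed here.
    registry.append(pet)


def count_kittens(registry):
    # The failure only surfaces here, far from where the int was created.
    return sum(pet.kittens for pet in registry)


registry = []
register(Cat(kittens=3), registry)
register(adopt(), registry)

try:
    count_kittens(registry)
except AttributeError as error:
    message = str(error)  # 'int' object has no attribute 'kittens'
```

The traceback points at `count_kittens`, but the line that needs fixing is in `adopt`. With static types, the mismatch would typically be reported where `adopt()` is passed to `register`.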

This is probably a game worth playing. If people are making this error when using Hypothesis, then I’d expect them to be making it in many other places too. I don’t expect many of these errors are making it through to production (especially if your code is well tested), but they’re certainly wasting time while developing.

In terms of which ML-family language to choose for this, I’m not sure. I haven’t actually used it myself yet (I don’t really have anything I want to write in the space that it targets), but I suspect Elm is probably the best choice. They’ve done some really good work on making type errors clear and easy to understand, which is likely exactly what you need for this sort of learning exercise.

 

This entry was posted in programming, Python on by .

Seeking recommendations for better sleep data

OK. Let’s start with the problem.

The ultimate problem is that I often/usually wake up feeling like shit. I am in the process of attempting to determine whether this is caffeine related by removing caffeine from my life, but I would describe myself as something like 80% confident that it is not. I don’t believe I have sleep apnea, but checking that is also something I will be trying to do at some point.

I have been using a Jawbone UP for sleep tracking. I am considering replacing it with a Ouija board in order to get more useful and relevant information. It appears to be categorically unable to detect whether I am actually asleep, let alone what sort of sleep I am currently experiencing (though I have been getting some benefit out of the sleep tracking functionality).

However, it has highlighted one interesting thing: I noticed this morning that my heart rate spiked to > 100bpm twice last night. My “normal” heart rate is somewhere in the region of 55bpm, so that’s quite a spike.

And I think I’ve seen that before, just based on the shape of the graph, though it’s not consciously registered.

So why don’t I look at the data?

Well, I’d love to! Except Jawbone throw away all the detailed heart rate data that’s more than 24 hours old, and don’t let you access the detailed data even from the last 24 hours. I’d be fine with scripting a data export and just dumping it somewhere, but it’s completely impossible in a classic internet of things “You thought that just because you own the device you can actually make use of its functionality? Sucker” backstab.

So that’s the proximate problem: I would like to answer the question “Does my heart rate routinely spike during the night and is that indicative of something?”

The obvious way of answering this is using some sort of continuous recording of heart rate data which has the controversial and exciting feature of my actually being able to access the data.

I only really need this while I am asleep, but it would be nice to have it while I’m out and about too.

So, here is approximately what I am looking for:

  1. A wearable heart rate monitor with accurate data. I am perfectly happy for this to be a chest strap based one rather than a watch.
  2. Which I can get a complete data export from. Ideally I would be able to do this for historic data without needing a nearby Bluetooth-capable device to send the data to live.
  3. That isn’t ruinously expensive (definitely not > £100. Ideally more in the £50 region).

I currently have three contenders:

  1. A Fitbit. They provide full historical heart rate data and an API you can get it from. I am mostly not just going straight for this one because I’ve heard fairly bad things about the accuracy of the Fitbit’s heart rate monitoring.
  2. A Zephyr. People seem to mostly be recommending the ruinously expensive Bioharness, but they also have the merely slightly pricey HxM. I do not believe I can get historic data out of them, only live data. These are mostly recommended because they are accurate and have a good Bluetooth API.
  3. Get a cheap chest strap monitor that speaks the standard Bluetooth Heart Rate Service specification (there are some really cheap ones), and try out the various Android apps that speak it, then later if that works well, write a small Python script to just dump it to a database and run it on a Raspberry Pi or something next to my bed.
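For the third option, the "small Python script" might look roughly like the sketch below. It only covers decoding a Heart Rate Measurement payload (the characteristic the Heart Rate Service notifies on) and writing readings to SQLite. Actually subscribing to the device needs a Bluetooth library and is left out. The table name, function names and sample payload are assumptions made for illustration.

```python
import sqlite3
import struct
import time


def parse_heart_rate_measurement(payload: bytes) -> dict:
    """Decode a Heart Rate Measurement payload into bpm and RR intervals."""
    flags = payload[0]
    offset = 1
    if flags & 0x01:
        # Bit 0 set: heart rate is a little-endian 16-bit value.
        (bpm,) = struct.unpack_from("<H", payload, offset)
        offset += 2
    else:
        # Bit 0 clear: heart rate is a single byte.
        bpm = payload[offset]
        offset += 1
    if flags & 0x08:
        # Bit 3 set: a 16-bit "energy expended" field follows; skip it.
        offset += 2
    rr_intervals = []
    if flags & 0x10:
        # Bit 4 set: the rest is RR intervals in units of 1/1024 second.
        while offset + 2 <= len(payload):
            (raw,) = struct.unpack_from("<H", payload, offset)
            rr_intervals.append(raw / 1024.0)
            offset += 2
    return {"bpm": bpm, "rr_intervals_s": rr_intervals}


def store(conn: sqlite3.Connection, timestamp: float, bpm: int) -> None:
    """Append one reading to a (hypothetical) heart_rate table."""
    conn.execute("CREATE TABLE IF NOT EXISTS heart_rate (ts REAL, bpm INTEGER)")
    conn.execute("INSERT INTO heart_rate VALUES (?, ?)", (timestamp, bpm))
    conn.commit()


# Made-up sample: RR intervals present, 8-bit heart rate of 55, one RR of 1.0s.
sample = bytes([0x10, 55, 0x00, 0x04])
reading = parse_heart_rate_measurement(sample)

conn = sqlite3.connect(":memory:")  # use a file path for real logging
store(conn, time.time(), reading["bpm"])
```

Once the readings are in a table, answering "does my heart rate spike above 100bpm at night?" becomes a simple query over `ts` and `bpm`.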

I’m currently most tempted to try the third option first despite it being the “worst” in many regards (requiring the most manual effort on my part). Heart rate service is pretty standard, as far as I can tell, so I can experiment with a heart rate monitor that costs ~£15 and upgrade if the data looks promising (e.g. the Polar heart rate monitors are supposedly quite good and speak heart rate service).

I am however not very satisfied by any of these options, and am open to general advice, recommendations, etc on any of the above, on any point from the proximate to the ultimate problem. So please share. Comments are open on this post, and you’re also welcome to tweet or email me.

This entry was posted in Uncategorized on by .

First Past The Post is not the problem, districts are

It will come as no surprise to anyone that I am thoroughly against the systems used for electing representatives in both the UK and the US.

What may come as a surprise is that I’m pretty much indifferent to the fact that they both use first past the post, or simple plurality voting.

Don’t get me wrong. First past the post (FPTP from now on) is a rubbish voting system. It has essentially no redeeming values that are not also shared by approval voting, which is in all ways a superior system to it (I’m not massively in favour of approval voting, but it’s unambiguously better than FPTP). The point is not that FPTP is good, it’s that in this context it is irrelevant.

The reason it is irrelevant is that almost any choice of voting system in its place would produce just as bad or worse results, because the most major failure is in how we are applying voting, not the voting we’re doing.

To see this, consider the following scenario:

  1. We’ve got some geographical districts and each one gets exactly one representative.
  2. The representatives are divided into two parties. Call them the Purple and Green parties.
  3. The Purple party has 50.1% of the populace who think that Purple is amazing and Green is literally the worst. The other 49.9% think the opposite.
  4. This division is uniformly spread over the entire country with little to no local variation.

What happens? Well, you get an entirely purple house of representatives! Because in each district you have a strict majority of people who prefer Purple. Green gets exactly zero representatives despite the fact that 49.9% of the populace are strongly in favour of Green.
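The scenario above can be sketched in a few lines. The specific numbers (100 districts of 1,000 voters each) are assumptions chosen to match the 50.1% / 49.9% split, and the function name is made up for illustration.

```python
def district_seats(districts):
    """Give each district's single seat to whichever party has the most votes."""
    seats = {}
    for votes in districts:
        winner = max(votes, key=votes.get)
        seats[winner] = seats.get(winner, 0) + 1
    return seats


# Every district has the same 50.1% / 49.9% split: no local variation.
districts = [{"Purple": 501, "Green": 499} for _ in range(100)]

seats = district_seats(districts)

total_votes = sum(sum(votes.values()) for votes in districts)
green_share = sum(votes["Green"] for votes in districts) / total_votes

# Purple wins all 100 seats; Green has 49.9% of the votes and no seats.
green_seats = seats.get("Green", 0)
```

Any rule that elects the majority-preferred candidate in each district gives the same `seats` here, because the per-district winner never changes.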

This is manifestly ridiculous, and the resulting government cannot be said to have any significant democratic mandate over an equally ridiculous 100% Green government. I’ll leave it up to you to decide how many more seats Purple should have than Green, but I’ll take it as read that getting all the seats is not an acceptable answer.

And getting this scenario required almost nothing about first past the post. If you replaced it with approval voting, alternative vote, range voting, whatever, the result would still be the same, because it boiled down to a simple majority vote and the majority in each district genuinely preferred Purple.

The choice of voting system influences what sort of distortions you can get. As your voting systems get better the above tends to be the only sort of distortion you get, but it will always remain possible. With districts, you can always take whatever single result you’d get if you ran the vote over the whole populace and just give them 100% of the seats by just spreading the population out this way (note: There are some technical conditions required for this to be strictly true, but it is in practice true with basically everything).

In practice the distortions are usually not this extreme because most of the time the distribution is not uniform. But that doesn’t make the result more representative of the populace, it just makes it a consequence of the vagaries of where people happen to live. You then get gerrymandering, which is essentially the art of deliberately creating these distortions in a way that furthers your political aims.

The only way to avoid this problem altogether and also have districts is to have a system where some districts can genuinely have a majority preference for a candidate but that candidate still doesn’t get elected. This isn’t as unreasonable as it sounds. Two examples of reasonable ways to do this are random ballot and biproportional apportionment. However both of these are niche and you’re unlikely to get them.

Amongst systems with mainstream support, you really need to do away with districting in part or in whole. You need to move to a full proportional representation system across the whole country, or across larger multi-member districts.

But one thing you can’t do is just tinker with the system that you use in each district if you want to make a difference. Changing the voting system you use within districts is just rearranging deck chairs on the Titanic, and continuing to focus on first past the post as a problem is just going to get you another vote on the optimal deck chair rearrangement at best rather than getting people off the sinking ship.

This entry was posted in voting on by .

Contributors do not save time

(This is based off a small tweet storm from yesterday).

There’s this idea that what open source projects need to become sustainable is contributors – either from people working at companies who use the project, or from individuals.

It is entirely wrong.

First off: Contributors are great. I love contributors, and I am extremely grateful to anyone who has ever contributed to Hypothesis or any of my other open source projects. This post is not intended to discourage anyone from contributing.

But contributors are great because they increase capabilities, not because they decrease the effort required. Each contributor brings fresh eyes and experience to the project – they’ve seen something you haven’t, or know something you don’t.

Generally speaking a contribution is work you weren’t going to do. It might be work you were going to do later. If you’re really unlucky it’s work you’re currently in the process of doing. Often it’s work that you never wanted to do.

So regardless of the nature of the contribution, it creates a sense of obligation to do more work: You have to deal with the contributor in order to support some work you weren’t going to do.

Often these dealings are pleasant. Many contributions are good, and most contributors are good. However it’s very rare that contributions are perfect unless they are also trivial. The vast majority of contributions that I can just say “Thanks!” and click merge on are things that fix typos. Most of the rest are ones that just fix a single bug. Everything else needs more than the couple of minutes’ work (not zero work, mind you) that it took to determine that it was such a contribution.

That work can take a variety of forms: You can click merge anyway and fix it yourself, you can click merge anyway and just deal with the consequences forever (I don’t recommend this one), you can talk the contributor through the process of fixing it themselves, or you can reject the contribution as not really something you want to do.

All of these are work. They’re sometimes a lot of work, and sometimes quite emotionally draining work. Telling someone no is hard. Teaching someone enough of the idiosyncrasies of your project to help them contribute is also hard. Code review is hard.

And remember, all of this is at the bare minimum work on something that you weren’t previously going to do just yet, and may be work on something that you were never going to do.

Again, this is not a complaint. I am happy to put in that work, and I am happy to welcome new contributors.

But it is a description of the reality of the situation: Trying to fix the problems of unpaid labour in open source by adding contributors will never work, because it only creates more unpaid labour.

This entry was posted in programming on by .

Against Virtue Environmentalism

I came up with the term “Virtue Environmentalism” recently and I think it’s a good one and will probably be using it more often.

This came up when talking to a friend about a frustrating experience he’d had. Afterwards he vented at me about it for a bit and we had a good conversation on the subject.

The friend in question cares a lot about the environment. He’s mostly vegan and donates a lot of money to a variety of environmental charities and generally spends a fair bit of time stressing out about global warming.

But he also drives. Like, a lot. And I don’t mean a Tesla (I don’t know enough about cars to tell you about fuel efficiency, but it’s a conventional engine). Both short distance and long road trips. There’s no physical reason he has to drive – he’s in tolerably good shape and could definitely cycle a lot of the places he drives to if he wanted, but he really likes driving.

These aren’t inconsistent positions. I don’t think I was the one who convinced him of this, but he’s basically on board with the idea of donating instead of making personal interventions. He’s decided, quite reasonably, that his life is so much better for all this driving that he’s willing to make the trade-off, and he donates more than enough to be massively carbon negative despite it, even without the veganism.

But someone he met at a party recently really took issue with that, basically calling him a hypocrite. I’m not sure how the subject came up, but it got quite heated.

Over the course of the conversation it emerged that the person in question was not vegetarian and did not donate anything to charity, but was very conscientious about taking public transport everywhere they couldn’t cycle, turning off all the lights, recycling everything, doing home composting, etc.

One of these people is making a big environmental difference. The other one is giving the person who is making a big environmental difference a hard time for not making a big enough difference.

(Note: This account has been somewhat fictionalized to protect the guilty)

I’m going to start describing this behaviour as virtue environmentalism.

The term comes from ethical theory. Approximately, we have consequentialist ethics and virtue ethics (it’s more complicated than that, but that’s the relevant subset here).

Consequentialist ethics says that ethical behaviour comes from acts which produce good outcomes, virtue ethics says that ethical behaviour comes from acts which exhibit virtues.

Similarly, consequentialist environmentalism says that environmental behaviour comes from acts which produce environmentally good consequences, while virtue environmentalism says that it comes from acts which demonstrate environmentally friendly behaviour.

So, donating money to charity is consequentially good but mostly not a virtue – sure, you might as well do it, but it’s not real environmentalism.

My biases are clearly showing here. I largely subscribe to consequentialist ethics, but think virtue ethics has its place. There are good arguments that virtue ethics produces better consequential outcomes in many cases, and also that it produces better adjusted people. I’m not sure I buy these arguments, but it’s a valid life choice.

But virtue environmentalism is mostly bullshit.

Atmospheric carbon and other greenhouse gases are amongst the most fungible types of harm out there. If I pump 100 tonnes of carbon into the atmosphere (a very high footprint) and extract 110 from it into some sort of long term storage (e.g. donating to prevent deforestation or plant new trees), then I’ve removed ten tonnes of carbon from the atmosphere and as a result I’ve done more good than someone who has only pumped 5 tonnes of carbon into the atmosphere (a very low footprint) but hasn’t removed any.
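The arithmetic, spelled out as a tiny sketch (the function name is made up, and the figures are the ones from the paragraph above):

```python
def net_tonnes(emitted, removed):
    """Net carbon added to the atmosphere; negative means net removal."""
    return emitted - removed


high_footprint_donor = net_tonnes(emitted=100, removed=110)  # -10
low_footprint_non_donor = net_tonnes(emitted=5, removed=0)   # 5

# The donor ends up 15 tonnes better for the atmosphere than the non-donor.
difference = low_footprint_non_donor - high_footprint_donor  # 15
```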

Virtue environmentalism largely results in three things:

  1. Spending lots of time and effort on actions that make no practical difference at all but are highly visible.
  2. Feeling good enough about yourself that you don’t perform the actions that would actually help.
  3. Pissing off other people and making them care less about environmentalism overall.

The third is particularly important. If we want our descendants to not gently broil in the inevitable consequences of our own environmental waste, we need to get everyone to start taking this seriously, and if you keep telling people that the only valid way to do environmental change is this sort of hair-shirt-wearing nonsense then the result will be that people do neither that nor the actually useful actions they would probably be quite happy to do.

If you want to do “environmentally friendly” things that don’t help much but make you feel better then sure, go for it. But stop expecting other people to do the same if you actually want to help the planet instead of just feeling good about yourself.

This entry was posted in Charitable giving, life on by .