Category Archives: programming

Why mathematics makes you better at programming (and so does everything else)

There’s been a bunch of discussion on whether mathematics is programming, whether you need to learn mathematics to program, etc. on Twitter recently. Some of it has been quite heated. So naturally I thought I’d wade right into the hornet’s nest because I’ve got no sense of self-preservation.

There are essentially 4 levels of question going on here:

  1. Is programming basically mathematics?
  2. Do you need mathematics to be a good programmer?
  3. Is there a useful mathematics of programming?
  4. Can learning mathematics make you a better programmer?

The answers to these questions are, respectively: no; no; yes, but you probably don’t care most of the time; and yes.

Cool, well, that was easy. Are we done?

Not just yet. I’m going to spend some time unpacking the last of these because it’s the most interesting: Does learning mathematics make you a better programmer?

Well, yes. Of course it does.

So does learning to write. So does learning to cook. So does learning about philosophy of ethics, or about feminism. Learning another human language almost certainly helps you to be a better programmer. Your programming will probably improve if you learn Judo, or knitting, or tap dancing.

I’d go as far as to say that it is unusual to find a pair of skills where learning one will not in some way improve your ability in the other.

Part of this is just that whenever you learn a thing, you’re also learning about learning. You learn what works for you and what doesn’t, and the next time you need to learn something you’ll be that much better at it.

But there’s more to it than that, right? Surely mathematics teaches you more about how to program than just about how to learn things.

Well, yes, but it’s not because by learning mathematics you’re necessarily learning anything directly applicable to programming. Sure, you might be, and you might even be doing the sort of programming to which that maths is directly applicable. But even if you’re not, learning maths will still make you a better programmer than, say, reading Derrida (probably. I haven’t read Derrida so I might be wrong) or learning to cook (almost certainly. I have learned to cook, and mathematics was way more useful. Your brain might work differently than mine).

Why?

Well, in much the same way that there is a sort of generalised skill of learning that you can improve just by learning things, there are all sorts of other generalised skills. For want of a better word (if you have a better word, do tell) I call them “microskills”. They’re mostly not things that people think of as skills – it’s things like breaking a problem down into parts, abstract thinking, learning to ask the right questions, perseverance when stuck, learning when not to persevere when stuck, and thousands upon thousands of others.

People tend not to think of these as skills because they think of them as innate abilities. Stuff you’re just good at or bad at and you’ve got to cope with that. And I mean, it’s certainly possible that you can be innately good or bad at these (and I may be foolish enough to touch this debate but I’m not touching that one with a barge pole), but regardless of innate ability they all share a characteristic that makes them skills: they get better if you practice them.

Almost every skill you learn will depend on some of these microskills. Which ones you need is not at all fixed: people compensate by trading one off against another all the time. You might have a weaker working memory, so you practice your organisation skills and keep meticulous notes. You might be easily distractible and lack the ability to methodically push through a problem, but make up for it with creative flashes of insight (yes, you can learn to have creative flashes of insight; no, creativity is not a fixed thing that you can never get any better at).

But the mix is at least biased. Different skills will encourage you to draw on different microskills. Mathematics will draw heavily on analytical ones – abstract problem solving, hypothesis generation, breaking a problem down into smaller parts, etc. All of these are things that are super useful in programming. So by learning mathematics you’ve developed all these microskills that will later come in super handy while programming.

But, of course, you can develop the same microskills when programming, right?

Wellll….

OK, fine, no more hedging: I’m going to go with yes. You will absolutely develop all of the relevant microskills in programming. Learning to be good at programming by doing programming is 100% a viable strategy and lots of people successfully follow it.

The thing is, you won’t necessarily develop them to the same degree. Some you will develop more, some you will develop less. And this isn’t necessarily because the ones you would develop less in programming wouldn’t be useful to be better at.

There is essentially one key thing required for learning to get better at something (HORRIBLE OVERSIMPLIFICATION ALERT): Feedback

Feedback is how clearly and strongly you discover whether you did well or badly at a thing. If you can just try something and immediately find out if it worked or failed, that’s great feedback. If you try something and 3 months later you get a positive result that might be the result of that something (or it might be the result of something else that happened in the intervening 3 months), well… not so much.

Feedback is essentially how you get better because it lets you know that you are getting better, and that’s how you direct your learning. As you get better at a thing, the amount of feedback you tend to get degrades - and this isn’t necessarily because the benefit you’re getting from improving is decreasing (though it often does) it’s often because the benefit is moving further down the line, or into things that are harder to quantify (for example, the benefits of “a sense of good taste” in programming beyond a basic avoidance of complete spaghetti code are often not immediately apparent until you’ve sat atop a mountain of technical debt).

And this is where learning other skills can be super useful, because they provide different feedback loops. Sometimes they even let you develop microskills that appeared completely irrelevant and then, once you had them, turned out to be super useful (silly example: strength training to develop core strength. Totally inapplicable to programming, right? Well, yeah, except it turns out that not being distracted by lower back pain while you program is quite handy). But most of the time it’s just that the feedback cycle on them is different and the incentive structure for developing them is different – mathematics will lean more on some microskills than programming does and less on others. Similarly it will provide better feedback for some, worse for others.

I could provide some specific examples of which things I think mathematics will help you develop better, but I won’t. It’s very hard to judge these things, and people can argue endlessly about which is better for what. But the specifics don’t matter: Basically all forms of abstract thinking are useful for programming, and the only way that we could reasonably believe that mathematics and programming could provide exactly the same set of feedback loops and incentive structures as each other would be if we believed that they were the same thing.

And that would be silly.


How hard can it be?

There are two types of people in the world:

  1. People who assume jobs they haven’t done and don’t understand are hard
  2. People who assume jobs they haven’t done and don’t understand are easy
  3. People who try to divide the world into overly simple two item lists

Joking aside, there’s definitely a spectrum of attitudes in terms of how you regard jobs you don’t understand.

Developers seem very much to cluster around the “jobs I don’t understand are easy” end (“Not all developers” you say? Sure. Agreed. But it seems to be the dominant attitude, and as such it drives a lot of the discourse). It may be that this isn’t just developers but everyone. It seems especially prevalent amongst developers, but that may just be because I’m a developer so this is where I see it. At any rate, this is about the bits that I have observed directly, not the bits that I haven’t, and about the specific way it manifests amongst developers.

I think this manifests in several interesting ways. Here are two of the main ones:

Contempt for associated jobs

Have you noticed how a lot of devs regard ops as “beneath them”? I mean it just involves scripting a couple of things. How hard is it to write a bash script that rsyncs some files to a server and then restarts Apache?? (Note: If your deployment actually looks like this, sad face).

What seems to happen with devs and ops people is that the devs go “The bits where our jobs overlap are easy. The bits where they don’t overlap I don’t understand, therefore they can’t be important”.

The thing about ops is that their job isn’t just writing the software that does deployment and similar. It’s asking questions like “Hey, so, this process that runs arbitrary code passed to it over the network…. could it maybe not do that? Also if it has to do that perhaps we shouldn’t be running it as root” (Let’s just pretend this is a hypothetical example that none of us have ever seen in the wild).

The result is that when developers try and do ops, it’s by and large a disaster. Because they think that the bits of ops they don’t understand must be easy, they don’t understand that they are doing ops badly. 

The same happens with front-end development. Back-end developers will generally regard front-end as a trivial task that less intelligent people have to do. “Just make it look pretty while I do the real work”. The result is much the same as ops: It’s very obvious when a site was put together by a back-end developer.

I think to some degree the same happens with front-end developers and designers, but I don’t have much experience of that part of the pipeline so I won’t say anything further in that regard.

(Note: I am not able to do the job of an ops person or the job of a front-end person either. The difference is not that I know their job is hard and can therefore do it. The difference is that I know their job is hard, so I don’t con myself into thinking that I can do it as well as they can. The solution is to ask for help, or at least, if you don’t, not to pretend that you’ve done a good job.)

Buzzword jobs

There seems to be a growing category of jobs that are basically defined by developers going “Job X: How hard can it be?” and creating a whole category out of doing that job like a developer. Sometimes this genuinely does achieve interesting things: Cross-fertilisation between domains is a genuinely useful thing that should happen more often.

But often when this happens, the actual job the developers are trying to replace ends up being done badly, and a lot of the things that were important about it are lost.

Examples:

  1. “Dev-ops engineer” – Ops: how hard can it be? (Note: There’s a lot of legit stuff that also gets described as dev-ops. That tends to be more under the heading of cross-fertilisation than devs doing ops. But a lot of the time dev-ops ends up as devs doing ops badly)
  2. “Data scientist” – Statistics: How hard can it be?
  3. “Growth hacker” – Marketing: How hard can it be? (actually I’m not sure this one is devs’ fault, but it seems to fit into the same sort of problem)

People are literally creating entire job categories out of the assumption that the people who already do those jobs don’t really know what they’re doing and aren’t worth learning from. This isn’t going to end well.

Conclusion

The main thing I want people to take from this is “This is a dick move. Don’t do it”. Although I’m sure there are plenty of jobs that are not actually all that hard, most jobs are done by people because they are hard enough that they need someone dedicated to doing them. Respect that.

If you really think that another profession could benefit from a developer’s insight – because they’re doing things inefficiently, and wouldn’t this be so much better with software? – then talk to them. Put in the effort to find out what their job involves. Talk to them about the problems they face. Offer them solutions to their actual problems and learn what’s important. It’s harder than just assuming you know better than them, but it has the advantage of being both the right thing to do and way less likely to result in a complete disaster.


What is good code?

A long long time ago, in an office building a couple of miles away, I was asked this question in an interview. I’m not sure how much they cared about the answer, but mine was more glib than it was useful. I misquoted Einstein and said that good code was code that was as simple as possible but no simpler.

This was not a very satisfactory answer to me, though I got the job.

For, uh, reasons this question has been on my mind recently and I think I’ve come up with an answer that satisfies me:

Good code is code that I am unlikely to need to modify but easily could if I wanted to.

Of course, good code is probably only roughly correlated with good software.


Different types of overheads in software projects

I mentioned this concept on Twitter in a conversation with Miles Sabin and Eiríkr Åsheim last night, and it occurred to me I’ve never written up this idea. I call it my quadratic theory of software projects.

It’s one I originally formulated in the context of programming languages, but I’ve since decided that that’s over-simplistic and really it’s more about the whole project of development. It probably even applies perfectly well to things that are not software, but I’m going to be focusing on the software case.

Consider two properties. Call them “effort” and “achievement”, say. If I wanted to attach a more concrete meaning to those, we could say that “effort” is the number of person hours you’ve put into the project and “achievement” is the number of person hours it could have taken an optimal team behaving optimally to get to this point, but the exact meaning of them doesn’t matter – I only mention it to give you an idea of what I’m thinking of with these terms.

The idea is this: If you plot on a graph, with achievement on the x axis and the amount of effort it took you to get there on the y axis, what you will get is roughly quadratic.

This isn’t actually true, because often there will be back-tracking – the “oh shit that feature is wrong” bit where you do a bunch of work and then realise it wasn’t necessary. But I’m going to count that as achievement too: You developed some stuff, and you learned something from it.

It’s also probably not empirically true. Empirically the graph is likely to be way more complicated, with bits where it goes surprisingly fast and bits where it goes surprisingly slow, but the quadratic is a useful thought tool for thinking about this problem.

Why a quadratic?

Well, a quadratic has three parts. We’ve got \(y = A + B x + C x^2\). In my model, \(A, B, C \geq 0\).

And in this context, each of those three parts have a specific meaning:

The constant component (A) is the overhead that you had to get started in the first place – planning, familiarising yourself with the toolchain, setting up servers, etc.

The linear factor (B) is how hard it is to actually make progress – for example, if you’re developing a web application in C++ there’s an awful lot of difficulty in performing basic operations, so this factor could be quite high. Other factors that might make it high are requiring a detailed planning phase for every line of code, requiring a 10:1 lines of test code to lines of application code, etc.

The quadratic factor (C) is the interesting one – constant and linear overhead are in some sense “obvious” features, but the quadratic part is something that people fail to take into account when planning. The quadratic overhead is how much you have to think about interactions with what you’ve already done. If every line of code in my code-base affects every other line in my code-base, then I have to deal with that: every line I write, I have to pay an overhead for every line I’ve already written. If on average a line of code interacts with only 10% of the other lines in the project, then I have to pay 10% of that cost, but it’s still linear in the size of the code base (note: I’m implicitly assuming here that lines of code is a linear function of achievement. I think in reality it’s going to be more complicated than that, but this whole thing is an oversimplification so I’m going to ignore that). When you have to pay a cost that is linear in your current progress, the result is that the total amount of cost you’ve paid by a given point is quadratic in your current progress (this is because calculus).
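To spell out the “because calculus” step, here is a quick sketch using the same \(A\), \(B\) and \(C\) as above: suppose the marginal effort for the next unit of achievement, when you are at progress \(x\), is a fixed per-unit cost plus a cost proportional to everything you have already built, i.e. \(c(x) = B + 2 C x\). Then the total effort to reach \(x\), on top of the setup cost, is

\[ y(x) = A + \int_0^x (B + 2 C t) \, dt = A + B x + C x^2, \]

which is exactly the quadratic above.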

To use an overused word, the quadratic factor is essentially a function of the modularity of your work. In a highly modular code base where you can safely work on part of it without having any knowledge of most of the rest, the quadratic factor is likely to be very low (as long as these parts are well correlated with the bits you’re going to need to touch to make progress! If you’ve got a highly modular code base where in order to develop a simple feature you have to touch half the modules, you’re not winning).

There are also other things that can contribute to this quadratic factor, e.g. how much historical context you have to take into account: if a lot of the reasons things are done the way they are are historical, then you have a linear amount of history you need to take into account to do new work. These all essentially work out as the same sort of thing though: the fraction of what you’ve already done that you need to take into account in order to do new things.

So here’s the thing: Your approach to development of a project essentially determines these values. A lot of different aspects will influence them – who your team members are, what language you’re working in, how many of you there are, how you communicate, whether you’re doing pair programming, whether you’re doing test driven development, how you’re doing planning, etc and etc. Almost everything you could call “development methodology” factors in somehow.

And if you compare two development methodologies you’d find in active use for a given problem, it’s going to be pretty rare that one of them is going to do better (i.e. have a lower value) on all three of these coefficients: Some approaches might have a lower constant and linear overhead but a remarkably high quadratic overhead, some approaches might have a very high constant overhead but a rather low linear and quadratic, etc. Generally speaking, something which is worse on all three is so obviously worse that people will just stop doing it and move on.

So what you end up doing is picking a methodology based on which constants you think are important. This can be good or bad.

The good way to do this is to look at your project size and pick some constants that make sense for you. For small projects the constant costs will dominate, for medium projects the linear costs will dominate, and for large projects the quadratic costs will dominate. So if you know your project is just a quick experiment, it makes sense to pick something with low linear and constant costs and high quadratic costs, because you’re going to throw it away later (of course, if you don’t throw it away later you’re going to suffer for that). If you know your project is going to last a while, it makes sense to front-load on the constant costs if you can reduce the quadratic cost. In between, you can trade these off against each other at different rates – maybe gain a bit on linear costs by increasing the quadratic cost slightly, etc.
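To make that trade-off concrete, here is a minimal sketch in Python with entirely made-up coefficients for two hypothetical methodologies – a quick hack (cheap to start, high quadratic cost) and a heavyweight approach (expensive setup, low quadratic cost) – just evaluating \(A + B x + C x^2\) at a few project sizes to see which term dominates:

    # Sketch: total effort y = A + B*x + C*x^2 for two hypothetical
    # methodologies. The coefficients are invented purely for illustration;
    # the point is seeing where each term dominates.

    def total_effort(A, B, C, achievement):
        """Total effort to reach a given level of achievement."""
        return A + B * achievement + C * achievement ** 2

    # Quick hack: cheap to start, cheap per step, but every change interacts
    # with a lot of existing code (high quadratic coefficient).
    quick_hack = dict(A=1, B=2, C=0.5)

    # Heavyweight: expensive setup and planning, but highly modular, so the
    # quadratic coefficient is small.
    heavyweight = dict(A=50, B=3, C=0.01)

    for x in (5, 20, 100):
        print(f"achievement={x:>3}: "
              f"quick hack={total_effort(achievement=x, **quick_hack):7.1f}, "
              f"heavyweight={total_effort(achievement=x, **heavyweight):7.1f}")

With these particular (invented) numbers the quick hack wins handily at an achievement of 5 and the heavyweight approach wins from 20 onwards, which is the whole point: neither is “better”, they just pay their costs in different places.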

The bad way to do it is to automatically discount some of these as unimportant. If you just ignore the quadratic cost as not a thing, you will find that your projects get mysteriously bogged down once you hit a certain point. If you’re too impatient to pay constant costs and just leap in and start hacking, you may find that the people who sat down and thought about it for a couple of hours first end up sailing past you. If you think that scalability and the long-term stability of the project are all that matter, then people who decided that day-to-day productivity also mattered will probably be years ahead of you.

Sadly, I think the bad way to do it is by far the more common. I’ll grant it’s hard to predict the future of a project, and that the relationship between methodologies and these values can be highly opaque, which can make this trade-off hard to analyse, but I think things would go a lot better if people at least tried.


Reading code by asking questions

Peter Seibel wrote a piece about code reading recently. It’s a good piece which meshes well with my experience of code reading, and it got me thinking about how I do it.

I think there are three basic tenets of my code reading approach:

  1. The goal of code reading is to learn how to modify the code. Sure, your ultimate goal might be to understand the code in some abstract sense (e.g. because you want to use the ideas elsewhere), but ultimately code you don’t know how to modify is code you probably don’t understand as well as you think you do, and code you do know how to modify is code that you probably understand better than if you’d merely set out to understand it.
  2. The meaning of code is inextricably tied to its execution. In order to understand code you need to be able to follow its execution. You can do a certain amount of this in your head by manually tracing through things (and you will need to be able to), but you have a machine sitting in front of you designed to execute this code and you should be using it for that. For languages with a decent debugger, you even have a machine sitting in front of you which will execute the code and show you its working. For languages without a decent debugger (or setups where it’s hard to use one), you can still get a hell of a lot of mileage out of the humble print statement (there’s a small sketch of this after the list).
  3. Ask many small questions. Ignore everything you do not need to answer the current question.
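As a small illustration of the second tenet, here is a sketch in Python (with hypothetical names – parse_record isn’t from any real code base): if you want to know whether a branch you’re reading ever runs, and with what values, let the machine tell you.

    # Sketch: getting the machine to show its working while you read a
    # function you don't yet understand. parse_record is hypothetical.

    def parse_record(line):
        fields = line.split(",")
        if len(fields) < 3:
            # Does this branch ever fire on real input? A print statement
            # tells you when, and with what.
            print(f"short record: {line!r}")
            # Or drop into the debugger right here and poke around:
            # breakpoint()
            return None
        return {"id": fields[0], "name": fields[1], "value": fields[2]}

    for line in ["1,widget,9.99", "oops"]:
        print(parse_record(line))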

Many people completely rewrite code in order to understand it. This is an extreme form of learning to modify it – modification through rewriting. Sometimes this is fine – especially for small bits of code – but it’s pretty inefficient and isn’t going to be much help to you once you get above a few hundred lines. Most code bases you’ll need to read are more than a few hundred lines.

What you really want to be doing is learning through changing a couple lines at a time, because then what you are learning is which lines to change to achieve specific effects.

An extremely good tool for learning this is fixing bugs. A bug is almost the optimal small question to ask: something is wrong. What? How do I fix it? You need to read enough of the code to eliminate possibilities and find out where things are actually wrong, and you’ve got a sufficiently specific goal that you shouldn’t get too distracted by bits you don’t need.

If you don’t have that, here are some other small questions you might find useful to ask:

  1. How do I run this code?
  2. How do I write a test for this code? This doesn’t necessarily have to be in some fancy testing framework (though it’s often nice if it is!). It can just be a script or other small program you can run which will tell you if something goes wrong.
  3. Pick a random line in the codebase. It doesn’t have to be uniformly at random – a good algorithm might be to pick one half of a branch in a largish function in a module you’re interested in. How do I get that line to execute? Stick an assert false in there to make sure the answer is right. If there’s a test suite with low coverage, try finding an uncovered line and writing a test which executes it (see the sketch after this list).
  4. Pick a branch. What will happen if I invert this branch?
  5. Pick a constant somewhere. If I wanted to make this configurable at runtime, what would I need to do?
  6. Specific variations on “How can I break this code?”. E.g. in C “Can I get this code to read from/write to an invalid address?” is often a useful question. In web applications “Can I cause a SQL/XSS/other injection attack?” is one. This forces you to figure out how data flows into the system through its various endpoints, and then if you succeed in finding such a bug you get to figure out how to fix it.
  7. How can I write a test to verify this belief I have about the code?
  8. What would I need to change to break this test?
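To show what a couple of these look like in practice, here is a sketch in Python around a hypothetical function (apply_discount stands in for whatever you’re actually reading): question 3’s “does this line ever execute?” probe and question 7’s belief-checking test.

    # Sketch: questions 3 and 7 applied to a hypothetical function.

    def apply_discount(price, customer):
        if customer.get("loyalty_years", 0) > 5:
            # Question 3: does this branch ever execute? Temporarily stick an
            # assert False in here and run the code/tests to find out.
            # assert False, "yes, the loyalty branch runs"
            return price * 0.9
        return price

    # Question 7: tests that pin down beliefs I have about the code.
    def test_long_term_customers_get_ten_percent_off():
        assert apply_discount(100, {"loyalty_years": 10}) == 90

    def test_new_customers_pay_full_price():
        assert apply_discount(100, {}) == 100

    test_long_term_customers_get_ten_percent_off()
    test_new_customers_pay_full_price()
    print("beliefs confirmed")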