How learning Scala made me a better programmer

People will often tell you how learning functional programming / C / their current favourite language paradigm will make you a better programmer even if you don’t actually use it.

I mean, yes, I suppose this is true. It’s very hard for learning new things about programming to not teach you something general purpose, and if a language is unfamiliar enough that it’s not just a syntax change you’re almost certainly going to learn new ways of thinking along with it. I don’t know that it’s the most efficient way to become a better programmer, but it’s definitely not a bad one.

With Scala, though, while there are a bunch of neat features which I still miss and occasionally emulate badly, I can also point to a much more concrete and useful way in which it made me a better programmer. Unfortunately a) I don’t think it’s likely to work any more and b) you’re not going to like it.

When I was learning Scala back in… 2006 I think it might have been? Certainly early 2007 at the latest. Anyway, sure, I was learning interesting things about static typing, object orientation and functional programming (maybe not much of the last one; I already knew Haskell and ML tolerably well), but what I actually learned the most from was an entirely different feature of the language.

Specifically that the compiler was ridiculously buggy.

I assume things are much better than they used to be these days, and I’m sure my memory is exaggerating how bad they were at the time, but it feels like back then the development cycle was compile, run tests, report compiler bug. I think I probably hit more compiler bugs than most people – I’m not sure why; maybe I just had a tendency to push at the edge cases. It could also be that I was just nicer about reporting them than most people, I don’t know. Either way, there was a reasonable period of time when I had reported more bugs on the tracker than anyone else (I think Paul Phillips was the one who ousted me. Certainly once he came along he blew my track record out of the water).

Back in the dim and distant past when I wrote “No seriously, why Scala?” someone asked the question “Why isn’t a buggy compiler a show stopper?”. There were some pretty good answers, but really the accurate one is just “Well… because it’s not?”. You pretty quickly learn to get used to a buggy compiler and integrate its bugginess into your workflow – not that you come to depend on it, but it just becomes one of those things you know you have to deal with and compensate for. It potentially slows you down, but as long as you have good habits to prevent it tripping you up, and the rest of the language speeds you up enough to compensate, this isn’t a major problem.

I’m not saying I’d actively seek out a buggy compiler today. Obviously if you can choose to not be slowed down this is better than being slowed down, and no matter how good your procedures for compensating for it are, eventually you’re going to be hit by a compiler bug in production. If I even need to say it, there are clearly downsides to writing production code with a buggy compiler.

But from the point of view of my development as a programmer it was amazing.

Why?

Well, partly just because being good at writing a decent bug report is a nice skill to have to endear you to a maintainer and this is where I learned a lot of those skills (though it took me being on the wrong end of bad bug reports to properly internalise them).

But mostly I think just because it was really useful having that much practice finding and submitting bugs. In much the same way that we spend more time reading than writing code, we also spend more time finding bugs than writing bugs. Having lots of practice at this turns out to be super helpful.

Compiler bugs have an interesting character to them. Most good advice about debugging tells you not to assume that the bug you’re experiencing is in the compiler, or even your system libraries, and that’s in large part because it indeed rarely is, so you tend not to experience this character too often.

What is that character?

It’s simple: You cannot trust the code you have written. You cannot deduce the bug by reading through the code until you get a wrong answer. The code is lying to you, because it doesn’t do what you think it does.

To an extent this is always true. Your brain does not include a perfect implementation of the language spec and of all the libraries, so the code is always capable of doing something different than you think it does, but with compiler bugs you don’t even have the approximation to the truth you normally rely on.

“Code is data” is a bit of a truism these days, but with a compiler bug it’s actually true. Your code is not code, it’s simply the input to another program that is producing the wrong result. You are no longer debugging your code, you are manipulating your code to debug something else.

This is interesting because it forces you to think about your code in a different way. It forces you to treat it as an object to be manipulated instead of a tool you are using to manipulate. It gives you a fresh perspective on it, and one that can be helpful.

How do you debug a compiler bug? Well. Sometimes it’s obvious and you just go “Oh, hey, this bit of code is being compiled wrong”. You copy and paste it out, create a fresh test case and tinker with it a little bit and you’re done.

This will probably not happen for your first ten compiler bugs.

Fortunately there’s a mechanical procedure for doing this. For some languages there’s even literally a program to implement this mechanical procedure. I find it quite instructive to do it manually, but that’s what people always say about tedious tasks that are on the cusp of being automated. Still, I’m glad to have done it.

What is this mechanical procedure?

Simple. You create a test case for the thing that’s going wrong (this might just be “try to compile your code” or it might be “run this program that produced wrong output”. For the sake of minimalism I prefer not to use a unit testing framework here). You check out a fresh branch (you don’t have to do this in source control but there’s going to be a lot of undo/redo so you probably want to). You now start deleting code like it’s going out of fashion.
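As a rough illustration of a test case that doesn’t need a unit testing framework, here is a minimal sketch in Python. Everything in it is an assumption for the sake of the example: the `reproduces` helper is hypothetical, and the “compiler” is just a child Python process that prints a made-up error message, standing in for whatever command actually shows your bug.

```python
import subprocess
import sys
from typing import List


def reproduces(command: List[str], marker: str) -> bool:
    """Run `command` and report whether `marker` appears in its output."""
    result = subprocess.run(command, capture_output=True, text=True)
    return marker in (result.stdout + result.stderr)


# Hypothetical stand-in for a real compiler invocation: a child process
# that writes an invented crash message to stderr.
fake_compiler = [
    sys.executable,
    "-c",
    "import sys; sys.stderr.write('internal error: fake crash')",
]

if __name__ == "__main__":
    print(reproduces(fake_compiler, "internal error"))
```

In real use you would swap `fake_compiler` for the actual build or run command, and `marker` for some piece of output that only appears when the bug does.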

The goal is to simply delete as much code as possible while still preserving the compiler bug. You can start with crude deletion of files, then functions, etc. You’ll have to patch up the stuff that depends on it usually, but often you can just delete that too.

The cycle goes:

  1. Delete
  2. Run test
  3. If the test still demonstrates the problem, hurrah. Go back to step 1 and delete some more.
  4. If the test no longer demonstrates the problem, that’s interesting. Note this code as probably relevant, undelete it, and go delete something else.

Basically keep iterating this until you can no longer make progress.
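The cycle above can be sketched in a few lines of Python. This is a simplified greedy version that works on a list of lines, not a description of any particular tool, and it assumes you supply a `still_fails` function (a hypothetical name) that answers “does this candidate still demonstrate the problem?”. In the toy example that question is faked: the “bug” shows up whenever both the `a` and `b` lines are present.

```python
from typing import Callable, List


def minimize(lines: List[str], still_fails: Callable[[List[str]], bool]) -> List[str]:
    """Greedily delete chunks of `lines` while `still_fails` stays True."""
    assert still_fails(lines), "the starting program must demonstrate the bug"
    chunk = max(len(lines) // 2, 1)  # start with crude, large deletions
    while True:
        i, progress = 0, False
        while i < len(lines):
            candidate = lines[:i] + lines[i + chunk:]  # step 1: delete
            if still_fails(candidate):                 # step 2: run test
                lines, progress = candidate, True      # step 3: keep the deletion
            else:
                i += chunk                             # step 4: undelete, move on
        if chunk > 1:
            chunk //= 2   # switch to finer-grained deletions
        elif not progress:
            return lines  # a full pass of single-line deletions removed nothing


def toy_bug(candidate: List[str]) -> bool:
    # Stand-in for "compile and check": both lines must be present.
    return "a" in candidate and "b" in candidate


if __name__ == "__main__":
    print(minimize(["x", "a", "y", "z", "b", "w"], toy_bug))
```

On real code, deleting arbitrary lines usually breaks the build in uninteresting ways, which is why doing it by hand means deleting whole files and functions and patching up whatever depended on them.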

When you stop making progress you can either go “Welp, that’s small enough” (generally I regard small enough to be a single file under a few hundred lines, ideally less) or you can try some other things. e.g.

  1. Inlining imported files
  2. Inlining functions
  3. Manually constant folding arguments to functions (i.e. if we only ever call f(x, y, 1) in a program, remove the last argument to f and replace it with 1 in the function body)
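To make the third transformation concrete, here is a small before-and-after sketch in Python. The function `f` and its body are invented for illustration; the only point is that when every call site passes the same last argument, you can drop the parameter and write the constant directly into the body.

```python
# Before: every call in the program passes 1 as the last argument.
def f_before(x, y, z):
    return (x + y) * z


before = [f_before(1, 2, 1), f_before(3, 4, 1)]


# After: the last argument is removed and replaced by 1 in the body.
def f_after(x, y):
    return (x + y) * 1


after = [f_after(1, 2), f_after(3, 4)]
```

The program behaves the same, but there is now one less thing for the bug to depend on, and one less thing in the final test case.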

The point being that you’re thinking in terms of operations on your code which are likely to preserve the bug. Finding those operations will help you understand the bug and guide your minimization process. Then at the end of it you will have a small program demonstrating it which is hopefully small enough that the bug is obvious.

Is this how I normally debug programs? No way. It’s a seriously heavyweight tool for most bugs. Most bugs are shallow and can be solved with about 10 minutes of reading the code until it’s obvious why it’s wrong.

Even more complicated bugs I tend not to break this out for. What I do take from this though is the lesson that you can transform your program to make the bug more obvious.

Sometimes though, when all else fails, this is the technique I pull out. I can think of maybe half a dozen times I’ve done it for something that wasn’t a compiler bug, and it’s been incredibly useful each time.

All those half dozen times were for a very specific class of bug. That specific class of bug being “ruby”. There’s a hilarious thing that can happen in ruby where namespacing isn’t really the cool thing to do and everything messes around with everyone else’s internal implementation. This potentially leads to some really bizarre spooky interaction at a distance (often involving JSON libraries. *shakes fist*). This technique proved invaluable in getting to the bottom of these. For example, there was the time I discovered that if your require order was not exactly right, having a JSON column in datamapper would cause everything using JSON to break. That was fun.

But even when I’m not explicitly using these techniques, it feels like a lot of my debugging style was learned in the trenches of the Scala compiler. There’s a certain ramp up when debugging where you start with intuition and pile on increasingly methodical techniques as the bugs get more mysterious, and I think exposure to a lot of really mysterious bugs helped me learn that much earlier in my career than I otherwise would have.

It’s possible that the fact that it was a compiler was irrelevant. It feels relevant, but maybe I’d have learned variations on the technique at any point when I was daily interacting with a large, buggy piece of software. But to date, the 2007-2008 era Scala compiler is the best example I’ve had of working with such, so it’s entirely possible I’d have never learned that skill otherwise, and that would have been a shame because it’s one that’s served me well on many other projects.
