Category Archives: programming

Randomized exhaustive exploration of a many-branching tree

Attention conservation notice: This post is largely an obvious in retrospect solution to an obscure problem.

It’s a problem I spent a lot of time considering wrong or complicated solutions to, though (this is what the persistent data structure I was considering was for, but it’s not needed in the end), so I thought it was worth writing up the correct solution.

The problem is this: I have a tree of unknown structure. Each node in the tree is either a leaf or has an ordered sequence of N (variable per node) children (technically the leaves are just a special case of this with N=0, but leaves have special properties as we’ll see in a moment).

I want to do an exhaustive search of the leaves of this tree, but there’s a limitation: when walking this tree I may only walk complete branches starting from the root and finishing at a leaf. I cannot back up and choose another branch.

The tree is potentially very large, to the point where an exhaustive search might be impossible. So this should actually be treated as an anytime algorithm, where we’re iteratively yielding new paths to a leaf which some external algorithm is then doing something with. When yielding those we want to guarantee:

  • We never yield the same path twice
  • We know when we have explored the entire tree and stop

It’s easy to do a lexicographic exploration of the branches in sorted order – just build a trail of the choices you took. At each step, increment the last choice. If that causes it to exceed the number of choices available, pop it from the end and increment the choice before it instead (repeating as necessary). When making an unknown choice, always choose zero.
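
To make that concrete, here’s a minimal Python sketch of the lexicographic walk. The num_children() and child(i) methods are a hypothetical node interface, invented purely to have something to call, and each iteration restarts its walk from the root as the problem requires:

    def lexicographic_paths(root):
        """Yield the path to every leaf, in lexicographic order of choices."""
        trail = []   # choices taken so far on the current branch
        widths = []  # number of children available at each of those choices
        while True:
            # Replay the trail from the root, then extend it with zeros
            # until we reach a leaf.
            node = root
            for choice in trail:
                node = node.child(choice)
            while node.num_children() > 0:
                widths.append(node.num_children())
                trail.append(0)
                node = node.child(0)
            yield tuple(trail)
            # Increment the last choice, popping any positions that are
            # already at their final value.
            while trail and trail[-1] + 1 >= widths[-1]:
                trail.pop()
                widths.pop()
            if not trail:
                return
            trail[-1] += 1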

But there’s a reason you might not want to do this: Under certain reasonable assumptions, you may end up spending a lot of time exploring very deep leaves.

Those reasonable assumptions are this:

  • All else being equal, we should prefer leaves which are closer to the root (e.g. because each walk to a child is expensive, so it takes less time to explore them and we can explore more leaves per unit time)
  • The expected depth of the tree below any node, given uniformly random choices, is usually much lower than the maximum depth of the tree below that point.

Under these assumptions the optimal solution for exploring the tree is instead to just randomly walk it: You will avoid the long paths most of the time because of the expectation assumption, and will do a better job of exploring close to the root.

Unfortunately, pure random exploration does not satisfy our requirements: We might generate duplicate paths, and we don’t know when we’ve explored all the nodes.

So what we want is a hybrid solution that exhaustively searches the tree without duplicates, but chooses its paths as randomly as possible.

Anyway, there turns out to be a very simple way of doing this. It’s as follows:

We maintain a list of paths to unexplored nodes, bucketed by their length. Initially this contains just the empty path, which leads to the root.

We then repeat the following steps until there are no unexplored nodes in the list:

  1. Pick an unexplored node of minimal length uniformly at random and remove it from the list
  2. Walk to that node
  3. Walk randomly below that node. Every time we make a random choice of child, add to the list of unexplored nodes the path to this point extended by each of the other choices we could have taken.
  4. When we reach a leaf node, yield that path to the controlling algorithm.

Every path we walk goes through a node we have not previously explored, so must be unique. When the list is empty we’ve explored all the nodes and are done.
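Here’s a minimal Python sketch of that loop, using the same hypothetical num_children()/child(i) node interface as above. Paths are plain tuples here for clarity; the cons list representation discussed below is the cheaper way to build them:

    import random
    from collections import defaultdict

    def random_exhaustive_paths(root):
        """Yield the path to every leaf exactly once, preferring shallow nodes."""
        # Unexplored paths, bucketed by length; initially just the empty path.
        unexplored = defaultdict(list)
        unexplored[0].append(())
        while unexplored:
            # 1. Pick an unexplored path of minimal length uniformly at random.
            length = min(unexplored)
            bucket = unexplored[length]
            i = random.randrange(len(bucket))
            bucket[i], bucket[-1] = bucket[-1], bucket[i]
            path = bucket.pop()
            if not bucket:
                del unexplored[length]
            # 2. Walk to that node.
            node = root
            for choice in path:
                node = node.child(choice)
            # 3. Walk randomly below it, recording every sibling choice we
            #    could have taken but didn't as a new unexplored path.
            while node.num_children() > 0:
                choice = random.randrange(node.num_children())
                for other in range(node.num_children()):
                    if other != choice:
                        unexplored[len(path) + 1].append(path + (other,))
                path += (choice,)
                node = node.child(choice)
            # 4. We've reached a leaf: yield its path to the controlling algorithm.
            yield path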

The correct storage format for the paths is just an immutable singly linked list stored with the last node in the path first – appending a node to the path is just consing it onto the front of the list. You’re going to have to do O(n) work to reverse it before replaying it from the root, but that’s OK: walking back down the tree along that path is O(n) work anyway, with a higher constant factor.
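In Python terms the representation is just nested pairs, newest choice first (names are mine, purely for illustration):

    # A path is either None (the empty path) or a pair (last_choice, rest).
    def extend(path, choice):
        # O(1): cons the new choice onto the front.
        return (choice, path)

    def choices_from_root(path):
        # O(n): reverse into root-to-leaf order before replaying the walk.
        out = []
        while path is not None:
            choice, path = path
            out.append(choice)
        out.reverse()
        return out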

The reason for my original need for a data structure with better forward iteration than that was because the algorithm I was originally considering always picked the unexplored node with the shortlex minimal (lexicographically first amongst those with the shortest length) path to it, but there’s no real reason you need to do that and it makes things much more expensive.

If you want to instead do weighted random walks of the tree and are prepared to relax the requirements around exploring shorter paths first a bit, there’s an alternative algorithm you can use (and I’m actually more likely to use this one in practice because I’ll need to do weighted draws) which works as follows:

We build a shadow tree in parallel every time we walk the real tree. The shadow tree is mutable, and each node in it is in one of three states: Unexplored, Active or Finished. It starts with a single root node which is Unexplored.

We maintain the following invariant: Any node in the shadow tree which is not Finished has at least one child node that is not Finished. In particular, leaf nodes are always Finished.

We then decide how to walk the tree as follows: We always walk the shadow tree in parallel, matching our walk of the real tree. Whenever we walk to a node that is Unexplored, we do one of two things:

  1. If the corresponding real node is a leaf, we immediately mark it as Finished (and our walk is now done)
  2. If the corresponding real node has N children, we mark it as Active and create N Unexplored child nodes for it.

Once our walk has completed (i.e. we’ve reached a leaf), we look at the list of nodes in the shadow tree that we encountered and walk it backwards. If all of the children of the current node are Finished, we mark the node as Finished as well. If not, we stop the iteration.

We can now use the shadow tree to guide our exploration of the real tree as follows: Given some random process for picking which child of the real tree we want to navigate to, we simply modify the process to never pick a child whose corresponding shadow node is Finished.

We can also use the shadow tree to determine when to stop: Once the root node is marked as Finished, all nodes have been explored and we’re done.
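Here’s a minimal Python sketch of that bookkeeping, once more assuming the hypothetical num_children()/child(i) interface. It uses a uniform draw over the non-Finished children, but that is exactly the place to substitute whatever weighted draw you actually want:

    import random

    UNEXPLORED, ACTIVE, FINISHED = "unexplored", "active", "finished"

    class ShadowNode:
        def __init__(self):
            self.state = UNEXPLORED
            self.children = None  # filled in when the node becomes Active

    def shadow_guided_walk(shadow_root, real_root):
        """One walk of the real tree, guided and recorded by the shadow tree."""
        assert shadow_root.state != FINISHED, "the whole tree has been explored"
        path, visited = [], [shadow_root]
        shadow, real = shadow_root, real_root
        while True:
            if shadow.state == UNEXPLORED:
                n = real.num_children()
                if n == 0:
                    shadow.state = FINISHED  # leaves are Finished immediately
                    break
                shadow.state = ACTIVE
                shadow.children = [ShadowNode() for _ in range(n)]
            # Never pick a child whose shadow node is Finished.
            options = [i for i, c in enumerate(shadow.children)
                       if c.state != FINISHED]
            choice = random.choice(options)
            path.append(choice)
            shadow, real = shadow.children[choice], real.child(choice)
            visited.append(shadow)
        # Walk the encountered shadow nodes backwards, marking each as Finished
        # while all of its children are Finished; stop at the first that isn't.
        for node in reversed(visited[:-1]):
            if all(c.state == FINISHED for c in node.children):
                node.state = FINISHED
            else:
                break
        return path

The controlling loop then just keeps calling this until shadow_root is Finished.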

Note that these two approaches don’t produce the same distribution at all: If one child has many more branches below it than another, the initial algorithm will spend much more time below that child, while the shadow tree algorithm will explore both equally until it has fully exhausted one of them (even if this results in much deeper paths). Which of these is better probably depends on the use case.

You can use the shadow tree to bias the exploration further if you like. e.g. if there are any unexplored child nodes you could always choose one of them. You could also store additional information on it (e.g. you could track the average depth below an explored node and use that to guide the search).

Anyway, both of these algorithms are pretty obvious, but it took me a long time to see that they were pretty obvious so hopefully sharing this is useful.

As an aside: You might be wondering why I care about this problem.

The answer is that it’s for Hypothesis.

As described in How Hypothesis Works, Hypothesis is essentially an interactive byte stream fuzzer plus a library for doing structured testing on top of this. The current approach for generating that bytestream isn’t great, and has a number of limitations. In particular:

  • At the moment it fairly often generates duplicates
  • It cannot currently detect when it can just exhaustively enumerate all examples and then stop (e.g. if you do @given(booleans()) at the moment the result isn’t going to be great).
  • The API for specifying the byte distribution is fairly opaque to Hypothesis and as a result it limits a number of opportunities for improved performance.

So I’m exploring an alternative implementation of the core where you specify an explicit weighted distribution for what the next byte should be and that’s it.

This then opens it up to using something like the above for exploration. The nodes in the tree are prefixes of the bytestream, the leaves are complete test executions. Each non-leaf node has up to 256 (it might be fewer because not every byte can be emitted at every point) child nodes which correspond to which byte to emit next, and then we can treat it as a tree exploration problem to avoid all of the above issues.

I’ve currently got a branch exploring just changing the API and keeping the existing implementation to see how feasible rebuilding Hypothesis this way is. So far the answer is that it’ll be a fair bit of work but looks promising. If that promise pans out, one of the above (probably the shadow tree implementation) is likely coming to a Hypothesis near you at some point in 2017.

(Why yes, it is a programming post! I do still do those. I know these are a bit thin on the ground right now. If you want more, consider donating to my Patreon and encouraging me to write more).

This entry was posted in Hypothesis, programming, Python.

A simple purely functional data structure for append and forward traversal

For reasons I’ll get into in a later post I’m interested in a data structure with the following properties:

  • It represents a sequence of values
  • It is extremely cheap to append a value to the end in a way that returns an entirely new sequence. The same sequence will typically be extended many times, so no copy-on-write or append tricks are allowed here (no vlists for you); it has to be genuinely cheap
  • Forward traversal is very cheap, especially when the traversal stops after only a small number of elements.
  • It needs to be very simple to implement in a variety of languages including C

The fact that you’re appending to the end and traversing from the beginning is the annoying bit.

The data structure to beat here is to just use a normal linked list, interpreted as storing the list backwards, and then reverse on traversal. If we expected to do a full traversal each time that would probably be pretty close to optimal, but the fact that we often want to traverse only a small number of elements makes it annoying.

(Spoilers for later post: these represent partial paths through a rose tree. The reason why we want fast traversal forwards over a couple of elements is that we want to compare them lexicographically. The reason why we want fast appends is that every time we make a choice we create path prefixes for all the choices we could have taken and didn’t. The reason it needs to be simple is that I might want to use this in Hypothesis, which has “be easily portable to new languages” as an explicit design goal).

The “right” data structure here is probably a finger tree. I don’t want to use a finger tree for one very simple reason: Finger trees are a complete pain to implement in Haskell, let alone another language.

It turns out there’s a data structure that achieves these goals. I haven’t benchmarked it, so it’s unclear at present how useful it is in practice, but for the specific requirements I have it’s probably pretty good. It’s also pretty simple to implement.

The initial idea is that we store our elements as a left-leaning binary tree. That is, a binary tree where the left hand branch always points to a complete binary tree.

Appending then consists of either creating a new split node with the current tree on the left and the new leaf on the right (if the existing tree is already complete), or appending the element to the right hand branch (if the tree is not yet complete then the right hand branch must not yet be complete). i.e. we just do the natural thing for adding it to the rightmost side of a left-leaning binary tree.

This is already pretty good. Append is log(n) in principle, but in practice that side of the binary tree will usually be very cheap. Traversing a binary tree from left to right is pretty straightforward.

There are only two problems with it:

  • Forward traversal isn’t that good, because we have to descend into the deepest part of the tree to get to the first element.
  • We need to decide what sort of bookkeeping we want to do to check the size of the tree.

We’ll solve the second problem first. We could do this by size annotating every tree node (that was my original idea), but we actually don’t need to, because we only need to know the total size of the tree to work out the number of elements below any node: The left hand branch always contains the largest power of two strictly smaller than the total number of elements in the tree, and the right hand branch contains whatever is left.

So instead of size annotating every node we work with bare trees and define a wrapper type which is simply a tree and a count of elements. For some workloads this won’t always be an improvement (because it may reduce sharing), but for most it should be.
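To make the shape concrete, here’s a sketch in Python of the structure as described so far, without the change that fixes the first problem (that comes next). A sequence is a count plus a bare tree, a tree is either a bare element or a pair of subtrees, and every size is recomputed from the count on the way down. The names and representation are mine rather than anything from the post:

    EMPTY = (0, None)

    def is_power_of_two(n):
        return n & (n - 1) == 0

    def largest_power_of_two_below(n):
        # Largest power of two strictly smaller than n, for n >= 2.
        p = 1 << (n.bit_length() - 1)
        return p // 2 if p == n else p

    def append(seq, x):
        count, tree = seq
        if count == 0:
            return (1, x)
        return (count + 1, _append_tree(tree, count, x))

    def _append_tree(tree, size, x):
        if is_power_of_two(size):
            # The tree is complete: start a new split node with it on the left.
            return (tree, x)
        left, right = tree
        left_size = largest_power_of_two_below(size)
        return (left, _append_tree(right, size - left_size, x))

    def index(seq, i):
        count, tree = seq
        assert 0 <= i < count
        while count > 1:
            left, right = tree
            left_size = largest_power_of_two_below(count)
            if i < left_size:
                tree, count = left, left_size
            else:
                tree, count, i = right, count - left_size, i - left_size
        return tree

Note how index(seq, 0) has to walk all the way down the left spine, which is exactly the first problem.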

The first problem is also straightforward to solve (the solution is not one I had seen before, but I’m sure it’s not novel): We “hoist” the first element of the left subtree up to the split node. So where a binary tree normally either stores its elements only at the leaves or stores an element between its two subtrees, here we store the leftmost element that “should” be in the left subtree. Note that this means that our balance condition is actually that the number of elements in the left subtree is one fewer than the number of elements in our right subtree (because the element on the current node is implicitly part of the left subtree).

The nice feature of this is that it means that the first element is accessible in O(1). Indeed, the i’th element is accessible in O(min(i, log(n))) – if i < log(n) then finding the i’th element is just a question of reading down the left hand side of the tree, same as it would be if this were a normal cons list.

The following Haskell code (yes I can still write Haskell, thank you very much) provides a minimal implementation of this concept:

This turns out to be very close to an existing purely functional data structure which I hadn’t previously known about – Okasaki’s Purely Functional Random-Access Lists. It’s somewhat simpler in places, mostly because of the operations it doesn’t need to bother supporting, but the binary nodes which contain the first child element are the same, and the power-of-two structure is similar but different (in ways where Okasaki’s might just be better – I’m not totally sure, but this is definitely simpler).

I’m still hoping that the CS Theory Stack Exchange will give me a better answer here, but to be honest this is probably more than good enough for my purposes and any further work on this question is basically yak shaving.

This entry was posted in programming.

Review of a book that reviews (code) reviewing

In an earlier version of my recent software development advice post, I said the following:

I also think this will not significantly reduce the number of bugs in your software. Code review is just not a cost effective bug finding tool compared to almost anything else.

I appear to be wrong in this claim, if the literature is to be believed. I don’t have a strong reason to doubt the literature, and on examination I’m not actually certain why I believed the contrary (it appears to be cached knowledge), so I’m going to change my opinion on this: Code review is probably a very cost effective way of finding bugs if you do it well, and may be reasonable at it even if you do it badly.

The thing that’s updated my opinion on the subject is reading “Best Kept Secrets of Peer Code Review”, after Richard Barrell sent me a link to a chapter from the middle of it reviewing the literature.

With a couple caveats (the title being terrible definitely counted amongst them), the book is pretty good. There are two chapters worth skipping, but that’s more a function of the fact that the book is from 2006 than anything else (GitHub’s pull requests, while far from a stellar code review system, are in all ways better than what was widely available back in 2006 and for the common path probably not much worse than the state of the art).

The authors clearly have an agenda and the book is there to promote that, but that’s OK. I have an agenda too – it’s very hard to have expertise without also having an agenda in support of that expertise. And even if I believe they’re overstating the case, the case they present is reasonably convincing. It’s also refreshingly empirical – most software engineering advice is full of anecdotes and opinions and, while this book absolutely contains some of those, it’s also got a really pleasant amount of actual research backed by actual data. The data is often not of amazing quality in that many of the studies are quite small scale so the statistics are likely underpowered, but that’s still a better standard of evidence than I’m used to.

It’s also pleasantly short. The whole book is only around 160 pages, and I read it over the course of an afternoon.

Anyway, that’s the review section of this post done. Now it’s time for the review section.

I have a couple take homes from this book which are currently in the state of “Plausible sounding statements that I will think about further and probably upgrade most of to actual beliefs”.

  • Code review is a very cost effective way of finding defects compared to manual QA of the application – it both has a higher rate of finding them per unit of time and also finds bugs that QA were probably never going to find.
  • As your notion of “defect” broadens to include things like your coworkers having no idea what you wrote down, the above statement becomes stronger.
  • Code review should be done by individuals (possibly multiple individuals working in parallel) rather than teams working together. As well as being more expensive per unit time, team code review seems to be less effective in absolute terms at finding defects.
  • The generally believed opinion that a 10 line change gets more useful review than a 500 line change seems to be actually true but too generous – the actual cut off seems to be more like 200 lines. It’s unclear to me whether this means 200 lines added or a 200 line diff, but I believe it means the former.
  • Code review effectiveness drops off dramatically as the time taken approaches about an hour. It’s unclear to me if you can fix this by walking away from the review and coming back later. This may also be the origin of the 200 lines limit – it may be impossible for most people to effectively review code at a rate much higher than 200 lines / hour.
  • Code review benefits greatly from a careful first reading of code before you do any reviewing (and doing this stage more slowly is better. Presumably this hits diminishing returns at some point).
  • Code review benefits greatly from being structured with checklists.

The checklist one is interesting, and I am actually pretty convinced by it: The overall literature on checklists helping construct reliable processes seems to be reasonably convincing (but my opinion here is very trend driven and I have not actually read any primary research on this subject. It conforms to my biases about the world though, and thus must be true), and the specific data presented in favour of checklists for reviewing is moderately convincing. This is the advice that is the most likely to make an actual practical difference to how I review code in future.

The advice on checklists is interesting in particular in how it’s constructed: They strongly recommend not having just a “standard” checklist, but actively curating one over time: When something is found in review that seems like it would make a good check list item, add it to the list! When an item hasn’t found anything useful in review in a while (say 6 months), think about removing it.

Here are a couple example check list items from the book that look generally useful to me (the example list this is from is much longer):

  • Documentation: All subroutines are commented in clear language
  • Documentation: Describe what happens with corner-case input.
  • Testing: Unit tests are added for new code paths or behaviours.
  • Testing: Unit tests cover errors and invalid parameters.
  • Testing: Don’t write new code that is already implemented in an existing, tested API.
  • Error handling: Invalid parameter values are handled properly early in the subroutine.

There are plenty more. The book suggests that checklists should really only have about the top 10-20 items that you’ve found most useful over time (the sample checklist has 25 items, so apparently they didn’t test the invalid input case here).

One interesting thing worth highlighting is that at least some checklist items may be worth automating out of the review entirely. e.g. “Testing: Unit tests are added for new code paths or behaviours” would mostly be better handled by enforcing a code coverage metric I think.

As well as improving the quality of the review itself, the book highlights another interesting application of checklists: It doesn’t have to just be the reviewer who goes over them! Doing a sort of self review by following the checklist yourself before submitting seems to be nearly as effective at removing bugs as the review itself would be (it is unclear to me if that remains true if you skip the review altogether – my feeling is that people are likely to get sloppy at the checklist if they imagine someone is not going to be checking it later). This presumably speeds up the review process a lot by reducing the number of back and forth exchanges required to pass it.

One suggestion they made off the back of this self review (which seems to come from the Personal Software Process world) is that as well as maintaining a global checklist for reviews, it might be worth individuals maintaining personal checklists, where each person keeps their own list of things that it’s worth them pre-checking because they often make a mistake in that area which then comes up in review.

Anyway, I’m really glad I read the book. I can’t swear to its correctness, but it’s an interesting and plausible perspective that I’ll be thinking about further.

PS. If you’d like to fuel my book habit, I have a public Amazon wishlist. I’ll happily review any books you buy me…

This entry was posted in Books, programming.

Why you should use a single repository for all your company’s projects

In my post about things that might help you write better software, a couple points are controversial. Most of them I think are controversial for uninteresting reasons, but monorepos (putting all your code in one repository) are controversial for interesting reasons.

With monorepos the advice is controversial because monorepos are good at everything you might reasonably think they’re bad at, and per-project repositories are bad at everything you might reasonably think they’re good at.

First a note: When I am talking about a monorepo I do not mean that you should have one undifferentiated ball o’ stuff where all your projects merge into one. The point is not that you should have a single project, but that one repository can contain multiple distinct projects.

In particular a monorepo should be organised with a number of subdirectories that each look more or less like the root directory of what would have otherwise been its own repo (possibly with additional directories for grouping projects together, though for small to medium sized companies I wouldn’t bother).

The root of your monorepo should have very little in it – A README, some useful scripts for managing workflows maybe, etc. It certainly shouldn’t have a “src” directory or whatever the local equivalent where you put your code is.

But given this distinction, I think almost everyone will benefit from organising their code this way.

For me the single biggest advantages of this sort of organisation style are:

  1. It is impossible for your code to get out of sync with itself.
  2. Any change can be considered and reviewed as a single atomic unit.
  3. Refactoring to modularity becomes cheap.

There are other advantages (and some disadvantages that I’ll get to later), but those three are necessary and sufficient for me: On their own they’re enough to justify the move to a monorepo, and without them I probably wouldn’t be convinced it was worth it.

Let’s break them down further:

It is impossible for your code to get out of sync with itself

This is relatively straightforward, but is the precursor to the other benefits.

When you have multiple projects across multiple repos, “Am I using the right version of this code?” is a question you always have to be asking yourself – if you’ve made a change in one repository, is it picked up in a dependent one? If someone else has made changes in two repositories have you updated both of them? etc.

You can partly fix this with tooling, but people mostly don’t and when they do the tooling is rarely perfect, so it remains a constant low grade annoyance and source of wasted time.

With a monorepo this just doesn’t happen. You are always using the right version of the code because it’s right there in your local repo. Everything is kept entirely self-consistent because there is only a single coherent notion of version.

Any change can be considered and reviewed as a single atomic unit

This is essentially a consequence of the “single consistent version” feature, but it’s an important one: If you have a single notion of version, you have a single notion of change.

This is a big deal for reviewing and deploying code. The benefit for deploying is straightforward – you now just have a single notion of version to be put on a server, and you know that version has been tested against itself.

The benefit for reviews is more interesting.

How many times have you seen a pull request that says “This pull request depends on these pull requests in another repo”?

I’ve seen it happen a lot. When you’ve got well-factored libraries and/or services then it’s basically routine to want to add a feature to them at the same time as adding a consumer of that feature.

As well as adding significant overhead to the process by splitting it into multiple steps where it’s logically a single step, I find this makes code review significantly worse: You often end up either having to review something without the context needed to understand it or constantly switching between them.

This can in principle be fixed with better tooling to support multi-repo review, but review tools don’t seem to support that well at the moment, and at that point it really feels like trying to emulate a monorepo on top of a collection of repos just so you can say you don’t have one.

Refactoring to modularity becomes cheap

This is by far the biggest advantage of a monorepo for me, and is the one that I think is the most counter-intuitive to people.

People mostly seem to want multiple repositories for modularity, but multiple repositories actually significantly hurt modularity compared to a monorepo.

There are two major reasons for this:

The first builds on the previous two features of a monorepo: Because multiple repositories add friction, if repositories are the basis of your modularity then every time you want to make things more modular by extracting a library, a sub-project, etc. you are adding significant overhead: Now you have two things to keep in sync. Additionally it’s compounded by the previous problems directly: If I have some code I want to extract from one project into a common library, how on earth do I juggle the versions and reviews for that in a way that isn’t going to mess everyone up? It’s certainly manageable, but it’s enough of a pain that you will be reluctant to do what should be an easy thing.

The second is that by enforcing boundaries between projects which it is difficult to refactor across, you end up building up cruft along the boundary: If project A depends on project B, what will tend to happen if they are in separate repos is that A will build up an implicit “B-compatibility layer” of things that should have happened in B but it was just slightly too much work. This both means that different users of B end up duplicating work, and also often makes it much harder to make changes to B because you’d then need to change all the different weird compatibility layers at once.

In a monorepo this doesn’t happen: If you want to make an improvement to B as part of a change to A, you just do so as part of the same change – there’s no problem (If B has a lot of other dependants there might be a problem if you’re changing rather than adding things, but the build will tell you). Everyone then benefits, and the cruft is minimised.

I’ve seen this borne out in practice multiple times, at both large and small scales: Well organised single repositories actually create much more modular code than trying to split it out into multiple repositories would allow for.

Reasons you might still want to have multiple repositories

It has almost always been my experience that multiple repositories are better consolidated, but there are a few exceptions.

The first is when your company’s code is partly but not entirely open source. In this case it is probably useful to have one or more repositories for your open source projects. This doesn’t make the problems of multiple repositories go away mind you, it just means that you can’t currently use the better system!

Similarly, if you’re doing client work where you are assigning copyright to someone else then you should probably keep per client repos separate.

Ideally I’d like to solve this problem with tooling that made it better to mirror directories of a monorepo as individual projects, but I’m not aware of any good systems that do that right now.

The other reason you might want to have multiple repositories is if your codebase is really large. The Linux kernel is about 15 million lines of code and works more or less fine as a single git repository, so that’s a rough idea of the scale of what “large” means (if you have non-text assets checked in you may hit that limit faster, but I’ve not seen that be much of a problem in practice).

This is another thing that should be fixable with tooling. To some extent it’s in the process of being fixed: Various large companies are investing in the ability of git and mercurial to handle larger scales.

If you do not currently have this problem you should not worry about having this problem. It will take you a very long time to get there, and if the tools haven’t improved sufficiently by the time you do then you can almost certainly afford to pay someone to customise them to your use case like the current generation of large companies are doing.

The final thing that might cause you to stick with multiple repos is the tooling you’ve built around your current multi repo set up. In the long run I think you’d still benefit from a monorepo, but if all your deploys, CI, etc. have multi repo assumptions baked into them then it can significantly increase the cost of migrating.

What to do now

Embrace the monorepo. The monorepo is your friend and wants you to be happy.

More seriously, the nice thing about this observation is that it doesn’t require you to go all in to get the benefits. All of the above benefits of having one repository also extend to having fewer repositories. So start reducing your repository count.

The first and easiest thing to do is just stop creating new repositories. Either create a new monorepo or designate some existing large project as the new monorepo by moving its current contents into a subdirectory. All new things that you’d previously have created a repository for now go as new directories in there.

Now move existing projects into a subdirectory as and when is convenient. e.g. before starting a large chunk of work that touches multiple repositories. Supposedly if you’re using git you can do this in a way that preserves their history, though when I’ve done it in the past I have typically just thrown away the history (or rather, kept the old repository around as read only for when I wanted to consult the history).

This may require some modifications to your deployment and build scripts to tie everything together, but it should be minor and most of the difficulty will come the first time you do it.

And you should feel the benefit almost immediately. Whenever I’ve done this it’s felt like an absolute breath of fresh air, and has immediately made me happier.

This entry was posted in programming.

Some things that might help you make better software

I’ve argued before that most software is broken because of economic reasons. Software which is broken because there is no incentive to ship good software is going to stay broken until we manage to change those incentives. Without that there will be no budget for quality, and nothing you can do is going to fix it.

But suppose you’re in the slightly happier place where you do have budget for quality? What then? What can you do to make sure you’re actually spending that budget effectively and getting the best software you could be getting out of it?

I don’t have an easy answer to that, and I suspect none exists, but I’ve been doing this software thing for long enough now that I’ve picked up some things that seem to help quality without hurting (and ideally helping) productivity. I thought it would be worth writing them down.

Many of them will be obvious or uncontroversial, but if you’re already doing all of them then your team is probably doing very well.

This is all based somewhat on anecdote and conjecture, and it’s all coloured by my personal focuses and biases, so some of it is bound to be wrong. However I’m pretty sure it’s more right than wrong and that the net effect would be strongly positive.

Without further ado, here is my advice.

Attitude

If you do not care about developing quality software you will not get quality software no matter what your tools and processes are designed to give you.

This isn’t just about your developers either. If you do not reward the behaviour that is required to produce quality software, you will not get quality software. People can read their managers’ minds, and if you say you want quality software but reward people for pushing out barely functioning rubbish, people are smart enough to figure out you don’t really mean that.

Estimated cost: Impossible to buy, but hopefully if you’re reading this article you’re already there. If you’re embedded in a larger context that isn’t, try creating little islands of good behaviour and see if you can bring other people around to your way of thinking.

Estimated benefit: On its own, only low to moderate – intent doesn’t do much without the ability to act – but it is the necessary precursor to everything else.

Controversy level: Probably most people agree with this, although equally it doesn’t feel like most people implement this. I imagine there are some people who think they can fix this problem if they just find the right process. Maybe they’re right, but I’ve never seen something even approaching such a process.

Automated Testing

Obviously I have quite a few thoughts about automated testing, so this section gets a lot of sub headings.

Continuous Integration

You need to be running automated tests in some sort of CI server that takes every ostensibly releasable piece of software and checks whether it passes the tests.

If you’re not doing this, just stop reading this article and go set it up right now, because it’s fundamental. Add a test that just fires up your website and requests the home page (or some equivalent if you’re not writing a website). You’ve just taken the first and most important step on the road from throwing some crap over the wall and seeing if anyone on the other side complains about it landing on them to actual functional software development.
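To give a sense of how small that first test can be, here’s a hedged sketch assuming a Flask-style application object called app living in a hypothetical myapp module (any web framework’s test client will do the same job):

    from myapp import app  # hypothetical: wherever your application object lives

    def test_home_page_responds():
        # The most basic smoke test there is: the site comes up and the
        # home page renders without blowing up.
        client = app.test_client()
        response = client.get("/")
        assert response.status_code == 200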

Estimated cost: Hopefully at worst a couple days initial outlay to get this set up, then small to none ongoing.

Estimated benefit: Look, just go do this already. It will give you a significant quality and productivity boost.

Controversy level: It would be nice to think this was uncontroversial. It’s certainly well established best practice, but I’ve worked at companies that don’t do it (a while ago), and a friend basically had to ramrod through getting this implemented at the company they’d joined recently.

Local Automated Testing

You need to be able to run a specific test (and ideally the whole test suite) against your local changes.

It doesn’t really matter if it runs actually on your local computer, but it does matter that it runs fast. Fast feedback loops while you work are incredibly important. In many ways the length of time to run a single test against my local changes is the biggest predictor of my productivity on a project.

Ideally you need to be able to select a coherent group of tests (all tests in this file, all tests with this tag) and run just those tests. Even better, you should be able to run just the subset of whatever tests you ask to run that failed last time. If you’re using Python, I recommend py.test, which supports these features. If you’re currently using unittest you can probably just start using it as an external runner without any changes to your code.

Estimated cost: Depends on available tooling and project. For some projects it may be prohibitively difficult (e.g. if your project requires an entire hadoop cluster to run the code you’re working on), but for most it should be cheap to free.

Estimated benefit: Similarly “look, just go do this already” if you can’t run a single test locally. More specific improvements will give you a modest improvement in productivity and maybe some improvement in quality if they make you more likely to write good tests, which they probably will.

Controversy level: Not very. I’ve only worked at one company where running tests locally wasn’t a supported workflow, and I fixed that, but workflows which support my slightly obsessive focus on speed of running a single test are rarely as good as I’d like them to be.

Regression Testing

The only specific type of automated testing that I believe that absolutely everybody should be doing is regression testing: If you find a bug in production, write a test that detects that bug before you try to fix the bug. Ideally write two tests: One that is as general as possible, one that is as specific as possible. Call them an integration and a unit test if that’s your thing.

This isn’t just a quality improvement, it’s a productivity improvement. Trying to fix bugs without a reproducible example of that bug is just going to waste your time, and writing a test is the best way to get a reproducible example.
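As an entirely made-up example, if production choked on a price containing a thousands separator, the pair of tests might look something like this:

    from myshop.prices import parse_price   # hypothetical code under test
    from myshop.orders import order_total   # hypothetical caller of it

    def test_parse_price_handles_thousands_separators():
        # As specific as possible: the exact bad input from the bug report
        # (prices in pence).
        assert parse_price("1,000.00") == 100000

    def test_order_totals_accept_formatted_prices():
        # As general as possible: the whole code path the user actually hit.
        assert order_total(["1,000.00", "0.99"]) == 100099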

Estimated cost: Zero, assuming you have local testing set up. This is just what you should be doing when fixing bugs because it’s better than the other ways of fixing bugs – it will result in faster development and actually fixed bugs.

Estimated benefit: Depends how much time you spend fixing bugs already, but it will make that process faster and will help ensure you don’t have to repeat the process. It will probably improve quality somewhat by virtue of preventing regressions and also ensuring that buggier areas of the code are better tested.

Controversy level: In theory, not at all. In practice, I’ve found many to most developers need continual reminders that this is a thing you have to do.

Code Coverage

You should be tracking code coverage. Code coverage is how you know code is tested. Code being tested is how you know that it is maybe not completely broken.

It’s OK to have untested code. A lot of code isn’t very important, or is difficult enough to test that it’s not worth the effort, or some combination of the two.

But if you’re not tracking code coverage then you don’t know which parts of your code you have decided you are OK with being broken.

People obsess a lot about code coverage as a percentage, and that’s understandable given that’s the easiest thing to get out of it, but in many ways it’s the least important part of code coverage. Even the percentage broken down by file is more interesting than that, but really the annotated view of your code is the most important part, because it tells you which parts of your system are not tested.

My favourite way to use code coverage is to insist on 100% code coverage for anything that is not explicitly annotated as not requiring coverage, which makes it very visible in the code if something is untested. Ideally every pragma to skip coverage would also have a comment with it explaining why, but I’m not very good about that.
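In coverage.py terms that policy looks something like the following (the function is a made-up example; # pragma: no cover is the standard exclusion marker):

    def dump_internal_state(state):  # pragma: no cover
        # Only ever called by hand from a debugger session, so not worth
        # testing, which is exactly the sort of reason that should
        # accompany every exclusion.
        for key, value in sorted(state.items()):
            print(key, "=", value)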

As a transitional step to get there, I recommend using something like diff-cover or coveralls which let you set up a ratcheting rule in your build that prevents you from decreasing the amount of code coverage.

Estimated cost: If your language has good tooling for coverage, maybe a couple hours to set up. Long-term, essentially free.

Estimated benefit: On its own, small, but it can be a large part of shifting to a culture of good testing, which will have a modest to large effect.

Controversy level: Surprisingly high. Of the companies I’ve worked at precisely zero have tracked code coverage (in one case there was a push for it but younger me argued against it. My opinions on testing have changed a lot over the years).

Property-based Testing

Property-based testing is very good at shifting the cost-benefit ratio of testing, because it somewhat reduces the effort to write what is effectively a larger number of tests and increases the number of defects those tests will find.

I won’t write too much about this here because I have an entire separate site about this.
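For a taste of it, a first Hypothesis test really is about this small:

    from hypothesis import given
    from hypothesis import strategies as st

    @given(st.lists(st.integers()))
    def test_sorting_is_idempotent(xs):
        # One property, many generated examples, and any failure gets
        # shrunk to a minimal counterexample automatically.
        assert sorted(sorted(xs)) == sorted(xs)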

Estimated cost: If you’re using a language with a good property based testing tool, about 15 minutes to install the package and write your first test. After that, free to negative. If you’re not, estimated cost to pay me to port Hypothesis to a new language is around £15-20k.

Estimated benefit: You will find a lot more bugs. Whether this results in a quality improvement depends on whether you actually care about fixing those bugs. If you do, you’ll see a modest to large quality improvement. You should also see a small to modest productivity improvement if you’re spending a lot of time on testing already.

Controversy level: Not very high, but niche enough that most people haven’t formed an opinion on it. Most people think property based testing is amazing when they encounter it. Some push back on test speed and non-determinism (both of which have partial to complete workarounds in Hypothesis at least)

Manual Testing

At a previous job I found a bug in almost every piece of code I reviewed. I used a shocking and complicated advanced technique to do this: I fired up the application with the code change and tried out the feature.

Manual testing is very underrated. You don’t have to have a dedicated QA professional on your team to do it (though I suspect it helps a lot if you do), but new features should have a certain amount of exploratory manual testing done by someone who didn’t develop them – whether it’s another developer, a customer service person, whatever. This will find both actual bugs and also give you a better idea of its usability.

And then if they do find bugs those bugs should turn into automated regression tests.

Estimated cost: It involves people doing stuff on an ongoing basis, so it’s on the high side because people are expensive, but it doesn’t have to be that high to get a significant benefit to it. You can probably do quite well with half an hour of testing of a feature that took days to develop. This may also require infrastructure changes to make it easy to do that can have varying levels of cost and difficulty, but worst case scenario you can do it on the live system.

Estimated benefit: You will almost certainly get a moderate quality improvement out of doing this.

Controversy level: Having QA professionals seems to be entirely against the accepted best practice in startups. The rest, similar to regression testing: Doing a bit of manual testing seems to be one of those things where people say “Of course we do that” and then don’t do it.

Version Control

You need to be using a version control system with good branching and merging. This is one of the few pieces of advice where my recommendation requires making a really large change to your existing workflow.

I hope that it’s relatively uncontroversial that you should be using version control (not everybody is!). Ideally you should be using good version control. I don’t really care if you use git, mercurial, fossil, darcs, whatever. We could get into a heated argument about which is better but it’s mostly narcissism of small differences at this point.

But you should probably move off SVN if you’re still on it and you should definitely move off CVS if you’re still on it. If you’re using visual source safe you have my sympathies.

The reason is simple: If you’re working on a team of more than one person, you need to be able to incorporate each other’s changes easily, and you need to be able to do that without trashing your own work. If you can’t, you’re going to end up wasting a lot of your time.

Estimated cost: Too project dependent to say. Importer tools are pretty good, but the real obstacle is always going to be the ecosystem you’ve built around the tools. At best you’re going to have a bad few weeks or months while people get used to the new system.

Estimated benefit: Moderate to large. Many classes of problems will just go away and you will end up with a much more productive team who find it much easier to collaborate.

Controversy level: Basically uncontroversial. Not as widespread as you might imagine, but not controversial. Once git started becoming popular, basically everywhere I’ve worked used it (with one exception for mercurial and one exception for Google’s interesting Perforce-ish system).

Monorepos

Use a single repository for all your code.

It’s tempting to split your projects into lots of small repos for libraries and services, but it’s almost always a bad idea. It significantly constrains your ability to refactor across the boundary and makes coordinating changes to different parts of the code much harder in almost every way, especially with most standard tooling.

If you’re already doing this, this is easy. Just don’t change.

If you’re not, just start by either creating or designating an existing repository as the monorepo and gradually move the contents of other repos into it as and when convenient.

The only exception where you probably need to avoid this is specific projects you’re open sourcing, but even then it might be worth developing them in the monorepo and mirroring them to some sort of external repo.

This point has proved controversial, so if you’re still unconvinced I have written a longer advocacy piece on why you should use a monorepo.

Estimated costs: Too project dependent to say, but can be easily amortised over time.

Estimated benefits: Every time you do something that would have required touching two repos at once, your life will be slightly easier because you are not paying coordination costs. Depends on how frequent that is, but experience suggests it’s at least a modest improvement.

Controversy level: High. This piece of advice is extremely love/hate. I think most of the people who love it are the ones who have tried it at least once and most of the people who hate it are those who haven’t, but that might be my biases speaking. It’s been pretty popular where I’ve seen it implemented.

Static Analysis

I do not know what the right amount of static analysis is, but I’m essentially certain that it’s not none. I would not be surprised to learn that the right amount was quite high and includes a type system of some sort, but I don’t know (I also would not be surprised to discover that it was not). However even very dynamic languages admit some amount of static analysis and there are usually tools for it that are worth using.

I largely don’t think of this as a quality thing though. It’s much more a productivity improvement.  Unless you are using a language that actively tries to sabotage you (e.g. C, JavaScript), or you have a really atypically good static analysis system that does much more work than the ones I’m used to (I’m actually not aware of any of these that aren’t for C and/or C++ except for languages with actual advanced type systems), static analysis is probably not going to catch bugs much more effectively than a similar level of good testing.

But what it does do is catch those bugs sooner and localise them better. This significantly improves the feedback loop of development and stops you wasting time debugging silly mistakes.

There are two places that static analysis is particularly useful:

  1. In your editor. I use syntastic because I started using vim a decade ago and haven’t figured out how to quit yet, but your favourite editor and/or IDE will likely have something similar (e.g. The Other Text Editor has flycheck). This is a really good way of integrating lightweight static analysis into your workflow without having to make any major changes.
  2. In CI. The ideal number of static analysis errors in your project is zero (this is true even when the static analysis system has false positives in my opinion, with the occasional judicious use of ‘ignore this line’ pragmas), but you can use the same tricks as with code coverage to ratchet them down to zero from wherever you’re starting.

Most languages will have at least a basic linting tool you can use, and with compiled languages the compiler probably has warning flags you can turn on. Both are good sources of static analysis that shouldn’t require too much effort to get started with.

Estimated cost: To use it in your editor, low (you can probably get it set up in 10 minutes). To use it in your CI, higher but still not substantial. However depending on the tool it may require some tuning to get it usable, which can take longer.

Estimated benefit: Depends on the tool and the language, but I think you’ll get a modest productivity boost from incorporating static analysis and may get a modest to large quality boost depending on the language (in Python I don’t think you’ll get much of a quality benefit. In C I think you’ll get a huge one even with just compiler warnings).

Controversy level: Varies entirely depending on level of static analysis. Things that you could reasonably describe as “linting” are low. Things that require something closer to a type system much higher. Tools with a high level of false positives also high. You can definitely find an uncontroversial but still useful level of static analysis. I’ve seen it at a moderate subset of the companies I worked for.

Production Error Monitoring

You should have some sort of system that logs all errors in production to something more interactive than a log file sitting on a server somewhere. If you’re running software locally on end users’ computers this may be a bit more involved and should require end user consent, but if you’re writing a web application we’re all used to being pervasively spied on in everything we do anyway so who cares?

I’ve used and like Sentry for this. There are other options, but I don’t have a strong opinion about it.

Estimated cost: Depends on setup, but getting started with sentry is easy and it doesn’t cost a particularly large amount per month (or you can host the open source version for the cost of a server).

Estimated benefit: Much better visibility of how broken your software is in production is the best thing for making your software less broken in production. It will also speed up your debugging process a lot when you do have production errors to fix, so it’s probably a net win in productivity too if you spend much time debugging production errors (and you probably do).

Controversy level: Low but surprisingly it’s not nearly as widely implemented as you might expect. Another thing that is becoming more common though I think.

Assertions

I am a big fan of widespread use of assertions, and of leaving them on in production code.

The main reason for this is simple: The single biggest factor in ease of debugging is making sure that the point at which the error is reported is as close as possible to the point at which the error occurs. Assertions are a very good way to do this because they turn a failure of understanding into a runtime error: If your code is not behaving in a way you’d expect, that becomes an error immediately, and it is much easier to debug than finding the downstream thing that actually went wrong at some point later.

It also has a huge benefit when doing property-based testing, because they greatly increase the scope of properties tested – problems that might not have been noticed by the explicit test become much more visible if they trigger an assertion failure.

Input validation, while technically not an assertion, also has the same effect – a function which checks its arguments rather than silently doing the wrong thing when given a wrong argument will be significantly easier to debug.
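A small made-up example of the sort of thing I mean:

    def allocate_refund(total_pence, weights):
        """Split a refund across items in proportion to their weights."""
        assert total_pence >= 0, total_pence
        assert weights and all(w > 0 for w in weights), weights
        scale = sum(weights)
        shares = [total_pence * w // scale for w in weights]
        # Give the rounding remainder to the first item so nothing is lost.
        shares[0] += total_pence - sum(shares)
        # The invariant we believe holds; if our understanding is wrong we
        # find out here, not three subsystems downstream.
        assert sum(shares) == total_pence, (shares, total_pence)
        return shares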

John Regehr has a good post on the care and feeding of assertions that I recommend for further reading.

Estimated cost: Low if you just start adding them in as you develop and edit code. Requires a bit of careful thinking about what the code is doing, but that’s no bad thing.

Estimated benefit: Modest. This won’t be life changingly good, but I have frequently been grateful for a well placed assertion in my code preventing what would otherwise be a much more confusing bug.

Controversy level: People don’t really seem to have an opinion on this one way or another, but it’s not a common habit at all. I’ve not seen it be widespread at any company I’ve worked for.

Code Review

I think all projects with > 1 person on them should put all code changes through code review.

Code review seems to be a fairly cost effective defect finding tool according to the literature. I previously believed this not to be the case, but I’ve done some reading and I changed my mind.

But regardless of whether you find defects, it will ensure two very important things:

  1. At least one other person understands this code. This is useful both for bus factor and because it ensures that you have written code that at least one other person can understand.
  2. At least one other person thinks that shipping this code is a good idea. This is good both for cross checking and because it forces you to sit down and think about what you ship. This is quite important: Fast feedback loops are good for development, but slow feedback loops for shipping make you pause and think about it.

Over time this will lead to a significantly more maintainable and well designed piece of software.

Estimated cost: You need to get a code review system set up, which is a modest investment and may be trivial. I can’t really recommend anything in this space as the only things I’ve used for this are Github and proprietary internal systems. Once you’ve got that, the ongoing cost is actually quite high because it requires the intervention of an actual human being on each change.

Estimated benefit: It’s hard to say. I have never been part of a code review process that I didn’t think was worth it, but I don’t have a good way of backing that up with measurements. It also depends a lot on the team – this is a good way of dealing with people with different levels of experience and conscientiousness.

Controversy level: Fairly uncontroversial, though at least amongst small companies it used to be weird and unusual. At some point in my career it went from “nobody does this” to “everybody does this”. I think a combination of GitHub pull requests and acknowledgement that most of the cool big companies do it seems to have taken this from a niche opinion to widespread practice in a remarkably short number of years.

Continuous Delivery

Another part of localising things to when they went wrong is that ideally once something has passed code review it will ship as soon as possible. Ideally you would ship each change as its own separate release, but that isn’t always practical if you’re e.g. shipping client side software.

This helps ensure that when something goes wrong you have a very good idea of what caused it because not that much changed.

Another important part of this is that when a release goes out you should always be able to roll it back easily. This is essential if you want to make releasing low cost, which is in turn essential for having this sort of frequent release.

A thing I have never worked with personally but have regarded with envy is a staged roll-out system: one which first rolls a release out to a small fraction of the customer base and then gradually ratchets up until it reaches 100%, rolling back automatically or semi-automatically if anything seems to have gone wrong in the process.
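
I haven’t built one of these, but the core gating decision is simple enough to sketch. Everything below is hypothetical – the function name, the hard-coded percentage – and in a real system the percentage would come from a config service and be ratcheted up while you watch error rates:

    import hashlib

    def in_rollout(user_id, rollout_percent):
        """Put a user deterministically into the first `rollout_percent`% of buckets.

        Hashing keeps the decision stable for any given user while the
        percentage is gradually ratcheted up towards 100.
        """
        digest = hashlib.sha256(user_id.encode("utf-8")).digest()
        bucket = int.from_bytes(digest[:2], "big") % 100
        return bucket < rollout_percent

    # e.g. serve the new release to roughly 5% of users to start with:
    use_new_release = in_rollout("user-12345", 5)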

Estimated cost: The transitional period from infrequent to frequent deliveries can be a bit rough – you’ll need to spend time automating manual steps, looking for inefficiencies, etc. Take baby steps, gradually improving your frequency over time, and you can spread this cost out fairly easily though.

Estimated benefit: A modest quality improvement, and quite a large improvement in debugging time if you currently end up with a lot of broken releases. The release process changes you have to make to make this work will probably also be a significant net time saver.

Controversy level: I’m not sure. It hasn’t seemed that controversial where I’ve seen it implemented, but I think larger companies are more likely to hate it.

Auto formatting and style checking

Code review is great, but it has one massive failure mode. Consider Wadler’s law:

In any language design, the total time spent discussing a feature in this list is proportional to two raised to the power of its position.
  0. Semantics
  1. Syntax
  2. Lexical syntax
  3. Lexical syntax of comments

Basically the same thing will happen with code review. People will spend endless time arguing about style checking, layout, etc.

This stuff matters a bit, but it doesn’t matter a lot, and the back and forth of code review is relatively expensive.

Fortunately computers are good at handling it. Just use an auto-formatter plus a style checker. Enforce that these are applied (style checking is technically a subset of static analysis but it’s a really boring subset and there’s not that much overlap in tools).

In Python land I currently use pyformat and isort for auto-formatting and flake8 for style checking. I would like to use something stronger for formatting – pyformat is quite light touch in terms of how much it formats your code. clang-format is extremely good and is just about the only thing I miss about writing C++. I look forward to yapf being as good, but I don’t currently find it to be there yet (I need to rerun a variant of the bug finding mission I did on it last year at some point). gofmt is nearly the only thing about Go I am genuinely envious of.

Ideally you would have your entire project be a fixed point of the code formatter. That’s what I do for Hypothesis. If you haven’t historically done that it can be a pain though. Many formatting tools can be applied based on only the edited subset of the code. If you’re lucky enough to have one of those, make that part of your build process and have it automatically enforced.
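
As a sketch of what a “fixed point” check might look like in practice: a build step that reformats every file in memory and fails if anything would change. format_source below is a hypothetical hook standing in for whatever formatter you actually run (pyformat, yapf, and so on):

    import pathlib
    import sys

    def format_source(text):
        """Hypothetical hook: call out to your formatter of choice here."""
        return text

    def main():
        dirty = []
        for path in pathlib.Path(".").rglob("*.py"):
            source = path.read_text()
            if format_source(source) != source:
                dirty.append(str(path))
        if dirty:
            print("Not a fixed point of the formatter:")
            print("\n".join("  " + name for name in dirty))
            return 1
        return 0

    if __name__ == "__main__":
        sys.exit(main())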

Once you have this, you can now institute a rule that there should be no formatting discussion in code review because that’s the computer’s job.

Here’s a great post from GDS about this technique and how it’s helped them.

Estimated cost: Mostly tool dependent, but if you’re lucky it’s basically free. Also some social cost – some people really dislike using style checkers (and to a lesser degree auto-formatters) for reasons that don’t make much sense to me. I personally think the solution is for them to get over it, but it may not be worth the effort of fighting over it.

Estimated benefit: From the increased consistency of your code, small but noticeable. The effect on code review is moderate to large, both in terms of time taken and quality of review.

Controversy level: Surprisingly high. Some people really hate this advice. Even more people hate this advice if you’re not running a formatter that guarantees style-conforming code (e.g. I’m not running one on Hypothesis because none of the Python code formatters can guarantee that yet). I’ve only really seen this applied successfully at work once.

Documentation in the Repository

You should have a docs section in your repository with prose written about your code. It doesn’t go in a wiki. Wikis are the RCS of documentation. We’ve already established you should be using good version control and a monorepo, so why would you put your documentation in RCS?

Ideally your docs should use something like sphinx so that they compile to a (possibly internally hosted) site you can just access.
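
If you do go the Sphinx route, the initial setup is genuinely small – sphinx-quickstart will generate a docs/conf.py roughly along these lines (the names here are placeholders):

    # docs/conf.py
    project = "yourproject"
    author = "Your Team"
    extensions = []                # e.g. add "sphinx.ext.autodoc" for API docs
    exclude_patterns = ["_build"]  # don't treat build output as source
    html_theme = "alabaster"       # the default theme that ships with Sphinx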

It’s hard to keep documentation up to date, I know, but it’s really worth it. At a bare minimum I think your documentation should include:

  • Up to date instructions for how to get started developing with your code
  • Detailed answers to questions you find yourselves answering a lot
  • Detailed post-mortems of major incidents with your product

For most projects they should also include a change log which is updated as part of each pull request/change list/whatever.

It may also be worth using the documentation as a form of “internal blogging” where people write essays about things they’ve discovered about the problem domain, the tools you’re using or the local style of work.

Estimated cost: Low initial setup. Writing documentation does take a fair bit of time though, so it’s not cheap.

Estimated benefit: This has a huge robustness benefit, especially every time your team changes structure or someone needs to work on a new area of the code base. How much benefit you’ll derive varies depending on that, but it’s never none – if nothing else, everybody forgets things they don’t do often, but also the process of writing the documentation can hugely help the author’s understanding.

Controversy level: Another case of “most people probably agree this is a good idea but don’t do it”. Unless you’ve got someone pushing for it a lot, documentation tends to be allowed to slide. I’ve never really seen this work at any of the companies I’ve worked for.

Plan to always have more capacity than work

Nick Stenning made an excellent point on this recently: If your team is always working at full capacity then delays in responding to changes will skyrocket, even if they’re coming in at a rate you can handle.

As well as that, it tends to mean that maintenance tasks that can greatly improve your productivity will never get done – almost every project has a backlog of things that are really annoying the developers that they’d like to fix at some point and never get around to. Downtime is an opportunity to work on that.

This doesn’t require some sort of formal 20% time arrangement, it just requires not trying to fit a quart into a pint pot. In particular, if you find you’ve scheduled more work than got done, that’s not a sign that you slightly overestimated the amount you could get done; it’s a sign that you scheduled significantly too much work.

Estimated Cost: Quite expensive. Even if you don’t formally have 20% time, you’re probably still going to want to spend about 20% of your engineering capacity this way. It may also require significant experimentation to get your planning process good enough to stop overestimating your capabilities.

Estimated Benefit: You will be better able to respond to changes quickly, and your team will almost certainly get more done than they previously did at 100% capacity.

Controversy level: Fairly high. Almost everywhere I’ve worked the team has consistently planned more work than they have capacity for.

Projects should be structured as collections of libraries

Modularity is somewhat overrated, but it’s not very overrated, and the best way of getting it is to structure things as libraries. The best way to organise your project is not as a big pot of code, but as a large number of small libraries with explicit dependencies between them.

This works really well, is easy to do, and helps keep things clean and easy to understand while providing push back against it all collapsing into a tangled mess.

There are systems like bazel that are specifically designed around structuring your project this way. I don’t have very fond memories of the internal system it grew out of, and I’ve not used the open source version yet, but it is a good way of enforcing a good build structure. Otherwise the best way to do this is probably just to create subdirectories and use your language’s standard packaging tools (which probably include a development mode for local use – e.g. pip install -e if you’re using Python).
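
As a sketch of what this can look like in Python – all the names below are made up – each subdirectory is its own installable package with an explicit dependency on its siblings, installed in editable mode while developing:

    # Layout:
    #   myproject/
    #     core/       <- the "myproject-core" library
    #     storage/    <- "myproject-storage", which depends on myproject-core
    #
    # storage/setup.py:
    from setuptools import find_packages, setup

    setup(
        name="myproject-storage",
        version="0.0.1",
        packages=find_packages(),
        install_requires=[
            "myproject-core",  # the dependency between libraries is explicit
        ],
    )

    # For local development, install both in editable mode:
    #   pip install -e ./core -e ./storage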

Some people may be tempted to do this as microservices instead, which is a great way to get all the benefits of libraries alongside all the benefits of having an unreliable network and a complicated and fragile deployment system. There are some good reasons to use microservices in some situations, but using them purely as a way to achieve modularity is just a bad idea.

Estimated cost: Quite low. Just start small – factor out bits of code and start new code in its own library. Evolve it over time.

Estimated benefit: Small to moderate productivity enhancement. Not likely to have a massive impact on quality, but it does make testing easier so it should have some.

Controversy level: Fairly low. I’m not sure people have an opinion on this one way or the other.

Ubiquitous working from home

I’m actually not a huge fan of massively distributed teams, mostly because of time zones. It tends to make late or early meetings a regular feature of people’s lives. I could pretend to be altruistic and claim that I disapprove of that because it’s bad for people with kids, which it is, but I also just really hate having to do those myself.

But the ability to work from home is absolutely essential to a productive work environment, for a number of reasons:

  1. Open plan offices are terrible. They are noisy distraction zones that make it impossible to get productive work done. Unfortunately, this battle is lost. For whatever reason the consensus is that it’s more cost effective to cram developers in at 50% efficiency than it is to pay for office space. This may even be true. But working from home generally solves this by giving people a better work environment that the company doesn’t have to pay for.
  2. Requiring physical presence is a great way for your work force to be constantly sick! People can and should take sick days, but if people cannot work from home then they will inevitably come in when they feel well enough to work but are nevertheless contagious. This will result in other people becoming ill, which will either result in them coming and spreading more disease or staying home and getting nothing done. Being able to work from home significantly reduces the incentive to come in while sick.
  3. Not having physical access to people will tend to improve your communication patterns to be lower interrupt and more documentation driven, which makes them work better for everyone both in the office and not.

I do not know what the ideal fraction of work from home to work in the office is, but I’d bet money that if most people are not spending at least two days per week working from home then they would benefit from spending more. Also, things will tend to work better as the fraction increases: If you only have a few people working from home at any given point, the office culture will tend to exclude them. As you shift to it being more normal, work patterns will adapt to accommodate them better.

Estimated cost: There may be some technical cost to set this up – e.g. you might need to start running a VPN – but otherwise fairly low. However there may be quite a lot of political and social push back on this one, so you’re going to need a fair bit of buy in to get it done.

Estimated benefit: Depends on team size and current environment, but potentially very large productivity increase.

Controversy level: Fairly low amongst developers, fairly high amongst the non-developers who you’ll probably need to get sign off on it.

No Long Working Hours

Working longer work weeks does not make for more productive employees; it just results in burnout, less effective work, and more time spent in the office failing to get anything done. Don’t have work environments that encourage it.

In fact, it’s better if you don’t have work environments that allow it, because it will tend to result in environments where it goes from optional to implicitly mandatory due to who gets rewarded. It’s that reading managers’ minds thing again.

Estimated cost: Same as working from home: low, but may require difficult-to-obtain buy-in. Will probably also result in a transitional period of lower productivity while people are still burned out but less able to paper over it.

Estimated benefit: High productivity benefits, high quality benefits. Exhausted people do worse work.

Controversy level: High. Depending on who you’re talking to this is either obviously the correct thing to do or basically communism (there may also be some people who think it’s basically communism and that’s why they like it).

Good Work Culture

Or the “don’t work with jerks” rule.

People need to be able to ask questions without being afraid. People need to be able to give and receive feedback without being condescending or worrying that the other side will blow up at them or belittle them. People need to be able to take risks and be seen to fail without being afraid of how much it will cost them.

There are two major components to this:

  1. Everyone needs to be on board with it and work towards it. You don’t need everyone to be exquisitely polite with everyone else at all times – a certain amount of directness is usually quite beneficial – but you do need to be helpful, constructive and not make things personal.
  2. Some people are jerks and you should fire them if they don’t fix their ways.

It is really hard to do the second one, and most people don’t manage it, but it’s also really important. Try to help them fix their ways first, but be prepared to let them go if you can’t, because if you’ve got a high performing jerk on the team it may well be that they’re high performing only, or in large part, because they’re making everyone else perform worse. Even if they really are that good, they’re probably not good enough to justify the reduced productivity of everyone else.

Note: This includes not just developers but also everyone else in the company.

Estimated cost: High. Changing culture is hard. Firing people is hard, especially if they’re people who as individual performers might look like your best employees.

Estimated benefit: Depending on how bad things are currently, potentially extremely high. It will bring everyone’s productivity up and it will improve employee retention.

Controversy level: Another “Not controversial but people don’t actually do it”. I’ve mostly seen the end result of jerks leaving of their own volition and everyone breathing a sigh of relief and experiencing a productivity boost.

Good Skill Set Mixing

You generally want to avoid both silos and low bus factors.

In order to do that, it’s important to have both overlapping and complementary skills on your team: A good rule of thumb is that any task should have at least two people who can do it, and any two people should have a number of significant tasks where one of them is obviously better suited to work on it than the other. The former is much more important than the latter, but both are important.

Having overlapping skills is important because it increases your resilience and capacity significantly: If someone is out sick or on holiday you may be at reduced capacity but there’s nothing you can’t do. It also means there is always a second perspective you can get on any problem you’re stuck with.

Having complementary skills is important because that’s how you expand capabilities: Two people with overlapping skills are much better able to work together than two people with nothing in common, but two people with identical skills will not be much better working together than either of them individually. On the other hand two people working together who have different skill sets can cover the full range of either of their skills.

This is a hard one to achieve, but it will tend to develop over time if you’re documenting things well and doing code review. It’s also important to bear in mind while hiring.

Estimated cost: Hard to say because what you need to do to achieve it is so variable, but it will probably require you to hire more people than you otherwise would to get the redundant skill sets you need, so it’s not cheap.

Estimated benefit: Huge improvement in both total team and individual productivity.

Controversy level: Not exactly controversial, but tends not to happen in smaller companies due to people not seeing the benefit. Where it happens it tends to happen by accident.

Hire me to come take a look at what might be wrong

I do do software consulting after all. This isn’t what I normally consult on (I’m pretty focused on Hypothesis), but if you’d like the help I’d be happy to branch out, and after accidentally writing 5000 words I guess I clearly have a few things to say on the subject.

Drop me an email if you’re interested.

Estimated cost: My rates are very reasonable.

Estimated benefit: You’re probably better placed to answer this one than I am, but if this document sounds reasonable to you but you’re struggling to get it implemented or want some other things to try, probably quite high.

Controversy level: Not at all controversial. Everyone thinks this is a great idea and you should do it. Honest.
