Category Archives: rambling nonsense

How hard can it be?

There are two types of people in the world:

  1. People who assume jobs they haven’t done and don’t understand are hard
  2. People who assume jobs they haven’t done and don’t understand are easy
  3. People who try to divide the world into overly simple two item lists

Joking aside, there’s definitely a spectrum of attitudes in terms of how you regard jobs you don’t understand.

Developers seem very much to cluster around the “jobs I don’t understand are easy” end (“Not all developers” you say? Sure. Agreed. But it seems to be the dominant attitude, and as such it drives a lot of the discourse). It may be that this isn’t just developers but everyone. It seems especially prevalent amongst developers, but that may just be because I’m a developer so this is where I see it. At any rate, this is about the bits that I have observed directly, not the bits that I haven’t, and about the specific way it manifests amongst developers.

I think this manifests in several interesting ways. Here are two of the main ones:

Contempt for associated jobs

Have you noticed how a lot of devs regard ops as “beneath them”? I mean it just involves scripting a couple of things. How hard is it to write a bash script that rsyncs some files to a server and then restarts Apache?? (Note: If your deployment actually looks like this, sad face).

What seems to happen with devs and ops people is that the devs go “The bits where our jobs overlap are easy. The bits where our jobs do not I don’t understand, therefore they can’t be important”.

The thing about ops is that their job isn’t just writing the software that does deployment and similar. It’s asking questions like “Hey, so, this process that runs arbitrary code passed to it over the network…. could it maybe not do that? Also if it has to do that perhaps we shouldn’t be running it as root” (Lets just pretend this is a hypothetical example that none of us have ever seen in the wild).

The result is that when developers try and do ops, it’s by and large a disaster. Because they think that the bits of ops they don’t understand must be easy, they don’t understand that they are doing ops badly. 

The same happens with front-end development. Back-end developers will generally regard front-end as a trivial task that less intelligent people have to do. “Just make it look pretty while I do the real work”. The result is much the same as ops: It’s very obvious when a site was put together by a back-end developer.

I think to some degree the same happens with front-end developers and designers, but I don’t have much experience of that part of the pipeline so I won’t say anything further in that regard.

(Note: I am not able to do the job of an ops person or the job of a front-end person either. The difference is not that I know that their job is hard therefore I can do it. The difference is that I know that their job is hard so I don’t con myself into thinking that I can do it as well as they can. The solution is to ask for help, or at least if you don’t don’t pretend that you’ve done a good job).

Buzzword jobs

There seems to be a growing category of jobs that are basically defined by developers going “Job X: How hard can it be?” and creating a whole category out of doing that job like a developer. Sometimes this genuinely does achieve interesting things: Cross-fertilisation between domains is a genuinely useful thing that should happen more often.

But often when this happens it’s at the expense of the actual job the developers are trying to replace being done badly, and a lot of the things that were important about the job are lost.


  1. “Dev-ops engineer” – Ops: how hard can it be? (Note: There’s a lot of legit stuff that also gets described as dev-ops. That tends to be more under the heading of cross-fertilisation than devs doing ops. But a lot of the time dev-ops ends up as devs doing ops badly)
  2. “Data scientist” – Statistics: How hard can it be?
  3. “Growth hacker” – Marketing: How hard can it be? (actually I’m not sure this one is devs’ fault, but it seems to fit into the same sort of problem)

People are literally creating entire job categories out of the assumption that the people who already do those jobs don’t really know what they’re doing and aren’t worth learning from. This isn’t going to end well.


The main thing I want people to take from this is “This is a dick move. Don’t do it”. Although I’m sure there are plenty of jobs that are not actually all that hard, most jobs are done by people because they are hard enough that they need someone dedicated to doing them. Respect that.

If you really think that another profession could benefit from a developer insight because they’re doing things inefficiently and wouldn’t this be so much better with software then talk to them. Put in the effort to find out what their job involves. Talk to them about the problems they face. Offer them solutions to their actual problems and learn what’s important. It’s harder than just assuming you know better than them, but it has the advantage of being both the right thing to do and way less likely to result in a complete disaster.

This entry was posted in life, programming, rambling nonsense on by .

Different types of overheads in software projects

I mentioned this concept on Twitter in a conversation with Miles Sabin and Eiríkr Åsheim last night, and it occurred to me I’ve never written up this idea. I call it my quadratic theory of software projects.

It’s one I originally formulated in the context of program languages, but I’ve since decided that that’s over-simplistic and really it’s more about the whole project of development. It probably even applies perfectly well to things that are not software, but I’m going to be focusing on the software case.

The idea is this: Consider two properties. Call them “effort” and “achievement”, say. If I wanted to attach a more concrete meaning to those, we could say that “effort” is the number of person hours you’ve put into the project and “achievement” is the number of person hours it could have taken an optimal team behaving optimally to get to this point, but the exact meaning of them doesn’t matter – I only mention to give you an idea of what I’m thinking of with these terms.

The idea is this: If you plot on a graph, with achievement on the x axis and the amount of effort it took you to get there on the y axis, what you will get is roughly quadratic.

This isn’t actually true, because often there will be back-tracking – the “oh shit that feature is wrong” bit where you do a bunch of work and then realise it wasn’t necessary. But I’m going to count that as achievement too: You developed some stuff, and you learned something from it.

It’s also probably not empirically true. Empirically the graph is likely to be way more complicated, with bits where it goes surprisingly fast and bits where it goes surprisingly slow, but the quadratic is a useful thought tool for thinking about this problem.

Why a quadratic?

Well, a quadratic has three parts. We’ve got \(y = A + B x + C x^2\). In my model, \(A, B, C \geq 0\).

And in this context, each of those three parts have a specific meaning:

The constant component (A) is the overhead that you had to get started in the first place – planning, familiarising yourself with the toolchain, setting up servers, etc.

The linear factor (B) is how hard it is to actually make progress – for example, if you’re developing a web application in C++ there’s an awful lot of difficulty in performing basic operations, so this factor could be quite high. Other factors that might make it high are requiring a detailed planning phase for every line of code, requiring a 10:1 lines of test code to lines of application code, etc.

The quadratic factor (C) is the interesting one – constant and linear overhead are in some sense “obvious” features, but the quadratic part is something that people fail to take into account when planning. The quadratic overhead is how much you have to think about interactions with what you’ve already done. If every line of code in my code-base affects every other line in my code-base, then I have to deal with that: every line I write, I have to pay an overhead for every line I’ve already written. If on average a line of code interacts with only 10% of the other lines in the project, then I have to pay 10% of that cost, but it’s still linear in the size of the code base (note: I’m implicitly assuming here that lines of code is a linear function of achievement. I think in reality it’s going to be more complicated than that, but this whole thing is an oversimplification so I’m going to ignore that). When you have to pay a cost that is linear in your current progress, the result is that the total amount of cost you’ve paid by a given point is quadratic in your current progress (this is because calculus).

To use an overused word, the quadratic factor is essentially a function of the modularity of your work. In a highly modular code base where you can safely work on part of it without having any knowledge of most of the rest, the quadratic factor is likely to be very low (as long as these parts are well correlated with the bits you’re going to need to touch to make progress! If you’ve got a highly modular code base where in order to develop a simple feature you have to touch half the modules, you’re not winning).

There are also other things that can contribute to this quadratic factor. e.g. the amount that you have to take into account historical context: If a lot of the reasons why things are done is historical, then you have a linear amount of history you need to take into account to do new work. These all essentially work out as the same sort of thing though: The fraction of what you’ve already done you need to take into account in order to do new things.

So here’s the thing: Your approach to development of a project essentially determines these values. A lot of different aspects will influence them – who your team members are, what language you’re working in, how many of you there are, how you communicate, whether you’re doing pair programming, whether you’re doing test driven development, how you’re doing planning, etc and etc. Almost everything you could call “development methodology” factors in somehow.

And if you compare two development methodologies you’d find in active use for a given problem, it’s going to be pretty rare that one of them is going to do better (i.e. have a lower value) on all three of these coefficients: Some approaches might have a lower constant and linear overhead but a remarkably high quadratic overhead, some approaches might have a very high constant overhead but a rather low linear and quadratic, etc. Generally speaking, something which is worse on all three is so obviously worse that people will just stop doing it and move on.

So what you end up doing is picking a methodology based on which constants you think are important. This can be good or bad.

The good way to do this is to look at your project size and pick some constants that make sense for you. For small projects, the constant costs will dominate, for medium projects the linear costs will dominate, and large projects the quadratic costs will dominate. So if you know your project is just a quick experiment, it makes sense to pick something with low linear and constant costs and high quadratic costs, because you’re going to throw it away later (of course, if you don’t throw it away later you’re going to suffer for that). If you know your project is going to be last a while, it makes sense to front-load on the constant costs if you you can reduce the quadratic cost. In between, you can trade these off against eachother at different rates – maybe gain a bit on linear costs by increasing the quadratic cost slightly, etc.

The bad way to do it is to automatically discount some of these as unimportant. If you just ignore the quadratic cost as not a thing you will find that your projects get mysteriously bogged down once you hit a certain point. If you’re too impatient to pay constant costs and just leap in and start hacking, you may find that the people who sat down and thought about it for a couple hours first end up sailing past you. If you think that scalability and the long-term stability of the project is all that matters then people who decided that day to day productivity also mattered will probably be years ahead of you.

Sadly, I think the bad way to do it is by far the more common. I’ll grant it’s hard to predict the future of a project, and that the relationship between methodologies and these values can be highly opaque, which can make this trade-off hard to analyse, but I think things would go a lot better if people at least tried.

This entry was posted in programming, rambling nonsense on by .

It’s like Intrade meets OKCupid

As someone who spends most of their time single, it’s pretty much inevitable that every now and then I overcome my mild disinterest in the subject, decide it might be nice to try that not being single thing again for a change and reactivate one or more of my various online dating accounts.

This basically starts a cycle which lasts a few months and a dozen or so dates and ends up with me stopping using online dating again, frustrated at what an unpleasant experience it is. It definitely works for some people, but not really for me, and my frustrations with it definitely don’t seem to be unusual.

Naturally as someone who is interested in technical solutions to social problems and can’t resist a good problem when it’s sitting in front of me what then happens is that I design my own online dating site. I quickly remember that this isn’t an industry I want to get into and that the idea of bootstrapping it sounds hellish, but the result is that I have a lot of shelved ideas for how I’d do an online dating site if I took leave of my senses and decided to make one.

Which meant that when the latest wave of outrage about someone having done something terrible on the internet passed by, and when it was basically about online dating where you bribe people to go on dates with you, I was well primed.

My quip in response to it was “but of course this is a good idea. You see, markets are efficient, so everyone is going to end up with the best date”.

Unfortunately my second thought in response to that was “Wait hang on there’s something there”.

For those of you lucky enough not to have used OKCupid or similar, one of its key features is the compatibility score. It gives you a percentage score for how likely someone is to be a good match to you.

They’re a bit rubbish. I mean, a low score is a pretty good indicator that someone is going to be a dreadful match for you, but a high score at best indicates that you probably won’t hate them and everything they stand for (and there are plenty of people on the site you will hate everything they stand for).

One of the things I often wonder is why it doesn’t do complicated learning off the other info you’re providing to the site – you provide a lot of text which it could train classifiers off, why not learn what you’re looking for in a profile and use that? I suspect the short answer is because it’s hard. There’s a lot of nuance that it’s hard for a computer to extract from the profiles.

On the other hand, people are quite good at extracting that info from the profiles. Natural language processing is sortof our thing. If you could get people to provide you with compatibility scores that’d probably work rather well…

And markets are a decent way of soliciting information from people.

Of course, the idea of people bidding to go on dates with someone is revolting. Of all the things I don’t want online dating to do, reinforcing a commodity model of dating is pretty high up there. But with a twist, you can easily avoid this problem. You’re not bidding on going on dates with someone. You’re bidding on the chances of two other people going on a date.

i.e. you’re running a prediction market to get peoples’ compatibility scores.

I think I would describe this idea as “compellingly awful”. Amongst all the online dating sites I’m not going to build, I’m most not going to build this one. But it’s got into my head and I’ve accidentally designed the whole thing, so now I’m going to tell you about it to get the damn idea out of my head.

Here’s how it works: You use a fairly constrained prediction market based off a fake currency. Because the name amuses me I’m going to call this currency “dating dollars”.

Dating dollars can be bought for real money. They can never be redeemed for real money. Their worth is created by the fact that you must pay dating dollars to use features of the site. In particular it costs dating dollars to message someone for the first time (if they’ve already messaged you you get a discount but still have to pay some). You can also spend dating dollars for things like “promote me in search results” or “prioritise finding me some matches”.

As well as by spending real money you can get small amounts of dating dollars for providing feedback to the site. If someone messages you and you aren’t interested, clicking the “I’m sorry I’m not interested” button will give you a little bit of dating dollars. Once you’ve started talking to someone you can tell it if you’re going on a date or not. As long as your answer to that question agrees with the answer the other person gives you get rewarded. Also once you’ve been on a date it will ask you if that date went well. It will reward you either way for this one (so people don’t have an incentive to lie if they think the other person is mistaken). Basically the site is saying “Thanks!” for providing useful feedback it can use (and also it will nag you if you don’t provide that feedback).

And then there’s the third way, which is the actual point of the site: The prediction market.

This is a game you can play. What it does is it shows you two profiles side by side and asks you to the following question: With this payoff, would you pay one dating dollar to bet that if one of these two messages the other then they will go on a date and report that it went well?

If you say yes you pay the dollar and the bet is recorded. The following then happens:

  1. If at any point the two of them go on a date and both report it went well you get the pay-off
  2. If at any point they indicate that they’re not going to go on a date (e.g. by rejecting a message, by explicitly clicking “No we’re not going to go on a date”, etc) you lose the bet and are informed of such
  3. If after one week neither has shown any sign of interest or disinterest in the other, you get your dollar back and a dollar is deducted from the payoff. The bet remains active though, so you still have the potential to gain or lose money.

The profiles you are betting on are out of your control. You get shown them one after the other picked mostly at random (people can pay to appear more often in these). It probably makes sense to only vary one member of the pair at a time to reduce cognitive burden.

This game is then used to determine the “market rate” for a pair. This is 1 over the lowest value that most people will pay for the pair. This gives you your compatibility score. Peoples’ bet that a date between the two of you would be successful. You can then see your most compatible voters.

I suspect displaying the actual compatibility score would be… depressing. Assuming the prediction market is actually broadly accurate, most dates end in failure so most scores are going to be quite low. You’d almost certainly want to show the order + maybe a “great/good/eh/NOPE” badge based on where you lie in the global market for pairs.

You might need some protection against market manipulation. Hellbanning people who manipulate is probably a good way to do this (not necessarily from the site over all, but not counting their bets as contributing to the market price). You might also want to get people to predict only for pairs which are far away from them.

I think the main obstacle to this is market liquidity. On the one hand the fact that the seller is the computer and is powered by a fiat currency helps a lot, but you still need to get people to actually bid, and getting people to bid on enough pairs might be challenging. Or it might not – enough people might really enjoy the game of matchmaking that they’re happy to devote loads of time to it. I suspect work could be done to make this compelling. It might not be enough though – you probably need at least three offers for each pair, and there are order \(n^2\) pairs (actually it’s not that bad because localisation) and order \(n\) people.

You’d also have to watch the money supply – if you’re offering bids that are too generous then people will be flush with dating dollars and no one will pay you real money for them. If on the other hand you offer bids that are too stingy then no one will bid and play your game and your source of information will dry up. You could solve this with a bid and ask system, but I think the market is unlikely to be sufficiently liquid to support that. Though if it were you could support more complicated trading of instruments… e.g. predictions that at least one person from one group would go on a successful date with at least one person from another. Bundling assets together. etc. No. Argh. Bad David.

In short, I think that as well as being a ludicrous idea this is probably actually a bad one. The problems of keeping it balanced are probably too great. Which is good because I’m not going to be making it. Really.


This entry was posted in rambling nonsense on by .

Notes on instrumentalist reasoning

Warning: David is doing philosophy without proper training. Please be alert for signs of confusion and keep your hands inside the vehicle at all times.

I was having some interesting discussions with a Paul Crowley yesterday evening. He and I have a large enough overlap between our beliefs and interests that we can disagree vehemently about almost every philosophical issue ever.

The particular point of disagreement here was about epistemic vs instrumental reasoning. For the sake of this post I’m going to oversimplify these to mean “I want to believe things that are true” vs “I want to believe things that are useful”.

I’m very much in the instrumentalist camp.

Continue reading

This entry was posted in Open sourcing my brain, rambling nonsense on by .

How to quickly become effective when joining a new company

The other day my colleague Richard asked me how I managed to get started at Lumi quite so quickly. It’s a good question. When I started here I was pretty much doing useful work from my second day (I did manage to get “code” out into production on my first day, but it was my entry on the team page). I think I handed them a feature on day 3. For added funsies I’d only been writing Python for a month at that point.

I’ve seen enough new starters on equally or less complicated systems that I’m aware that this is… atypical. How did I do it?

The smartarse answer to the question is “obviously because I’m way smarter than those people”. As well being insufferably arrogant this isn’t a very useful answer – it doesn’t help anyone else, and it doesn’t help me (knowing how I do something is useful for me too because it allows me to get better at it and apply it to other things).

So I’m putting some thought into the question. This is largely just based off introspection and attempting to reverse engineer my brain’s behaviour. I’ve no idea how portable this advice is or even if it’s remotely accurate, but hopefully some of it might be useful.

There are essentially three key principles I apply, all about information:

  1. Acquire as little information as possible
  2. Acquire information as efficiently as possible
  3. Use the information you acquire as effectively as possible

I’ll elaborate on each of these and how to do them.

Acquire as little information as possible

This one is the only one that might be counter-intuitive. The others… well, yes. You want to acquire information efficiently and use it effectively. That’s kinda what learning quickly means. But acquiring as little information as possible? What’s that about? Surely when trying to learn stuff you want to acquire as much information as possible, right?


This sort of thinking fundamentally misunderstand the purpose of learning, at least in this context.

Learning is only useful when it allows you to get things done. Filling your brain with information is all very well, but what if you’ve filled your brain with entirely the wrong information?

Work example (a hypothetical one. I’ve not seen this happen, though I have seen similar): Suppose you diligently sit down, trace through the database schema, learning how everything fits together. You spend about a week doing this. Then you get told that that’s the old schema and we’re 2/3rds of the way through migrating to an entirely new data storage system after which that schema will be retired. Now the knowledge of the history isn’t entirely useless, but there’s certainly more useful things you could have been doing with that week.

The goal of learning as little information as possible forces you to do three important things:

  1. It forces you to learn the right information. If you only acquire information when it’s absolutely necessary then that information was by definition useful: It enabled you to do the necessary thing.
  2. It forces you to do things you can actually achieve soon. If task A would require you to learn a lot and task B would require you to learn a little, do task B. It will take you less time to learn the prerequisites, so you’ll get something achieved quickly.
  3. It prevents you from getting distracted by the immensity of the task of trying to understand everything and forces a more useful learning discipline on you by giving you concrete places to start.

“But David!”, I hear you cry in horror so profound that it causes you to use exclamation marks as commas, “Are you suggesting that learning is bad? That I shouldn’t try to learn how things work?”

Rest your mind and set aside your fears, for I am not!

The thing about information is that once you have acquired it you can build on it. Every little bit of knowledge you put into your head provides you with a framework on which to hang more knowledge. The journey of a thousand miles begins with a single step. Other profound sounding metaphors that convince you of the wisdom of my advice!

We’re essentially iterating this procedure over and over again. “What can I do? Oh, there’s a thing. Lets do that. Done. What can I do now? Oh, hey, there’s another thing which I couldn’t previously have done but now can because I’ve learned stuff from the last thing I did”. By acquiring our knowledge in little bite sized chunks we’ve rapidly consumed an entire information meal.

As well as interleaving getting things done with learning things, I think the quality of education you get this way is better than what you would otherwise get because it contains practical experience. You don’t just learn how things work, you learn how to work with them, and you learn a lot of concrete details that might have been missing from the high level description of them.

Now go back and read this section again (if you haven’t already. Don’t get stuck in an infinite loop and forget to eat and die. Because I’d feel really bad about that and it’s not an efficient way to learn at all). It’s literally the most important bit of this post and I really want you to take it on board.

Acquire information as efficiently as possible

Here is my knowledge acquisition procedure. I bequeath it to you. Use it well:

  1. Can I figure out how this works on my own? (5 minutes)
  2. Can I figure out how this works with googling and reading blog posts? (15 minutes)
  3. Ask someone to explain it to me (however long it takes)

OK, I know I said the previous section was the most important thing in this essay, but this one is pretty important too. For the love of all, please do not sit there ineffectual and frustrated because you can’t figure out what’s going on. It’s nice that you don’t want to bother your colleagues, and the first 20 minutes or so of trying to figure things out on your own is important as a way of preventing you from doing that too much, but your colleagues need you to be useful. They also probably possess the exact information you need. It is their responsibility and to their benefit to help you out when you get stuck.

Use the information you acquire as effectively as possible

This is a fuzzy one. It’s basically “Be good at thinking”. If there were a fool proof set of advice on how to do this the world would look very different than it currently does. Still, here are some ideas you might find useful.

Here is roughly what my thinking procedure looks like. Or, more accurately, here is my inaccurate model of what my thinking procedure looks like that I currently use to make predictions but don’t really expect corresponds that exactly to reality.

Err. That paragraph might be a bit of a clue as to what I’m about to say.

If I may get all meta for a moment: What is the purpose of knowledge acquisition?

I mean, sure, you want to get stuff done, but how does knowledge help you do that?

To me the value of knowledge is essentially predictive. You want to be able to answer two questions:

  1. If I do X, what will happen?
  2. If I want Y to happen, what should I do?

These two questions are natural inverses of each-other. In principle a valid algorithm for question 2 is “Think of all possible values of X and see if they cause Y to happen”. Everything else is just an optimisation to the search algorithm, right?

First I’ll go into how to answer question 1:

At its basic it’s two words. “Mental models”

A mental model is essentially an algorithm you can run in your brain that lets you put in actions as inputs and get outcomes as outputs.

They can be implicit or explicit (note: This is my terminology and is probably non-standard): An implicit mental model is essentially one that you can run without going through all the steps. You don’t need to be able to describe the model explicitly, you just know the answer. Essentially what most people mean when they say something is intuitive is “I have an implicit mental model for how this works”. You’re essentially pretty comfortable working with anything you have an implicit mental model for and can probably get things done with it pretty readily.

Explicit mental models are more work. You have to think through the steps linearly and laboriously in order to get the answer. But they’ve got a lot of benefits too:

  1. They tend to be more accurate. I find implicit mental models are full of shortcuts and heuristics which makes them really efficient when they work but often leaves some major gaps.
  2. They’re easy to fix when they’re wrong. And mental models will be wrong, all the time. The chances of you always being able to perfectly predict things are basically zero. An explicit mental model you can just patch, but adjusting your intuitions is much harder.
  3. You can explain your reasoning to other people. This is especially useful when you’re stuck and want to talk a problem through with someone else.

So both types of mental models have costs and benefits. So which do we choose?

Answer: Both!

Once you think of mental models in terms of being able to predict how things behave instead of truth about how things work you can realise that having multiple mental models of how something works is not only fine but often useful: You can cross check them against eachother to see if they make different predictions, you can use whichever is cheapest when the decision doesn’t matter very much or most accurate when it does.

Additionally, it’s not really as cut and dried as I’m making it seem. What tends to happen is that explicit mental models turn into implicit ones. Once you’re familiar enough with an explicit mental model the ideas become intuitive and internalized and you need to consult it less and less.

Here are some tips for building mental models:

  • Reuse old mental models as much as possible. When learning python I started out as “Like Ruby with funny syntax”. Then I learned a bit more and replaced this with “Like Javascript with funny syntax and a decent module system”. Then I found out about descriptors and patched it as “Like Javascript with funny syntax and a decent module system, but bear in mind that the value you put into an object might not be the value you get out of it because of these magic methods”.
  • In general, build mental models as cheaply as possible. A mental model that you can build quickly is probably much more useful than a laboriously constructed one, because you’ll be likely to resist changing the latter.
  • Patch your mental models when they make wrong predictions
  • Test your mental models constantly. This is useful a) Because it leads to finding out when you need to patch them and b) Because it greatly hastens the process of turning them into implicit mental models, which makes you more able to readily apply them in future.

That’s about all I’ve got on how to build mental models. Now onto question 2: I can predict the consequences of my actions. How do I go about predicting the actions that will lead to the consequences I want?

Here is roughly what I think of as my algorithm:

Section 1:

  1. Generate an idea
  2. Is it obviously stupid? Throw it away and go back to 1.
  3. Is it still not obviously stupid after a little bit more thought? Set it aside
  4. How many not-obviously-stupid ideas do have I set aside? 3 or so? If so, proceed to the next section.
  5. How long have I taken over this? Too long? If so and I have any ideas set aside, go to the next section. Else, step away from the problem. Either come at it from another angle of attack or go do something else and let it simmer away in the background.

Section 2:

I now have a bunch of idea candidates. Put some proper thought into it and attempt to determine which amongst them is best. This may require actually trying some things out and gathering new information.

Once I have determined which of these is the best, decide if I’m satisfied with it. If I’m not, start the whole process again. If I am, this is the solution I’m going to go with for now.

So basically the idea is that you want to use cheap mental models to filter ideas quickly, then once you’ve got a few goodish ideas to compete against eachother you can properly explore how they work as solutions. I don’t know how accurate a description this is, but I think it’s broadly OK.

Unfortunately it’s got a big gap right at the beginning where I say “Generate an idea”. How do you do that? Just try things at random?

Well, you just… do. I don’t really know. This may be where I have to fall back on my “Maybe I’m just kinda smart?” answer, but if so then sadly I’m not actually smart enough to give a good answer here. Here are some things I think that I might be doing, but it’s mostly speculation:

  1. Try and narrow the search space as much as possible. If you think you might be able to solve the problem by making changes to a specific thing, only look at ideas which change that thing. If that’s proving not to work, try a different thing. You should be able to use a mix of past experience and proximity to guide you here – think of similar problems you’ve solved in the past and look near what the solutions to those were. If you can’t think of any similar problems or those solutions don’t pan out, look nearby to the problem. The advantage of looking at a smaller area is that there are fewer things to try so you’re more likely to hit on the right one quickly.
  2. Try and narrow the problem by breaking it up into smaller pieces. When you do this it becomes much easier to narrow the search space as in the previous section. In order to do this try to find natural “cleave points” where the problem breaks neatly along specific lines.
  3. When you decide a solution is a bad idea, see if there are any general reasons why it’s a bad idea and try to avoid areas that relate to those reasons.
  4. Conversely, if an idea almost works but not quite, try exploring around it to see if there are any variations on it that work better.
  5. If you’re totally stuck, try thinking of something completely out there. It probably won’t work, but exploring why it won’t work might tell you interesting things about the problem and give you other areas that worked.

I don’t think that’s sufficient. I’m afraid this might just be a case of practice makes less imperfect. I think part of why I’m “smart” is that my brain just won’t shut up (seriously. It’s annoying), so tends to gnaw at problems like a dog with a bone until they give in, which has given me a lot of practice at trying to figure things out. Really, this might be the best advice I can give you, but you’ve probably heard it so many times that it’s redundant: Practicing solving problems and figuring things out makes you better at it. Still, I think there should be things you can do above and beyond that – practice is no good if you’re practicing doing something badly. I’d love to have a firmer idea of what they were.

Fitting it all together

I like to think of this as a process of exploration. You start somewhere simple (I find “actually using the product” to be a good place to start). You look around you and, step by step, you explore around you while doing whatever jobs you can find in the areas you currently have access to. Your colleagues will help you find your way, but ultimately it’s up to you to draw your own map of the territory and figure out how all the different bits connect up. Think of this post as my personal and eclectic guide to software cartography.

Does it work? Beats me. Certainly I seem to be doing something right, and I really hope that this encapsulates some of it. I think it’s mostly good advice, but there’s definitely a bit of a danger that I’m cargo culting my own brain here.

If you try any of this, do let me know how it goes, and please do share your own ideas in comments, email, etc. because I’d really like to know how to get better at this too.

This entry was posted in life, Open sourcing my brain, rambling nonsense on by .