I like to think I’m pretty good at problem solving. How much this is really the case is up for debate, but either way I’ve learned a lot of useful little lessons in the course of testing it.
A really important one is this:
The problem you are trying to solve is often harder than the problem that you need to solve.
Problems frequently come with a set of requirements. It has to play games, let me do my email and make coffee. Well… does it really? Actually it’s much simpler to solve if you make your own damn coffee.
This is a slightly spurious example because the making coffee requirement is so obviously tacked on. But the general idea is common: Of the requirements your problem comes with, one or more may actually be non-essential and dropping it can simplify your life substantially.
Let’s take two real examples:
The first is at work, on the SONAR system. The objective is to build profiles for peoples’ knowledge based on their authored content. So what do we do?
- For each document, assign it some tags that describe it
- For each user, look at the tags of documents they’ve authored and give them a score for that tag based on that
This is an interesting example of an implicit requirement. We’ve broken the problem up into two steps, and each of those steps has thus become a requirement for the overall problem.
One of the first things I realised when I started working on SONAR is that this implicit requirement was making our life a lot harder than it needed to be. Automatic document tagging with an unknown set of tags is hard. There are lots of papers telling you how to do it, and none of them actually do a very good job (the best of them all tag in relation to some background corpus, but that’s not so good when there are e.g. a lot of internal project names to the company which aren’t present in that corpus).
Given that the end result on users is what matters, this realisation frees us from having to achieve a high precision per document and allows us to focus on recall (Translation: we don’t need to make sure all the results we get are good, we just need to make sure that the good results are contained within them). One minor rewrite later, our theme quality had gone up and our implementation complexity down.
My next example is much more controversial. Which is also a relevant point: Not everyone is going to agree on whether a given feature is essential. I still think I’m right on this one, but then I would. I’ve more or less resigned myself to the idea that the battle is lost here, so this is purely illustrative rather than an attempt to convince.
The example is the 2.8 Scala collections library. The signatures of map and flatMap are quite complicated:
def map[B, That](f: A => B)(implicit bf: BuilderFactory[B, That, Traversable[A]]): That def flatMap[B, That](f: A => Traversable[B])(implicit bf: BuilderFactory[B, That, Traversable[A]]): That
In comparison, an experimental collections API design I put together a while ago called alt.collections has substantially simpler signatures (equivalent to the old signatures in scala.collection really)
def map[S](f : T => S) : Traversable[S] def flatMap[S](f : T => Traversable[S]) : Traversable[S]
Why is my version so much simpler than Martin’s? Well, as much as I’d like to claim it’s because I’m an unparalleled genius of API design compared to Martin, that’s really not the case.
The reason for the comparative simplicity is that Martin’s version does something that mine does not.
The problem we’ve both set out to solve is that the return types of map and flatMap are very inconsistent in the current API: Sometimes they’re overridden to return the right type, sometimes they’re not (sometimes they can’t be). This upsets people. Essentially there is the following requirement that isn’t being met by the current collections API:
map and flatMap on a collection should return the same type of collection
Martin responds by meeting that requirement. I respond by taking it away. The design of alt.collections was centered around the idea that map and flatMap would not attempt to return the “right” type but would instead always return a Traversable. It then set about making it easy to get the right type out at the end.
More than one person has responded to this design with “I’m sorry, but that’s horrible” or similar, so clearly a lot of people think in this case the answer is “Yes, we are”. But I still find it a rather compelling example of how much simpler you can make the solution by first simplifying the problem.