The best way to handle exceptions

So, it’s well known that you shouldn’t have code that looks like this (examples in ruby and ruby-like pseudo code, but they’re trivially translatable to any other language):

begin
   do stuff
rescue
end

i.e. you’re swallowing the exceptions because they scared you and you wanted to hide them. This is naughty.

A more subtle trap people fall into is the following example from the rails boot.rb (this isn’t a rails specific problem. I see it everywhere)

    def load_rails_gem
      if version = self.class.gem_version
        gem 'rails', version
      else
        gem 'rails'
      end 
    rescue Gem::LoadError => load_error
      $stderr.puts %(Missing the Rails #{version} gem. Please `gem install -v=#{version} rails`, update your RAILS_GEM_VERSION setting in config/environment.rb for the Rails version you do have installed, or comment out RAILS_GEM_VERSION)
      exit 1
    end 

Let’s consider the results of this rescue block: A generic error message is printed, and we exit with a non-zero status code.

Now let’s consider the results of not having this rescue block: A specific error message is printed, we exit with a non-zero status code and we get a stack trace telling us exactly what went wrong.

So by including this rescue, we have lost information. Often this information doesn’t matter, but as it turns out in this case it does: If you have a gem version clash where rails depends on a different version of a gem that has already been loaded you will get a Gem::LoadError and, consequently, a very misleading error message.
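To make the information loss concrete, here is a minimal sketch in plain Ruby. Nothing in it is Rails or RubyGems code: `LoadFailure`, `load_dependency` and the two wrapper methods are hypothetical names made up for illustration. It shows two different underlying failures collapsing into one identical message once a blanket rescue is involved.

```ruby
# Hypothetical stand-in for a Gem::LoadError-style failure that can have several causes.
class LoadFailure < StandardError; end

# Hypothetical loader that always fails, for whatever reason it is given.
def load_dependency(reason)
  raise LoadFailure, reason
end

# With the "helpful" handler: every cause is replaced by the same generic text.
def load_with_generic_rescue(reason)
  load_dependency(reason)
rescue LoadFailure
  "Missing the gem. Please install it."
end

# Without a handler: the specific message and backtrace reach the caller.
def load_without_rescue(reason)
  load_dependency(reason)
end

missing  = load_with_generic_rescue("gem is not installed")
conflict = load_with_generic_rescue("version conflict with an already loaded gem")
puts missing == conflict # the two failures are now indistinguishable

begin
  load_without_rescue("version conflict with an already loaded gem")
rescue LoadFailure => e
  # This rescue exists only so the sketch can print what would have been reported.
  puts e.message
  puts e.backtrace.first
end
```

The second half is what you get for free by writing nothing: the real reason, plus the line it came from.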

I don’t want to pick on rails. Well, correction. I don’t want to pick on rails here. This post is actually inspired by a similar incident at work: There was a piece of code that basically looked like this:

begin
  connect to server
rescue
  STDERR.puts "Could not connect to server"
  exit
end

And we were getting very puzzling errors where we were sure all the details were correct but it was failing to connect to the server. Once we deleted the rescue code it was immediately obvious why it was failing (if you care, the reason was that we weren’t loading the config correctly so it was trying to connect with some incorrect default values).

Which brings me to the point of this article: The best way to handle exceptions is not to handle them. If it’s not an exception you can reasonably recover from, the chances are pretty good that the default behaviour is more informative than the “helpful” code you were going to write in order to catch and log the error. So by not writing it you get to have less code and spend less time debugging it when it inevitably goes wrong.
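For the cases where you genuinely can recover, a rough sketch of what "handle only that" might look like. `TransientNetworkError` and `with_retries` are hypothetical names invented for this example, not part of any library: the handler names the one failure it knows how to deal with and leaves everything else alone.

```ruby
# Hypothetical error class standing in for "a failure we know how to recover from".
class TransientNetworkError < StandardError; end

# Retry only the failure we can do something about; anything else propagates untouched.
def with_retries(max_attempts)
  attempt = 0
  begin
    attempt += 1
    yield attempt
  rescue TransientNetworkError
    retry if attempt < max_attempts
    raise # out of attempts: re-raise the original exception, backtrace intact
  end
end

# Recoverable: fails twice, then succeeds on the third attempt.
result = with_retries(3) do |attempt|
  raise TransientNetworkError, "timed out" if attempt < 3
  :connected
end
puts result

# Not recoverable: a missing config key is a bug, so with_retries never touches it.
config = { host: "example.invalid" }
begin
  with_retries(3) { config.fetch(:port) }
rescue KeyError => e
  # Rescued here only so the sketch can show that the original error arrived unchanged.
  puts e.message
end
```

The broken-config case from the story above would have surfaced straight away with a handler shaped like this, because nothing in it claims to know what a `KeyError` means.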

This entry was posted in Uncategorized on by .

DataMapper is inherently broken

This contains overtones of a post I’ve been resisting writing for months in the knowledge that the people who already agree with it will find it obvious while the people who don’t will bitch and moan and tell me that I’m stupid and perpetuate the same tired contentless arguments they always use.

So, having explained why this post is useless, I now add the caveat that I am writing it anyway, for the very simple reason that I am so angry at being proven right on the subject time and time again, and in particular at having had an entire morning wasted by this fuckwittery, that I am overriding my internal censor.

DataMapper is a ruby ORM library. It is inherently broken. Why? Because it considers booleans to be a superior solution to exceptions. This means that there is essentially no way to save data without falling into subtle and pernicious traps.

In DataMapper when you save an object, it returns false if it fails to save. You then need to ask it why it failed to save if, for some bizarre reason, you feel the need to care about little things like your data not making it into the database.

So, the code you end up writing all over the place looks like the following:

   my_account = Account.new(:name => "Jose")
   if my_account.save
     # my_account is valid and has been saved
   else
     my_account.errors.each do |e|
       puts e
     end
   end

Before anyone claims this is a contrived and silly example, here’s the source of it.

What’s the problem here?

Well… earlier I was called to help some colleagues. They were seeing really weird behaviour where save was failing but errors was empty. I was sure this must be a bug somewhere – it shouldn’t be possible. We even checked before the save, and valid? was returning true.

Which reminds me, another and sometimes recommended way of writing the above is to instead do…

   my_account = Account.new(:name => "Jose")
   if my_account.valid?
     my_account.save
     # my_account is valid and has been saved
   else
     my_account.errors.each do |e|
       puts e
     end
   end

Warning: If you do this, you are screwed. Your code is broken, even if it happens to work right now, and you will be bitten by it. Change it.

So, why is this code wrong?

Because the errors do not necessarily live on the object you are trying to save.

Suppose I instead had:

my_account = Account.new(:customer => Customer.new(:name => "jose"))

and tried to save this.

Well, it might fail to save, but return no errors.

Why? Because it first has to save the customer. And that might fail to save. And if it does, you won’t see any errors on the object you’re saving, and you have to hunt around to figure out what’s going wrong.

So, as well as having to explicitly check for errors on everything you attempt to save in order for your code to be correct, you must also explicitly check for errors on anything IT might helpfully decide to save for you.
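Here is a toy model of that failure mode in plain Ruby. To be clear, this is not DataMapper's code or API: the `Customer` and `Account` classes below are hypothetical stand-ins written only to reproduce the shape of the problem, namely a parent whose save returns false while its own errors stay empty.

```ruby
# Hypothetical stand-ins that mimic boolean-returning saves; NOT DataMapper's API.
class Customer
  attr_reader :errors

  def initialize(name:)
    @name = name
    @errors = []
  end

  def valid?
    @errors = []
    @errors << "Name must not be blank" if @name.to_s.empty?
    @errors.empty?
  end

  # Reports failure as a bare false, like the saves described above.
  def save
    valid?
  end
end

class Account
  attr_reader :errors, :customer

  def initialize(customer:)
    @customer = customer
    @errors = []
  end

  # Only looks at the account's own attributes, of which this toy has none.
  def valid?
    true
  end

  def save
    return false unless valid?
    customer.save # the child's failure becomes our false, but its errors stay on the child
  end
end

my_account = Account.new(customer: Customer.new(name: ""))
puts my_account.valid?                  # true
puts my_account.save                    # false
puts my_account.errors.inspect          # []
puts my_account.customer.errors.inspect # ["Name must not be blank"]
```

Both of the "check valid? first" and "check the return of save" patterns pass over the real problem here, because the only place it is recorded is an object the caller never asked about.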

This is untenable.

Code which fails should fail noisily. I shouldn’t have to work to find out why it went wrong; it should TELL me. If DataMapper did the obvious thing and threw an exception when something failed to validate, it would cut out a great deal of work and debugging effort. It would mean that I could simply handle the creation of a bunch of data and deal with the errors in one place instead of having to check everything for failures. It would mean that one could write code without living in a constant state of paranoia about forgetting to check a return code somewhere and then wasting hours debugging when it finally bites. And it would mean that not all the examples in their documentation would be wrong.
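A sketch of what the noisy version could look like, again with made-up classes rather than anything from DataMapper: `SaveFailed`, `Record`, `NoisyCustomer`, `NoisyAccount` and the `save!` method are all assumptions for the sake of illustration. The point is only that a failure anywhere in the chain arrives in one place and names the object that caused it.

```ruby
# Hypothetical exception that carries the record which failed to save.
class SaveFailed < StandardError
  attr_reader :record

  def initialize(record)
    @record = record
    super("#{record.class} failed to save: #{record.errors.join(', ')}")
  end
end

# Hypothetical base class: save! either succeeds or raises, never returns false.
class Record
  attr_reader :errors

  def initialize(attrs = {})
    @attrs = attrs
    @errors = []
  end

  def save!
    @errors = validate
    raise SaveFailed.new(self) unless @errors.empty?
    true
  end

  # Subclasses return a list of error messages; empty means valid.
  def validate
    []
  end
end

class NoisyCustomer < Record
  def validate
    @attrs[:name].to_s.empty? ? ["Name must not be blank"] : []
  end
end

class NoisyAccount < Record
  def save!
    @attrs[:customer].save! # a failure here propagates on its own, no checking needed
    super
  end
end

begin
  NoisyAccount.new(customer: NoisyCustomer.new(name: "")).save!
rescue SaveFailed => e
  # One handler for the whole batch of saves, and it knows exactly what went wrong.
  puts e.message
  puts e.record.class
end
```

Forget the rescue entirely and you still get a stack trace pointing at the customer, which is already better than a silent false.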

A workaround for misbehaving X citizens

Some programs (most notably, among those I use, Firefox and Pidgin) don’t work properly with shift-insert for pasting from the primary X selection. If you’re like me and virtually live in the command line, this is a real pain in the ass. It became more of a pain in the ass recently because I’ve started using a graphics tablet instead of a mouse, so I no longer have a middle click and could no longer paste from IRC into Firefox.

After some hacking around, I settled on the following solution. “xsel -o -p | xsel -i -b” dumps the primary X selection into the clipboard (xsel is a nice little application. You should probably have it installed). So I’ve bound this to a shortcut in xmonad and now can at the press of a magic key combination take the current selection and make it pasteable. Hurray.

This entry was posted in programming, Uncategorized and tagged on by .

Left folds with early termination

Consider what should be the basis of a general purpose collections API. In Scala it is Iterable, which uses an Iterator to define a foreach method. Most other things are defined in terms of this (note: This statement is untrue. Many of them use the iterator directly. But it is morally true).

This is all icky and imperative. Yuck.

As every good functional programmer knows, the true notion behind this silly “iteration” thing is actually folding.

The JVM is mean to us and doesn’t support lazy evaluation or variable sized stacks, so foldRights tend to be an exercise in pain: They stack overflow distressingly soon. So let’s base our collections API on foldLeft: Most designs of it are tail recursive (or iterative. Sigh). As well as dealing with JVM limitations this lets us develop a very high performance interface, and we can do pleasant things like dealing with very large lazy collections in constant space[1]

So:

trait Foldable[T]{
 def foldLeft[S](s : S)(f : (S, T) => S) : S;
}

There. That’s a nice basic design.

What methods can we add to this?

Well the obvious thing we can do to placate the Java programmers:

def foreach(f : T => Unit) = foldLeft[Unit](())((s, t) => f(t))

Now let’s define some stuff we care about. One handy method on Iterable is find. So let’s add that.

def find(f : T => Boolean) : Option[T] =
   foldLeft[Option[T]](None)((s, t) =>
     s match {
       case Some(s) => Some(s);
       case None if f(t) => Some(t);
       case _ => None
  })

Nice, isn’t it?

So what happens if we do Stream.from(1).find(_ > 10).

It loops forever. :-(

Unfortunately our nice efficient foldLeft based implementation always eats the entire collection. This is sad: We can’t deal with infinite collections. It’s also really inefficient for large collections where we don’t need the whole thing.

So let’s add a way of stopping when we’re done:

def foldLeft[S](s : S)(f : (S, T) => Option[S]) : S;

The idea is that the Option is used to signal a stop condition. We return a None when we’re done consuming the collection, and the foldLeft stops.

Rewriting find in terms of this:

def find(f : T => Boolean) : Option[T] =
   foldLeft[Option[T]](None)((s, t) =>
     s match {
       case Some(s) => None;
       case None if f(t) => Some(Some(t));
       case _ => Some(None)
  })

Ok. That’s a bit gross. But that’s purely a function of the fact that we were already trying to return an Option[T]. I’m sure for normal code it will look ok.

What happens is that once we’ve already found something, we now know we can stop, so we return a None to indicate stopping.
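For comparison, here is roughly the same idea transliterated into Ruby (Ruby rather than Scala only to match the earlier examples on this page). Ruby has no Option, so this sketch assumes a hypothetical `DONE` sentinel that the block returns to signal "stop", and it assumes the collection contains no nil elements, since nil plays the role of None. `fold_left_opt` and `find_opt` are made-up names.

```ruby
# Hypothetical sentinel: returning it from the block means "stop folding now".
DONE = Object.new.freeze

# Left fold where the block can end the traversal by returning DONE.
def fold_left_opt(collection, initial)
  acc = initial
  collection.each do |item|
    result = yield(acc, item)
    break if result.equal?(DONE) # stop consuming; keep the accumulator as it was
    acc = result
  end
  acc
end

# find in terms of the fold; nil plays the role of None (assumes no nil elements).
def find_opt(collection)
  fold_left_opt(collection, nil) do |found, item|
    next DONE unless found.nil? # already found something, so signal a stop
    yield(item) ? item : nil
  end
end

# Terminates on an infinite range instead of looping forever.
puts find_opt(1..Float::INFINITY) { |x| x > 10 }
```

Just like the Scala version, the stop signal only fires on the element after the match, so it looks at one element more than it strictly needs.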

Hmm. It would be nice if we could incorporate some of this functionality into find itself. Let’s consider the following example:

(1 to 100000000 by 3).find(_ == 11)

We know this collection is ordered, so once we’ve gotten to 13 we know we’re never going to find 11. But find keeps on trucking till it hits the end of the collection. :-(

So, let’s modify find to support early termination as well. First we change the type of its argument to f : T => Option[Boolean], and then…

No, sorry. My sense of good taste just revolted.

The problem is we now have to modify all signatures that might possibly want to support early termination of the collection. And they then have to explicitly deal with passing that on to foldLeft in the right way. All just to allow callers to pretend that the collection is smaller than it really is. It would be nice if we could write this functionality once and then have all the higher order functions based on foldLeft inherit it for free.

Agreed?

Well, there fortunately *is* a way to do that. It involves some loss of type safety, but as long as foldLeft is always properly implemented this should be acceptable.

case object Stop extends Throwable;
def stop : Nothing = throw Stop;

Implementations of foldLeft then have to catch Stop and treat that point as the end of the collection.[2]

Now, let’s implement find as follows:

def find(f : T => Boolean) : Option[T] =
   foldLeft[Option[T]](None)((s, t) =>
     s match {
       case Some(s) => stop;
       case None if f(t) => Some(t);
       case _ => None
  })

Only a simple modification to our original implementation, but now it correctly consumes only a minimal amount of the collection (actually it consumes one element too many, but that’s because, for pedagogical reasons, I’ve modified the implementation from the real one we’d use).

And, for free, we can use stop in the argument to find as well:

(1 to 100000000 by 3).find{x => val c = x compare 11; if(c > 0) stop else (c == 0)}

And we no longer continue past the point where we need to, even though we don’t find our element. Yay!
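The post leaves the fold implementation itself to the imagination, so here is one way it might look, sketched in Ruby rather than Scala to match the earlier examples on this page. Treat all of it as assumption: `Stop`, `stop`, `fold_left` and `find_in` are hypothetical names, nil stands in for None (so the collection is assumed to hold no nil elements), and this is not the "real" implementation the post alludes to.

```ruby
# Control-flow signal mirroring the post's Stop. It subclasses Exception rather than
# StandardError so that a bare `rescue` in user code does not swallow it by accident.
class Stop < Exception; end

def stop
  raise Stop
end

# Left fold that treats a raised Stop as the end of the collection.
def fold_left(collection, initial)
  acc = initial
  begin
    collection.each { |item| acc = yield(acc, item) }
  rescue Stop
    # Whoever raised it has seen enough; return what has been accumulated so far.
  end
  acc
end

# find in terms of fold_left; nil plays the role of None (assumes no nil elements).
def find_in(collection)
  fold_left(collection, nil) do |found, item|
    stop unless found.nil? # already found: end the fold
    yield(item) ? item : nil
  end
end

# Terminates on an infinite range.
puts find_in(1..Float::INFINITY) { |x| x > 10 }

# The predicate can call stop too, and find_in needed no changes to allow it.
visited = []
result = find_in(1.step(100_000_000, 3)) do |x|
  visited << x
  stop if x > 11
  x == 11
end
puts result.inspect  # nil, since 11 is not in the sequence
puts visited.inspect # [1, 4, 7, 10, 13]
```

The second call is the "for free" part: `find_in` has no idea its predicate might bail out early, and it does not need to.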

So, hopefully I have persuaded you that left folds with early termination form a nice basis for a collection library: Clean, functional, and efficient. What more could you hope for?

Oh, before I finish. One more little bone to throw to our Java friends to make them feel more comfortable with this design:

def break = stop;

[1] See “Towards the best collections API” for a discussion of building an API around this design (note that their solution does involve a bit more than what we talk about here, some of which is linguistically difficult in Scala without continuations).

[2] You may be making squawking noises about performance now. Don’t. It’s fine. But now is not the time for discussion of this.

Planet Scala gets its own domain

It can now be found at http://planetscala.com/

The old location and feeds will continue to work for the foreseeable future, but I’d appreciate it if people were to start using the new URL instead.

This entry was posted in Admin, programming, Science, Uncategorized and tagged on by .