Scala trivia of the day: Traits can extend classes

A lot of people seem to not know this. In particular 90% of use of self types I’ve seen appear to exist solely because people do not know this.

Observe the following interpreter session:

scala> class Foo;
defined class Foo

scala> class Bar;
defined class Bar

scala> trait Stuff extends Foo;
defined trait Stuff

scala> new Foo with Stuff;     
res0: Foo with Stuff = [email protected]

scala> new Bar with Stuff;
<console>:8: error: illegal inheritance; superclass Bar
 is not a subclass of the superclass Foo
 of the mixin trait Stuff
       new Bar with Stuff;
                    ^

You can basically view this as putting a constraint on the trait, saying that all classes that implement this trait must extend this superclass. This can be particularly useful for adding various sorts of behaviour to classes. e.g. traits which add behaviours to GUI components.

Thus ends our public service announcement.

This entry was posted in programming and tagged on by .

A workaround for misbehaving X citizens

Some programs (most notably amongst those I use firefox and pidgin) don’t work properly with shift-insert for pasting from the primary X Selection. If you’re like me and virtually live in the command line, this is a real pain in the ass. It became more of a pain in the ass recently because I’ve started using a graphics tablet instead of a mouse, so no longer have a middle click. I could no longer paste from IRC into firefox.

After some hacking around, I settled on the following solution. “xsel -o -p | xsel -i -b” dumps the primary X selection into the clipboard (xsel is a nice little application. You should probably have it installed). So I’ve bound this to a shortcut in xmonad and now can at the press of a magic key combination take the current selection and make it pasteable. Hurray.

This entry was posted in programming, Uncategorized and tagged on by .

From Objects to Agents

I’ve been at the first day of barcamp london 6 today. It’s been fantastic so far. Apparently I gave a talk today titled “From Objects to Agents”, which is curious as it’s not a subject I know anything about. Also, those attending the talk might have noticed I looked a bit different than usual. Anyway, at the request of several of those attending (myself included), I have put the slides online

If this post seems needlessly cryptic, you’re probably not in the target audience. Just nod and smile.

This entry was posted in programming on by .

Exceptions for control flow considered perfectly acceptable, thanks very much

Apparently the fact that I use exceptions in my suggested control flow mechanism for collections makes me a bad person. This idiocy touches on a sore point for me. I have been invited to write a post on “Why exceptions for control flow are a good idea”. So. Here it is.

The thing is, I really can’t be bothered. I can give and have given useful use cases, but I’ll just be told that they’re evil because I use exceptions. So instead I will argue the converse: Why they are not a bad idea.

Let’s consider some common objections:

Exceptions are slow

No they’re not. Exceptions in C++ (and I believe .NET) may be slow, but many implementations of exceptions are in fact extremely fast. The various ML implementations have taken advantage of this for quite some time. The JVM too (with some mild tricks) has very fast exceptions.

Exceptions are for exceptional situations

This argument is pervasive. It’s also meaningless. Depending on the intent of the one claiming it, it is one or more of three different forms of fallacy:

  1. Assertion of the conclusion
  2. Argument by name. They’re called exceptions, therefore they must only be for exceptional conditions. If I called them kittens would it suddenly be ok to use them for control flow? (Next post: Kittens for control flow considered harmful. I look forward to seeing Josh Suereth’s rebuttal).
  3. Argument by meaningless blither. What distinguishes “exceptional conditions” from “control flow”? I have reached the end of a million element list. This happens one time in a million! That sounds pretty exceptional to me!

Writing exception safe code is hard

No it’s not. Be sure to clean up anything that could need cleaning up in a finally block. If you write decent code, and you’re using a language with automatic memory management so that this doesn’t apply to Joe Random Object, these tend to be small and relatively well isolated cases. Where possible you abstract this out entirely by using patterns like automatic resource blocks (remember: In decent languages it’s just a library).

Further, even if it were true, this would be an argument against using exceptions at all rather than using them for control flow: If you aren’t sure where exceptions may be thrown from (which is particularly true if you’re writing generic library code), you have to write exception safe code.

It’s confusing

Everything’s confusing when you first encounter it. Then you get used to it, you find where it does and doesn’t work, and then it becomes non confusing. As long as you continue to maintain that a technique is evil and should not be used it will remain confusing.

Debuggers will break on exceptions

Decent debuggers allow you to specify which exceptions you break on and which you don’t. It’s easy to provide a marker class that says “This exception is for control flow. Don’t worry about it”.

You might accidentally catch the exceptions you’re using for control flow with too broad exception handling

Consider the break/continue example. What happens if I do the following (and break is implemented with exceptions):

     for(thing <- myThings){
        try {
           doStuff;
           if(dieNow) break;
        } catch { case (e : Exception) => e.printStackTrace }
     }

We accidentally catch the break! This is bad.

There are a couple aspects to my response to this:

First off, given sensible library design it doesn’t happen at all. Control flow exceptions get moved outside the hierarchy of normal exceptions which it’s sensible to catch (in my quick hacked together version they were Errors. Really they should be subclasses of ControlFlow which extends Throwable).

Secondly, this is a problem with too broad exception handling: You’ve handled a generic exception where you probably just wanted to handle a specific one. This is always known to be bad.

Thirdly, this is a really really easy problem to spot. As soon as you even lightly test the code you will see a stack trace for Break being printed and will realise what you’ve done wrong. This might cost you a good 10 seconds of confusion tops. (If you simply swallow the exception silently this isn’t true. But if you do that I have no sympathy for you at all and your code deserves to break).

The problems with using exceptions for control flow are well documented

Actually, they’re not. If you don’t believe me, check google scholar.

Edit: It has been pointed out to me that if you add quotes to the search term then you do get a few hits. They all seem to be random Java best practices documents deriving from Rules for developing robust programs with java exceptions, which does not appear to present any arguments not already covered here. For completeness, here is the relevant section from it:

IM1: Do not use exceptions for control flow.
It is possible to use the exception handling mechanism for control flow also during normal
program flow. This makes the code harder to read, harder to analyze, significantly slower and
opens up for some horrible situations if an actual exception were to occur. Exception should,
as the name states, only be used in exceptional situations.

So we’ve got one count of “confusing”, one count of “performance”, one of what I can only imagine is a vague allusion to the problem of accidentally catching the exception and a chaser of “Exceptions should only be used in exceptional circumstances”. This, as far as I can tell, is the gold standard argument for the opposition in the Java realm.

Feel free to post additional arguments for or against in the comment. I’ll expand the article with any good ones posted.

This entry was posted in programming on by .

Left folds with early termination

Consider what should be the basis of a general purpose collections API. In Scala it is Iterable, which uses an Iterator to define a foreach method. Most other things are defined in terms of this (note: This statement is untrue. Many of them use the iterator directly. But it is morally true).

This is all icky and imperative. Yuck.

As every good functional programmer knows, the true notion behind this silly “iteration” thing is actually folding.

The JVM is mean to us and doesn’t support lazy evaluation or variable sized stacks, so foldRights tend to be an exercise in pain: They stack overflow distressingly soon. So let’s base our collections API on foldLeft: Most designs of it are tail recursive (or iterative. Sigh). As well as dealing with JVM limitations this lets us develop a very high performance interface, and we can do pleasant things like dealing with very large lazy collections in constant space[1]

So:

trait Foldable[T]{
 def foldLeft[S](s : S)(f : (S, T) => S) : S;
}

There. That’s a nice basic design.

What methods can we add to this?

Well the obvious thing we can do to placate the Java programmers:

def foreach(f : T => Unit) = foldLeft[Unit](())((s, t) => f(t))

Now let’s define some stuff we care about. One handy method on Iterable is find. So let’s add that.

def find(f : T => Boolean) : Option[T] =
   foldLeft[Option[T]](None)((s, t) =>
     s match {
       case Some(s) => Some(s);
       case None if f(t) => Some(t);
       _ => None
  })

Nice, isn’t it?

So what happens if we do Stream.from(1).find(_ > 10).

It loops forever. :-(

Unfortunately our nice efficient foldLeft based implementation always eats the entire collection. This is sad: We can’t deal with infinite collections.It’s also really inefficient for large collections where we don’t need the whole thing.

So let’s add a way of stopping when we’re done:

def foldLeft[S](s : S)(f : (S, T) => Option[S]) : S;

The idea is that the Option is used to signal a stop condition. We return a None when we’re done consuming the collection, and the foldLeft stops.

Rewriting find in terms of this:

def find(f : T => Boolean) : Option[T] =
   foldLeft[Option[T]](None)((s, t) =>
     s match {
       case Some(s) => None;
       case None if f(t) => Some(Some(t));
       _ => Some(None)
  })

Ok. That’s a bit gross. But that’s purely a function of the fact that we were already trying to return an Option[T]. I’m sure for normal code it will look ok.

What happens is that once we’ve already found something, we now know we can stop, so we return a None to indicate stopping.

Hmm. It would be nice if we could incorporate some of this functionality into find itself. Let’s consider the following example:

(1 to 100000000 by 3).find(_ == 11)

We know this collection is ordered, so once we’ve gotten to 13 we know we’re never going to find 11. But find keeps on trucking till it hits the end of the collection. :-(

So, let’s modify find to support early termination as well. First we change its signature to f => Option[Boolean], and then…

No, sorry. My sense of good taste just revolted.

The problem is we now have to modify all signatures that might possibly want to support early termination of the collection. And they then have to explicitly deal with passing that on to foldLeft in the right way. All just to allow callers to pretend that the collection is smaller than it really is. It would be nice if we could write this functionality once and then have all the higher order functions based on foldLeft inherit it for free.

Agreed?

Well, there fortunately *is* a way to do that. It involves some loss of type safety, but as long as foldLeft is always properly implemented this should be acceptable.

case object Stop extends Throwable;
def stop : Nothing = throw Stop;

Implementations of foldLeft then have to catch Stop and treat that point as the end of the collection.[2]

Now, let’s implement find as follows:

def find(f : T => Boolean) : Option[T] =
   foldLeft[Option[T]](None)((s, t) =>
     s match {
       case Some(s) => stop;
       case None if f(t) => Some(t);
       _ => None
  })

Only a simple modification to our original implementation, but now it correctly consumes only a minimal amount of the collection (actually it consumes one element too many, but that’s because I’ve modified the implementation from the real one we’d use for pedagogical reasons).

And, for free, we can use stop in the argument to find as well:

(1 to 100000000 by 3).find{x => val c = x compare 11; if(c > 0) stop else (c == 0)}

And we no longer continue past the point where we need to, even though we don’t find our element. Yay!

So, hopefully I have persuaded you that left folds with an early termination forms a nice basis for a collection library: Clean, functional, and efficient. What more could you hope for?

Oh, before I finish. One more little bone to throw to our Java friends to make them feel more comfortable with this design:

def break = stop;

[1] See “Towards the best collections API” for a discussion of building an API around this design (note that their solution does involve a bit more than what we talk about here, some of which is linguistically difficult in Scala without continuations).

[2] You may be making squawking noises about performance now. Don’t. It’s fine. But now is not the time for discussion of this.

This entry was posted in Uncategorized on by .