Tag Archives: type systems

Existential types in Scala

With 2.7 of Scala on the way, people are being exposed to Java wildcards more and more, which translate to Scala existential types. Unfortunately no one seems to understand these (including me at first!) and had previously let them go largely ignored, and now everyone is getting confused.

Here’s a brief introduction.

scala> def foo(x : Array[Any]) = println(x.length);
foo: (Array[Any])Unit

scala> foo(Array[String]("foo", "bar", "baz"))
:6: error: type mismatch;
 found   : Array[String]
 required: Array[Any]
       foo(Array[String]("foo", "bar", "baz"))

This doesn’t compile, because an Array[String] is not an Array[Any]. You can put 1 into an Array[Any], but not into an Array[String]. Nonetheless, it’s completely typesafe – we’ve only used methods in foo which would work for any Array[T]. How do we fix this?

Here’s one way:

scala> def foo[T](x : Array[T]) = println(x.length)
foo: [T](Array[T])Unit

scala>> foo(Array[String]("foo", "bar", "baz"))
3

We’ve parameterised the method by T in order to make it accept any T. But now we have a superfluous type parameter on our method. This may not seem like a big deal, and it’s usually not, but it can add up if you’re not careful (and can be particularly annoying when for some reason the type checker is no longer able to infer a single one of your type parameters and you have to supply all of them). It’s also not really what we mean – we mean “I want an Array, and I don’t care what type of things it contains”

This is exactly what existential types are for.

scala> def foo(x : Array[T] forSome { type T}) = println(x.length)
foo: (Array[T] forSome { type T })Unit

scala> foo(Array[String]("foo", "bar", "baz"))
3

This is quite verbose, I know. There’s a shorthand, Array[_], but this has some unfortunate unintuitive behaviour. I’ll explain this later.

Sometimes we want to act on a more specific type, but don’t care exactly what type it is. For example suppose we wanted this to work on any CharSequence and do something more complicated to each argument. e.g.

scala> def foo(x : Array[T] forSome { type T <: CharSequence}) = x.foreach(y => println(y.length))
foo: (Array[T] forSome { type T <: java.lang.CharSequence })Unit

scala> foo(Array[String]("foo", "bar", "baz"))
3
3
3

The type arguments in an existential type can have upper and lower bounds like normal type declarations. They can't have view bounds, presumably due to technical limitations.

So we've seen how these are used, and that was relatively nonconfusing (I hope!). Let's pin down exactly what these mean.

Suppose we have a type constructor M. In other words, for a type T, M[T] is a type, but M is *not* itself a type. M could be List, Array, Class, etc. M[T] forSome { type T; } is the type of all things for which there is some T such that they are of type M[T]. So an Array[String] is of this type, because we can choose T = String, as is an Array[Int], etc.

If we add bounds, all we do is restrict the range that T can lie in. An Array[String] is not an Array[T] forSome { type T <: Number; } because the only possible choice of T (String) is not a subtype of Number

Now, all you need to do in order to understand a given existential type declaration is to apply this rule rigorously. But this can be hard, especially because precedence matters in subtle ways! I'll walk you through some examples.

T forSome { type T; }

This is the type of all things for which there exists some T such they are T. Wha?

Think about it for a second. It's the type of all things for which there exists a type such that they are of that type. i.e. it's a long winded way of writing the type of all things, Any. This is important, and it often trips people up when they write subtly the wrong thing. Considering the following two types:

Array[T] forSome { type T; }
Array[T forSome { type T; }]

They look almost identical, but they're in fact very different. The first is the type of all arrays, whatever their type parameter. The second is Array[Any]

Let's take another example, because this is the one which seems to come up a lot. We have a Map. We want it to map classes to something. Let's say Strings. What type do we use?

Here. Pick one:

Map[Class[T forSome { type T}], String]
Map[Class[T] forSome { type T}, String]
Map[Class[T], String] forSome { type T}

Which did you pick?

The correct answer is "Map[Class[T] forSome { type T}, String]", or to save you searching for ]s, "the middle one".

Why? Well, the first one is a Map[Class[Any], String]. Class is invariant in its type parameters. So the only Class[Any] is in fact classOf[Any] (this is basically the same as Object.class in Java). So that's not very useful. Similarily, the third one is the supertype of all map types such that there is some T such that they are a Map[Class[T], String]. So again, we've got some fixed class type for keys in the map - it's just that this time we don't know what type it is. The middle one however has keys of type Class[T] forSome { type T }. That is, its keys are classes which are allowed to have any value they want for their type parameter. So this is what we actually wanted.

Now, the final confusing point. As I mentioned, we have this shorthand use of _ for wildcards. So we can write Array[_] to mean Array[T] forSome { type T}. That's nice. So what happens if we try to use this in the above and write Map[Class[_], String]? It turns out, we get "Map[Class[T], String] forSome { type T}". The wildcards always bind to the outermost level of the type expression. This is, unfortunately, almost never what you want in cases where it affects anything. There's been some discussion about changing it. I don't know if it will go anywhere.

Anyway, hopefully this has made some sense of things for you. It's a really confusing subject when you first encounter it, but once you've got it straight in your head it's not too bad. It would be nice if it could be simpler, but I'm not really sure what the best way to do this actually would be.

This entry was posted in programming and tagged , , on by .

Minor revelation about Scala existential types

So, I’ve just realised why Scala’s existential types are several orders of magnitude more powerful than Java’s wildcards.

   def swap(xs : Array[T forSome { type T; }]) = xs(0) = x(1); 

The type is not completely unknown, and is persisted across method invocations, so for a given fixed instance you can make use of that to perform operations that would be impossible with wildcards. In particular the following can’t work:

  public void swap(List<?> xs){ 
    xs.set(0, xs.get(1));
  }

This can’t work, because Java has no way of determining that the two wildcards in xs.set and xs.get are the same.

This entry was posted in programming and tagged , , on by .

No, seriously, why Scala?

Recently an article called Why Scala? was posted on reddit. It’s an ok introduction to the language, but the very fair observation was made that it’s much more of a “What is Scala?” than a “Why Scala?”. I thought I’d share my thoughts on the subject. Mainly because I like hearing (reading) myself talk (write). :-)

Quick background: I initially learned to program in standard ML while at university (self taught, mostly, with the help of some friends doing computer science. I was doing maths). On graduating I then switched tracks entirely and started doing Java web development in a small software firm in London (I’ve since switched again, but I’m still doing Java professionally). I’ve also dabbled and read a lot with computer science and programming languages in my spare time since then, filling in the gaps that not having done any real computer science at university left me.

My train of thought on the language switch from ML to Java was basically:

a) Wow, this is different.
b) Ugh. Where are my higher order functions?
c) Where are all these random exceptions coming from?? I’ve compiled the code successfully, isn’t it supposed to work now?
d) Hmm. But there’s some useful stuff here too.

Scala’s a nice way to scratch both itches, and adds some very interesting features and functionality of its own. It’s not my favourite language (I don’t really have one. All languages suck. It’s just that some of them suck less in interesting ways), but it has a lot I like. Here’s a brain dump of some of it.

Things I like:

Object oriented programming

I know, it’s so entrenched it’s past even bothering with its buzzword status. But object oriented programming has a lot of advantages for medium to large scale composition. It has some disadvantages, and frankly sucks at small scale composition of functionality (which is where functional programming shines), but it allows for some very nice pluggability.

Module oriented programming

ML has higher order modules. I never really used them much when I was programming it more often (mostly because I was only writing enough code to do some simple maths projects. I never wrote anything large scale), but having looked into them in more details since they’re really powerful. They’re essentially a different take on the composition that object orientation provides. Where object orientation resolves everything dynamically, ML’s higher order modules resolve everything statically. This introduces some limitations in flexibility but makes up for them in power and type safety – they provide a much more flexible and interesting abstraction over a type than mere subclassing and interfaces can.

Scala has both. Further, it has both and lo and behold they are the same thing. Objects are modules, and can declare their own types (note: This is much more than just declaring an inner class in Java is), imported, etc. Modules are objects and can be instantiated at runtime, extended, etc. You lose a bit of the static guarantees that ML modules but you gain a lot of flexibility from both sides.

Static Typing

I’ve written too much Java to not like static typing.

Wait, I know that sounds like a non sequitur, but read on.

I’ve written too much Java and seen flagrantly stupid and really subtle runtime errors that should never have made it past the compiler coming out of it to not like static typing. NullPointerException, ClassCastException, argh.

If you’ve written enough code in a language like ML, OCaml or Haskell you will know that the compiler is your friend. And, like all good friends, it will yell at you if you do something stupid and then help you pick up the pieces.

Scala doesn’t quite manage that. If you write code in just the right way you can achieve that level of guarantee (and in some cases, more. But that tends to be the result of abuse of the type system by deranged maniacs), but the combination of subtyping and some Java interoperability decisions mean that it’s not quite as good. It’s not bad though.

So: I like object oriented programming, I like static typing. It logically follows that I must like statically typed object oriented languages, right? Well, in principle, yes. But Scala is the first one I’ve met with a type system that didn’t suck. Scala’s traits (a sort of mixin) are so much better to work with than interfaces, the generics work properly, provide variance annotations, etc. A reasonable subset of the types are inferred. Compared to the type systems of Java, C# and C++ it’s a dream (it’s not as nice as the type systems of the statically typed functional languages I know of. Subtyping seems to cause issues, with a lot of research still needed to make it work well, and Scala seems to have largely ignored what prior work there was Hindley-Milner style type systems with subtyping)

Functional programming

You’ve all been dreading this section. “Oh no. Now he’s going to enthuse about how marvelous functional programming is and how it’s going to cure cancer”. Nope. Can’t be bothered. Functional programming is nice. If you don’t believe that, I’m not going to try to convince you of it. Scala’s support for functional programming is ok. It has some warts, but it also has some nice points, and it generally works well and isn’t too verbose. I’m not going to get any more excited about its presence than I am about the fact that my bike has wheels (but I’d be pretty pissed off if my bike didn’t have wheels). Higher order functions, pattern matching, etc. It’s all there. It works. Moving on swiftly…

Implicits

Scala offers a bag of features under the keyword ‘implicit’. This is one of those things that makes you go “Oh, that’s cute” when you first see it and then go “Wow, that’s powerful” six months later.

Essentially implicits give you statically guaranteed and provided dynamic scoping. You say “I need a Foo. I don’t care where it comes from”, the compiler says “Here you go” or “Sorry, no Foos today”. These can be objects, implicit conversions between types (You know the way Ints get implicitly converted to longs, double, etc in Java? Scala does that too, but it’s all programmer definable. They’re just library functions in scala.Predefined). If you remember what I said about Scala objects being modules and you’ve read this paper a little light might just have gone on in your brain. If you haven’t read it and don’t want to, here’s the summary version: Implicit function arguments + first class modules gives you something that looks and quacks very much like Haskell type classes (yes, I know this isn’t actually what the paper says, but it follows from it). Mmm.

These are the big things to like about Scala. Here are a few little things:

  • Sane constructor/class semantics. If you’ve written a lot of Java there’s a good chance you hate its constructor system. Scala’s is much nicer.
  • Expression oriented code. Everything is an expression. You can form compound expressions trivially – { val foo = bar(); baz(foo, foo); } is an expression which evaluates to baz(foo, foo).
  • Sanely uniform scope. Pretty much anything you can do inside a method you can do inside an object and vice versa. Things are for the most part lexically scoped in the right way.
  • The primitive/object divide is much less irritating. Primitives get a few special treatments at the language level, but mostly they’re just objects. When things should compile to use primitives, they do. When the primitives need to be boxed, they will be. It’s almost entirely transparent.
  • Performance. Scala generates very good (well. ‘good’. Java-like) bytecode, which means it gets to take advantage of most of the optimizations the JVM is willing to throw its way. Further it puts a reasonable amount of its own work into performing optimisations on the bytecode, etc so you get those nice juicy abstractions without much overhead. There’s essentiall y no performance penalty for choosing Scala over Java

etc.

Scala’s far from perfect. It has some syntactic weirdnesses, a few issues carried over from Java, a moderately buggy compiler and a host of little features and edge cases that are really hard to keep in your head. However, I find that these issues don’t actually do more than annoy you from time to time. The core language is powerful and very useful for just sitting down and writing good code in.

Static vs. Dynamic Typing

Rah. Dynamic typing is evil. It lets you do totally unsafe things.

Err, no. Wait, that’s not right. Static typing is evil. It’s ugly and gets in the way and real programmers don’t make type errors anyway, do they? Are you saying you’re not a real programmer???

No, wait. Umm…

I’m so confused…

Want to know what I really think about static vs. dynamic typing? (no you don’t. I’m going to tell you anyway).

I wish the entire goddamn argument would go away.

Seriously, shoo. Buzz off.

There’s no reason there should be a one true language in which you do everything. None. Sometimes language Foo is better, sometimes language Bar is better. Given this, there’s no reason Foo can’t be dynamically typed and Bar statically typed.

Neither is better. Each have their strengths and weaknesses. People have their preferences, tasks have their requirements. Sometimes you’re better off choosing one or the other, sometimes you’re more comfortable choosing one or the other. When there’s a compelling reason to make a decision between the two (which is so much less often than the fanatics on both sides of the argument want you to think), follow that reason. When there’s not, use whichever you prefer.

But, in the mean time, shut the fuck about it. Far too many good and interesting discussions about languages turn into poo flinging between the type system monkeys. Shut up, grow up and get over it. There are more interesting things to talk about in the subject than whether certain operations should fail at run or compile time.

That is all.

This entry was posted in programming and tagged , on by .