
Minor revelation about Scala existential types

So, I’ve just realised why Scala’s existential types are several orders of magnitude more powerful than Java’s wildcards.

   def swap(xs : Array[T forSome { type T; }]) = xs(0) = xs(1); 

The type is not completely unknown: it stays fixed for a given instance across member accesses, so you can make use of that to perform operations that would be impossible with wildcards. In particular the following can’t work:

  public void swap(List<?> xs){ 
    xs.set(0, xs.get(1));
  }

This can’t work because Java has no way of determining that the two wildcards in xs.set and xs.get stand for the same type.
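The standard safe workaround (in either language) is a polymorphic helper method that names the element type once, so both the read and the write are checked against the same type. A minimal sketch (names and body are mine, not from the post):

```scala
object SwapDemo {
  // T is fixed for the whole method body, so reads and writes
  // are checked against the same element type.
  def swap[T](xs: Array[T]): Unit = {
    val tmp = xs(0)
    xs(0) = xs(1)
    xs(1) = tmp
  }
}
```

For example, `SwapDemo.swap(Array(1, 2))` leaves the array as `Array(2, 1)`.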

This entry was posted in programming and tagged , , on by .

Why not Scala?

I thought I’d follow up on my previous post on why one would want to use Scala with one on why you wouldn’t. I’m definitely planning to continue using it, but it would be dishonest of me to pretend it was a perfect language.

I’m not going to cover the usual ones – weak tool support, difficulty of hiring Scala programmers, etc. These are pretty standard and will be true in most ‘esoteric’ languages you care to name. They’re certainly important, but not the point of this post. I’m just going to focus on language (and implementation) issues.

You’re looking for a functional language

Scala is not a functional programming language. It has pretensions of being so, and it has adequate support for functional programming, but it only goes so far. It’s got better support for functional programming than C#, Ruby, etc. but if you compare its functional aspects to ML, Haskell, OCaml, etc. you’ll find it sadly lacking. Problems include:

  • Its pattern matching is really rather cumbersome.
  • An annoying distinction between methods and functions. Scala’s first class functions are really no more than a small amount of syntactic sugar around its objects. Because Scala’s scoping is sane this isn’t particularly an issue, but it occasionally shows up.
  • The handling of multiple arguments is annoying. It doesn’t have the pleasant feature of Haskell or ML that every function has a single argument (multiple arguments are encoded as either tuples or via currying). Admittedly this isn’t a prerequisite of a functional language – e.g. Scheme doesn’t do it – but it’s a very big deal in terms of typing and adds a nice consistency to the language. I’m not aware of any statically typed functional languages which *don’t* do this (although the emphasis between tupling and currying varies from language to language).
  • Almost no tail call elimination worth mentioning. A very small subset of tail calls (basically self tail calls – the ones you can obviously turn into loops) are eliminated. This is more the JVM’s fault than Scala’s, but Martin Odersky himself has shown that you can do better (although admittedly it comes with a performance hit).
  • The type inference is embarrassingly weak: e.g. recursive methods won’t have their return type inferred, and even the inference that is there is less than reliable.
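To make the inference point concrete, here is a minimal example of my own: a recursive method only compiles once you annotate its return type.

```scala
object InferenceDemo {
  // The commented-out version does not compile; scalac cannot infer
  // the return type of a recursive method:
  //   def fact(n: Int) = if (n <= 1) 1 else n * fact(n - 1)
  def fact(n: Int): Int = if (n <= 1) 1 else n * fact(n - 1)
}
```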

Compiler stability

The compiler is buggy. It’s not as buggy as I sometimes get the impression it is – I’ve definitely claimed a few things to be bugs which turned out to be me misunderstanding features – but it’s buggy enough that you’ll definitely run into issues. They’re rarely blockers (although sometimes they are; Jan Kristen has run into a few in his recent experiments with Wicket + Scala), but more importantly the bugginess means you really can’t trust the compiler as much as you’d like to. When something goes wrong it’s not always certain whether it’s your fault or the compiler’s. This is a big deal when one of the selling points is supposed to be a type system which helps you catch a wide class of errors.

Language consistency

The language has a lot of edge cases. These can be really difficult to wrap your head around, and can be really annoying to remember.

Let’s take an example. Variables. Simple, eh? Well, no.

A variable (local or field) can be a function (or constructor) parameter, a val, or a var. A val is a definition – it can’t be assigned to after the definition is made. A var is a normal mutable variable like in Java. A function parameter is almost like a val, except for the parts where it isn’t. Additionally, a function parameter can also be a var or a val. But it doesn’t have to be. Variables can be call by value (normal), call by name (the expression is evaluated each time you reference its value) or lazy (the expression is evaluated the first time you need its value and never again). But only vals can be lazy. And function parameters can’t be lazy, even if they’re also vals (I don’t understand this one. It seems obviously stupid to me). Meanwhile, only function parameters can be call by name – you can’t assign them to vars or vals (a no argument def is the equivalent of a call by name val).

Clear as mud, eh? Now, granted I wrote the above to make it sound deliberately confusing (it’s probably owed a blog post later to make it seem deceptively simple), but it’s a fairly accurate representation of the state of affairs.
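A small sketch of the legal and illegal combinations described above (my own illustration):

```scala
object VariableDemo {
  val a = 1                                    // definition: cannot be reassigned
  var b = 2                                    // ordinary mutable variable
  lazy val c = { println("computed"); a + b }  // evaluated on first use, then cached
  // lazy var d = 4                            // illegal: only vals can be lazy
  def twice(x: => Int): Int = x + x            // call by name: x re-evaluated per use
}
```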

Here’s another one (it’s related to the arguments issue). Consider the following snippet of code:

def foo = "Hello world";
println(foo());

def bar() = "Goodbye world";
println(bar);

Pop quiz: Does this code compile? If not, which bit breaks? No cheating and running it through the compiler!

Answer: No, it doesn’t. Because foo was defined without an argument list, it can’t be invoked as foo(). However, despite bar being defined with an (empty) argument list we can invoke it without one.

I could keep going, but I won’t. The short of it is that there are a lot of these little annoying edge cases. It seems to give beginners to the language a lot of grief.

Too much sugar

Scala has a lot of syntactic sugar. Too much in my opinion. There’s the apply/update sugar; unary operators defined by prefixing a method name with unary_; general overloaded assignment (which, as I discovered when testing, only works in the presence of an associated def to go with it – another edge case). Operators ending in : are right associative. Case class constructors can be written infix in pattern matches but not in ordinary application. Etc. It’s hard to keep track of it all, and most of it is annoyingly superfluous.
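A couple of these in action (a minimal sketch of my own):

```scala
object SugarDemo {
  val xs = 1 :: 2 :: Nil    // ends in ':', so right associative: 1 :: (2 :: Nil)
  val arr = Array(1, 2, 3)  // apply sugar: Array.apply(1, 2, 3)
  arr(0) = 10               // update sugar: arr.update(0, 10)

  case class Cons(head: Int, tail: List[Int])
  def headOf(c: Cons) = c match {
    case Cons(h, _) => h    // constructor syntax in a pattern
  }
}
```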

Lack of libraries

Yes, yes, I know. It has all of the Java libraries to play with. And this is great. Except… well, they’re Java libraries. They’re designed with a Java mindset, and they can’t take advantage of Scala’s advanced features. Implicit conversions, and a number of other tricks, are quite useful for making an API more palatable, but there’s a strong danger that what you end up with isn’t much more than Java with funny syntax. Getting much more than that requires a reasonable amount of porting work to produce a good API for your use.

All in all, I find these add up to just a bunch of annoyances. It’s still my preferred language for the JVM, but depending on how you weigh your priorities they might be more significant for you. Even I occasionally find myself getting *very* irritated with some of these.


Variance of type parameters in Scala

This is just a quick introduction to one of the features of Scala’s generics. I realised earlier on IRC that they’re probably quite unfamiliar looking to people new to the language, so thought I’d do a quick writeup.

What does the following mean?

  trait Function1[-T1, +R]

It’s saying that the trait Function1 is contravariant in the parameter T1 and covariant in the parameter R.

Err. Eek. Scary words!

Let’s try that again.

A Function1[Any, T] is safe to use as a Function1[String, T]. If I can apply f to anything I can certainly apply it to a String. This is contravariance. Similarly, a Function1[T, String] can be quite happily treated as a Function1[String, Any] – if it returns a String, it certainly returns an Any.

So, Foo[+T] means that if S <: T then Foo[S] <: Foo[T]. Foo[-T] means that if S <: T then Foo[T] <: Foo[S] (note the swapped direction). The default, Foo[T], is called invariant: Foo[S] is not a subtype of Foo[T] unless S == T.

Examples of this sort of behaviour abound. Covariance is more common than contravariance, because immutable collections are almost always covariant in their type parameters. An immutable.List[String] can equally well be treated as an immutable.List[Any] – all the operations are concerned with what values you can get out of the list, so can easily be widened to some supertype.

However, a mutable.List is *not* covariant in its type parameter. You might be familiar with the problems that result from treating it as such from Java. Suppose I have a mutable.List[String], upcast it to a mutable.List[Any] and now do myList += 3. I’ve now added an integer to a list of Strings. Oops! For this reason, mutable objects tend to be invariant in their type parameters.
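The contrast in code (a sketch of mine, using the immutable List and the mutable ListBuffer):

```scala
import scala.collection.mutable.ListBuffer

object VarianceDemo {
  val strings: List[String] = List("a", "b")
  val anys: List[Any] = strings        // fine: immutable List is covariant (+A)

  val buf = ListBuffer("a", "b")
  // val anyBuf: ListBuffer[Any] = buf // won't compile: ListBuffer is invariant,
  //                                   // otherwise anyBuf += 3 would corrupt buf
}
```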

So, we have three types of type parameter: Covariant, contravariant, invariant. All three crop up and are quite useful.

But there are safe ways to treat mutable objects covariantly. Suppose I want someone to pass me an array of Foos, and I have no intention of mutating it. It’s perfectly safe for them to pass me an array of Bars, where Bar extends Foo. Can I do this?

Well, this can indeed be done. We could start by doing this:

  def doStuff[T <: Foo](arg : Array[T]) = stuff;

So we introduce a type parameter for the array. Because T will be inferred in most cases, this isn't too painful to use, but it can quickly cause the number of type parameters to explode (and you don't seem to be able to let some type parameters be inferred and some be explicitly provided). Further, we only care about the type parameter in one place. So, let's move it there.

  def doStuff(arg : Array[T forSome { type T <: Foo }]) = stuff;

This uses Scala's existential types to specify that there's an unknown type for which this holds true. This is effectively equivalent to the previous code, but narrows the scope of the type parameter.

The equivalent using Java style wildcards would be:

  def doStuff(arg : Array[? <: Foo ]) = stuff;

But this isn't legal Scala. This is unfortunately a case of Scala being more verbose than the Java equivalent. However, it's not all bad - because of the explicitly named type inside the forSome, you can express more complicated type relationships than wildcards allow for. For example the following:

  def doStuff(arg : Array[T forSome { type T <: Comparable[T]}]) = stuff;
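To get a runnable feel for what such a bound buys you, here is the plain type-parameter analogue of that signature with a body filled in (the body and names are mine; the post leaves it as stuff):

```scala
object BoundDemo {
  // T is fixed for the whole method, so elements can be compared freely.
  def maxOf[T <: Comparable[T]](arg: Array[T]): T =
    arg.reduceLeft((a, b) => if (a.compareTo(b) >= 0) a else b)
}
```

For example, `BoundDemo.maxOf(Array("a", "c", "b"))` yields `"c"` (String implements Comparable[String]).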

And that's about it for variance in Scala. Hope you found it useful.


Amusing discovery of the day

Java bytecode lets you overload method names based on the return type as well as the argument types, even if the language doesn’t.


Type classes in Scala

Some backstory on this post: I got about halfway through writing it before realising that in fact it didn’t work because of a missing feature. I sent an email to the mailing list about this feature and after some discussion it was concluded that in fact this missing feature was a failure to meet the specification and that it had been fixed in 2.6.1. There’s still one thing lacking, but more on that later. The post now resumes.

I mentioned in a recent post that Scala could emulate Haskell type classes with its implicit defs and first class modules.

In actual fact, the situation is much happier than that. Implicit defs + first class modules give you significantly more than Haskell type classes (although with a tiny loss of type safety). At least, much more than Haskell 98. In particular, multiparameter type classes and associated types come for free. You also gain a number of other advantages: for example, type class instances are lexically scoped, so you can redefine them locally.

So, how does all this work? I’ll begin with a general introduction to this style of programming with no real reference to Haskell, and then I’ll tie this back in to encoding Haskell type classes at the end of it.

Here’s a class that’s familiar to anyone who has written non-trivial Java:

trait Comparator[T]{
  def compare(x : T, y : T) : Int; 
}

trait Comparable[T]{
  def compareTo(t : T) : Int;
}

Now, let’s define the following method:

def sort[T](array : Array[T])(implicit cmp : Comparator[T]) = stuff

So our sort method can either have a comparator passed to it explicitly or it will pull one from the surrounding environment. For example we could do:

object SortArgs{
  implicit val alphabetical = new Comparator[String]{
     def compare(x : String, y : String) = x.compareTo(y);
  }

  def main(args : Array[String]){
     println(sort(args));
  }
}

This will pick up the alphabetical instance.
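A runnable version of this sketch, with the sort body filled in (the body is my own choice; the post elides it as stuff):

```scala
object SortDemo {
  trait Comparator[T] { def compare(x: T, y: T): Int }

  // Takes its comparator explicitly, or pulls one from the implicit scope.
  def sort[T](xs: List[T])(implicit cmp: Comparator[T]): List[T] =
    xs.sortWith((a, b) => cmp.compare(a, b) < 0)

  implicit val alphabetical: Comparator[String] = new Comparator[String] {
    def compare(x: String, y: String) = x.compareTo(y)
  }
}
```

So `SortDemo.sort(List("banana", "apple", "cherry"))` returns `List("apple", "banana", "cherry")`.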

It would be nice if we could also have defined the more general version:

object SortArgs{
  implicit def naturalOrder[T <: Comparable[T]] = new Comparator[T]{
     def compare(x : T, y : T) = x.compareTo(y);
  }

  def main(args : Array[String]){
     println(sort(args));
  }
}

But this doesn't seem to work. :-/ Hopefully this will be fixed - it seems like wrong behaviour. Matt Hellige came up with the following workaround, but it's not very nice:

object Sorting{
    trait Comparator[T]{
        def compare(x : T, y : T) : Int;
    }

    def sort[T](arr : Array[T])()(implicit cmp : () => Comparator[T]) = null;
    implicit def naturalOrder[T <: Comparable[T]]() : Comparator[T] = null;
    implicit def lexicographical[T]() (implicit cmp : () => Comparator[T]) :
        Comparator[List[T]] = null;

    def main(args : Array[String]){
        sort(args);
        sort(args.map(List(_)))
        sort(args.map(List(_)).map(List(_)))
    }
}

Moreover I haven't the faintest notion of why it works. :-)

However we can also chain implicit defs:

   implicit def lexicographicalOrder[T] (implicit cmp : Comparator[T]) : Comparator[List[T]] = stuff;

So now the following code *does* work:

  def main(args : Array[String]){
    sort(args);
    sort(args.map(List(_)))
    sort(args.map(List(_)).map(List(_)))
  }
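Chaining in full, as a fleshed-out sketch of my own (the post leaves the bodies as stuff):

```scala
object ChainDemo {
  trait Comparator[T] { def compare(x: T, y: T): Int }

  implicit val intCmp: Comparator[Int] = new Comparator[Int] {
    def compare(x: Int, y: Int) = x compare y
  }

  // A Comparator[List[T]] is built, recursively if need be,
  // from whatever Comparator[T] is in implicit scope.
  implicit def lexicographical[T](implicit cmp: Comparator[T]): Comparator[List[T]] =
    new Comparator[List[T]] {
      def compare(x: List[T], y: List[T]): Int = (x, y) match {
        case (Nil, Nil)         => 0
        case (Nil, _)           => -1
        case (_, Nil)           => 1
        case (a :: as, b :: bs) =>
          val c = cmp.compare(a, b)
          if (c != 0) c else compare(as, bs)
      }
    }

  def smallest[T](x: T, y: T)(implicit cmp: Comparator[T]): T =
    if (cmp.compare(x, y) <= 0) x else y
}
```

Here `ChainDemo.smallest(List(1, 2), List(1, 3))` resolves its comparator to `lexicographical(intCmp)` automatically and returns `List(1, 2)`.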

An amazingly nice feature of this, which some encodings of type classes miss, is that you don't need a value of a type in order to select an instance at that type. Take for example the following:

object BinaryDemo{
  import java.io._;
  trait Binary[T]{
    def put(t : T, stream : OutputStream);
    def get(stream : InputStream) : T;
  }

  implicit val utf8 : Binary[String] = null;
  implicit def binaryOption[T] (implicit bin : Binary[T]) : Binary[Option[T]] = null;

  val myStream : InputStream = null;

  def readText(implicit bin : Binary[Option[String]]) : Option[String] = bin.get(myStream);

  readText match{
    case None => println("I found nothing. :(");
    case Some(x) => println("I found " + x);
  }
}

Unfortunately this example betrays a weakness in our encoding. I can't just randomly call "get" like in Haskell's Data.Binary - because I need to invoke it on an instance of Binary[T] I need to ensure at the method level that one is available. There doesn't appear to be a good way of getting access to implicits from the enclosing scope directly.

However, here's a silly hack (essentially what the standard library now provides as implicitly):

   def fromScope[T] (implicit t : T) = t; 

And we get:

  fromScope[Binary[Option[String]]].get(myStream) match{
    case None => println("I found nothing. :(");
    case Some(x) => println("I found " + x);
  }

This works just as well with multiple type parameters. For example, if we wanted to port Java's AtomicArray classes without wrapping everything (although admittedly wrapping everything would be more idiomatic Scala) we could do the following:


object ArrayDemo{
  import java.util.concurrent.atomic._;
  trait AtomicArray[S, T]{
    def get(s : S, i : Int) : T;
    def set(s : S, i : Int, t : T);
    def compareAndSet(s : S, i : Int, expected : T, update : T) : Boolean;
  }

 implicit val long = new AtomicArray[AtomicLongArray, Long]{
    def get(s : AtomicLongArray, i : Int) = s.get(i);
    def set(s : AtomicLongArray, i : Int, t : Long) = s.set(i, t);
    def compareAndSet(s : AtomicLongArray, i : Int, expected : Long, update : Long) = s.compareAndSet(i, expected, update);
  }
}

So, that's multi-parameter type classes. But one thing you'll notice about the above encoding is that it's a bloody nuisance to invoke - you need to know the type of the array class you're using, which is annoying. Far better would be if the trait took care of that. In Haskell terms this would be an associated type.

No problem.

object ArrayDemo{
  import java.util.concurrent.atomic._;
  trait AtomicArray[T]{
    type S;
    def get(s : S, i : Int) : T;
    def set(s : S, i : Int, t : T);
    def compareAndSet(s : S, i : Int, expected : T, update : T) : Boolean;
  }

 implicit val long = new AtomicArray[Long]{
    type S = AtomicLongArray;
    def get(s : AtomicLongArray, i : Int) = s.get(i);
    def set(s : AtomicLongArray, i : Int, t : Long) = s.set(i, t);
    def compareAndSet(s : AtomicLongArray, i : Int, expected : Long, update : Long) = s.compareAndSet(i, expected, update);
  }
}

Scala's classes can have abstract types. So we just encode associated types as those.
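The same trick in miniature, away from the atomics (the example and names are mine):

```scala
object AssocTypeDemo {
  trait Container[E] {
    type Repr                          // the "associated type"
    def empty: Repr
    def insert(r: Repr, e: E): Repr
    def toList(r: Repr): List[E]
  }

  implicit val intContainer: Container[Int] = new Container[Int] {
    type Repr = List[Int]              // the concrete choice, hidden from callers
    def empty = Nil
    def insert(r: List[Int], e: Int) = e :: r
    def toList(r: List[Int]) = r.reverse
  }
}
```

Callers only ever see `intContainer.Repr` as a path-dependent type; for example `intContainer.toList(intContainer.insert(intContainer.insert(intContainer.empty, 1), 2))` gives `List(1, 2)`.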

So, to recap on our encoding:

  class Foo a where
     bar :: a
     baz :: a -> a

becomes

  trait Foo[A]{
     def bar : A;
     def baz(a : A) : A;
  }

  instance Foo Bar where
  stuff

becomes


  implicit val bar : Foo[Bar] = new Foo[Bar]{
    stuff
  }


  instance (Foo a) => Foo [a]

becomes


  implicit def listFoo[A] (implicit foo : Foo[A]) : Foo[List[A]];

And for invoking:

  foo = bar

becomes


  val yuck = fromScope[Foo[A]].bar 

So it's a more verbose encoding, but not a terrible one. And it has some abstraction advantages too. For example:

   sortBy :: (a -> a -> Ordering) -> [a] -> [a]
   sortBy = stuff

   sort :: (Ord a) => [a] -> [a]

becomes

   def sort[A](xs : List[A])(implicit cmp : Comparator[A]) : List[A];

Because our type classes are a form of implicit object passing, we can also use them with *explicit* object passing. Thus we can define behaviour that works equally well with a naturally ordered type and with explicitly provided comparison functions.

This has disadvantages too - you need to be more careful to ensure that you don't accidentally use two different instances of the type class. This isn't a major burden though. The general solution is that when you have something which needs to maintain type class consistency between invocations, you pass it an instance at construction time. Take for example Haskell's Data.Set. In Haskell, getting a new Set (Set.empty) works for any type, but almost all the functions for building sets have a constraint that the type belongs to Ord. In Scala you would require an Ord instance to be passed at Set construction, but after that would not need one (analogous to Java's TreeSet providing a constructor that takes a Comparator).
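A sketch of that construction-time capture (the class name and body are my own):

```scala
object BagDemo {
  trait Comparator[T] { def compare(x: T, y: T): Int }

  // The Comparator is captured once, when the bag is built; later calls
  // cannot accidentally mix in a different instance.
  class SortedBag[T](implicit cmp: Comparator[T]) {
    private var elems: List[T] = Nil
    def add(t: T): Unit =
      elems = (t :: elems).sortWith((a, b) => cmp.compare(a, b) < 0)
    def toList: List[T] = elems
  }

  implicit val intCmp: Comparator[Int] = new Comparator[Int] {
    def compare(x: Int, y: Int) = x compare y
  }
}
```

After `val bag = new BagDemo.SortedBag[Int]` and adding 3, 1, 2, `bag.toList` is `List(1, 2, 3)` regardless of what comparators are in scope at the call sites.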

One thing I haven't covered is type classes which abstract over type constructors rather than types. The reason I haven't covered them is that I've yet to peek into that corner of Scala's type system. However, I assume they work, as Tony Morris has done some stuff with monads in Scala. Also, see this paper (which I've not read yet).
