David R. MacIver's Blog: How packages work in Scala

How packages work in Scala

16 July 2009

THIS PIECE IS FULL OF LIES DO NOT TRUST IT

More accurately, its information is out of date and no longer valid. This describes the old behaviour of the Scala package system. Its behaviour has been different from this for some years now, as it turned out most people weren’t reaching the “acceptance” stage I describe below and after enough shouting the behaviour got changed. This is preserved solely for posterity. Do not rely on it for accurate information.

Original piece follows:

Every now and then someone discovers how packages work in Scala. This process typically passes through a number of stages.

1. Confusion: “Hey, guys, I found this weird bug. Can you take a look?”
2. Surprise: “What? It works like that? Really?”
3. Denial: “No, I don’t believe you. This has to be a bug.”
4. Anger: “Dear scala-debate. This is the worst feature in the entire world, and if you don’t agree with me you’re a big poopy head”
5. Acceptance: “Actually, this is quite a neat feature”

Not everyone reaches step 5. Many stay in step 4 permanently, often because they’ve discovered that this interacts poorly with certain conventions they use.

This is particularly unfortunate, because Scala’s package behaviour is actually quite nice. But people don’t seem to be willing to believe this and instead make up all sorts of behaviour which it doesn’t have and never has had, and then get upset when the reality does not correspond to their fiction.

And so, in the hopes of dispelling some of this confusion, I bring to you the reality of how packages work in Scala. Some of this is very basic material, but I’m presenting it in case you’ve not explicitly thought about it in these terms as it will help with the leadup to the actually important part.

Identifiers

You have a bunch of identifiers in scope. These are names for things. It doesn’t matter what they’re names for: They could be vals, defs, packages, objects, etc. So for example suppose I have:

package foo;
object bar;
object baz{
   val kittens = "kittens";
}

Within this file, say within the object bar, we’ve got a bunch of identifiers in scope: we have foo, the package we are in; bar, an object; and baz, another object. We don’t have kittens in scope (except within the object baz).

Within the object baz, everything in scope at the outer level is in scope here, but we’ve introduced the additional identifier kittens.

Note that a package conceptually constitutes one “level”. Everything from your current package is in scope, regardless of how you split it up into files - I could have moved some of the objects above into separate files and nothing would have changed.

Top level identifiers

Packages like foo are “top level” - they live in the global scope. Any file can refer to the identifier foo.

Nesting of packages

In the same way we had an object inside a package and introduced a new scope, we can nest a package inside a package.

package mammals;

package rodents{ 
   class Rat;
}

This places the package “rodents” inside the package “mammals”. In exactly the same way the object did, this inherits everything from the outer scope (and remember: the scope of the package is the scope of everything in it, however it is split into files). So in

package mammals;

class Cat;

package rodents{ 
   class Rat{
     def flee(moggy : Cat) = println("Help, help! Run away! It's " + moggy)
   }
}

the identifiers of the outer scope are available in the inner one.

But this sort of deeply nested package structure gets very ugly to write, so what one tends to do is separate it out to one package in a given file, even the nested ones, and so there’s syntax to support it:

package mammals.rodents;

class Rat{
  def flee(moggy : Cat) = println("Help, help! Run away! It's " + moggy)
}

This is exactly the same as the previous example except we’ve moved Cat to another file. It’s still in scope as before.

Members

Identifiers can have members. These are other identifiers which live on them and can be accessed with a .

For example, to refer to Rat from the package mammals we would refer to it as rodents.Rat.
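Here is a minimal sketch of member access that fits in a single file. It assumes nested objects as stand-ins for the packages mammals and rodents (objects and packages are both just identifiers with members, which is the point of this section). The newRat helper and the return value of flee are made up for the illustration.

```scala
// Objects stand in for the packages here (an assumption, to keep this in one file).
object mammals {
  class Cat

  object rodents {
    class Rat {
      // Cat comes from the enclosing scope, so it needs no qualification.
      def flee(moggy: Cat): String = "Help, help! Run away! It's a cat"
    }
  }

  // From inside mammals, Rat is reached as a member of rodents.
  def newRat: rodents.Rat = new rodents.Rat
}
```

From outside mammals the same class is reached one member at a time, as mammals.rodents.Rat.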

Shadowing

You can reintroduce the same identifier at an inner level. Going back to our first example suppose we had written baz as

object baz{
   val bar = "kittens"
   val kittens = bar
}

Then kittens would still contain the string “kittens”, as it refers to the definition of bar in the current scope not the outside one. Outside of baz, bar would still refer to the object.
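A small sketch of this shadowing, again assuming an object named foo as a stand-in for the package so that it can be tried in one file. The object outside and its val stillTheObject are hypothetical names added only to show what bar means outside baz.

```scala
// An object stands in for the package foo (an assumption, to keep this in one file).
object foo {
  object bar

  object baz {
    val bar = "kittens" // shadows the outer object bar inside baz
    val kittens = bar   // picks up the inner val, so this is a String
  }

  object outside {
    val stillTheObject = bar // outside baz, bar is the object again
  }
}
```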

An important aspect of this: You can shadow packages just like anything else!

Suppose we have

package foo{
   object baz;
   package foo{
     object baz;

     object stuff{
       val it = foo.baz;
     }
   }
}

Then “it” points to the innermost baz, not the outermost one: We’ve shadowed the definition of foo.

And this is where the problem lies.

Suppose I have

package net.liftweb{
   object AwesomeWebWidget{
      def doStuffWith(url : java.io.File) = ...
   }
}

and someone comes along (remember this doesn’t have to be in the same file - it can even be in a jar) and introduces

package net.java.kittens;

class Kitten;

Now the lift code will no longer work! The problem is that what we have actually looks like this:

package net{
   package java{
     package kittens{
       class Kitten;
     }
   }

   package liftweb{
      object AwesomeWebWidget{
         def doStuffWith(url : java.io.File) = ...
      }
   }
}

The problem is that we have a different java identifier in scope than the one we wanted. It actually refers to the java identifier that we acquire from the net package, rather than the base java that lives in the root as desired. This is the problem that sparked the latest “discussion” in scala-debate on this subject.

The solutions

One thing which everyone immediately leaps to propose is to change the way imports work in Scala. Hopefully the above should have demonstrated that this wouldn’t help: I have not mentioned the word “import” anywhere in this explanation. So we can safely discard this as a non-solution.

The primary current solution is, unfortunately, a bit of an ugly one. When you want to say “the java at the root and I really damn mean it” you can refer to it as _root_.java.io.File. Adding this to your fully qualified names will force it to refer to the right one. Many people have taken to using _root_ on all their imports to fully qualify them. Personally I don’t feel the need (I don’t use Java reverse name conventions though, so I rarely run into the negative aspects of this behaviour).
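As a rough sketch of the _root_ workaround, here is the net example again in single-file form. It assumes nested objects as stand-ins for the packages net, net.java and net.liftweb, and the body of doStuffWith (returning the file name) is invented for the illustration, since the original leaves it elided.

```scala
// Objects stand in for the packages (an assumption, so the sketch fits in one file).
object net {
  object java {
    object kittens { class Kitten }
  }

  object liftweb {
    object AwesomeWebWidget {
      // Writing java.io.File here would not compile: java now means net.java,
      // which has no member io. The _root_ prefix starts the lookup at the root.
      def doStuffWith(file: _root_.java.io.File): String = file.getName
    }
  }
}
```

Only the references that are actually captured need the prefix. Anything that does not start with a shadowed name resolves as usual.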

Some people have taken to fully qualifying all their imports to prevent this sort of accidental shadowing. Personally I find this unnecessary. My preferred solution is to avoid the reverse domain name convention: if your top level package is not something common like com or net, it is much harder for other packages to be accidentally injected into your scope like this.

Other solutions are currently under discussion in scala-debate, so some of this may be prone to change.

Comments

Miles Sabin on 2009-07-16 18:28:23:

All great stuff apart from your fixation on wanting to abandon Java’s reverse domain name convention ... which is completely impractical for anyone who isn’t prepared to leave the Java tools and libraries ecosystem (I know you disagree ... you’re wrong).

So for those of us who rule that out, that leaves the _root_ prefix as the only currently viable option. That’s really very unpleasant, so let’s just hope that one of the more sensible alternatives that’s being discussed on scala-debate (like yours for example ;-) gets implemented in time for 2.8.0. Fortunately I think that’s quite likely.

david on 2009-07-16 18:31:26:

I hardly “fixated” on it. I mentioned it once, parenthetically, right at the end.

Further you’ve yet to provide a single example of where it actually harms Java interoperability to not use the reverse name convention. Repeated assertion is not the same as argument.

Jorge Ortiz on 2009-07-16 18:36:55:

Abandoning Java’s reverse domain name convention harms social interoperability with Java. It breaks with the expected social norms, albeit not the technical norms.

david on 2009-07-16 18:40:34:

So does using two space indents, not upper cased names for numeric types, having non alphanumeric method names and comparing objects with ==.

If you want to conform entirely to Java norms you’re more than welcome to write Java. In the meantime, I’ll happily carry on my way and not suffer from the bugs that people who do things for no good reason other than that it’s the Java way are forced to endure.

Miles Sabin on 2009-07-16 18:40:42:

Codebases which are migrating from Java to Scala by conversion on a class-by-class basis preserving the existing package structure will run into the scope capture problem.

This is an incredibly common case and one we don’t want to discourage in any way. It’s so common that I plan to support it via a “Convert compilation unit to Scala” refactoring in Eclipse if I can persuade paulp to pick up Scalify again.

I really don’t know what else to say ...

david on 2009-07-16 18:45:12:

Yes. That is indeed the primarily harmful use case.

Of course, given that almost everyone who has ever complained about this issue has done so because they’ve encountered it when adding or updating a dependency in an all Scala project, I think you’re overstating the problem here.

Further, note that this post contains no advocacy for the notion. I merely explained what was going on and that it is not a problem that I suffer from because I don’t follow a convention that induces it. So it appears to be you who is fixating on the idea, not me.

david on 2009-07-16 18:50:16:

By the way, the “certain conventions they use” part *wasn’t* a jab at the reverse domain name one specifically. There are other conventions which people use which fall afoul of the same issue. For example a semi-common one (or at least semi-common to try until you discover why it sucks) is to put your scala code in a “scala” subpackage of your project. Unfortunately shadowing kicks your ass when you try this.

ReadingAdept on 2009-07-16 19:17:13:

I fought with myself whether to write that in the scala debate but that debate was actually very much exciting (popcorn anyone?) and also informative/interesting.

Some stand corrected, some slaughtered, but in the end we did all win, didn’t we?

Arnold deVos on 2009-07-17 09:53:49:

Nice intro to the article! I went through five stages when I first started with scala. But the trigger in my case is one I have not seen mentioned in the discussion: I live in Australia. We have US-style domains suffixed .au. I write my stuff in packages beginning with au.com.… so imagine my puzzlement when I tried to import com.sun.….

Thomas on 2009-07-17 15:16:45:

What are the package names schemes you use? I ran into exactly this problem and while I think we can leave the java part in the DNS type approach, maybe my Scala part could benefit from a different naming convention. So would you care to explain your naming conventions?

david on 2009-07-17 15:31:06:

My naming conventions are basically as follows:

- Public projects simply get shortened to projectname as the top level package
- Private projects get shortened to companyname.projectname
- Package structures are kept fairly shallow

It’s almost functionally equivalent to just dropping the TLD, so

com.foo.bar

becomes

foo.bar

Bhaskar Maddala on 2009-07-18 15:15:51:

It is perfectly acceptable for me to want to live long, actually really long. :)

package woodoo.charlaton{
  object trickery {
    private def doSomeMagic = java.lang.Math.random * 100
    def liveLong = if(doSomeMagic == 1000) "You will live long" else "Die Now!!"
  }

  package java.lang{
    object Math{
      val random = 10
    }
  }
}

It occurs to me that most people think of packages in Scala as a means of code organization (coming from a Java world, this is the immediate model) rather than as nested structures, and are as a result put off by what they might consider to be name shadowing.

However I am not certain that any of your proposed solutions on naming conventions

- Public projects simply get shortened to projectname as the top level package
- Private projects get shortened to companyname.projectname
- Package structures are kept fairly shallow

resolves the issue. They just make it easier for programmers to use the feature without understanding it, but in my experience this results in more poorly written code in the long run.

If such conventions were to be used on a code base, I would add to the above proposed solution on naming conventions

- All packages must contain an object or class type

As an example
net.webbify.controller would become just controller dropping the TLD and projectname (if webbify has no object or type classes)

This enforces your third criterion of keeping package structures shallow.

You indicate that most people have taken to fully qualifying imports. This is only required in the cases where the type being referenced is from a third party source.

I like my packages as they are now

Gary on 2009-08-08 16:58:52:

The article starts by saying that people often don’t get to stage 5, acceptance, and that the package system is actually “quite nice.” But the article only goes on to describe the problems of shadowing and package hijacking. What are the positives that make these risks worthwhile? What are the practices enabled by the current scheme that are quite nice? Take us to stage 5!

ittay on 2009-10-01 06:04:47:

I agree with @Gary. As far as I could see, this feature means only something I can trip over and requires me to think more carefully about how I name packages. Where does it come in handy?

Also, why couldn’t packages be “reopened” instead of being always shadowed? That way, in the example, kittens would have been added to the existing ‘java’ namespace.

Dean Thompson on 2011-04-01 15:31:11:

Here’s the Scala 2.8 solution: http://www.artima.com/scalazine/articles/chained_package_clauses_in_scala.html

Best of drmaciver.com | David R. MacIver on 2013-08-26 09:53:32:

[…] How packages work in Scala. This is the old system. I still think it was a good system in itself, but unfortunately it was completely shafted by the combination of the Java conventions and the way package resolution from the file system worked. […]