I'm currently looking at closure implementations in different languages. When it comes to Scala, however, I'm unable to find any documentation on how a closure is mapped to Java objects.
It is well documented that Scala functions are mapped to FunctionN objects. I assume that the reference to the closure's free variable must be stored somewhere in that function object (as is done in C++0x, for example).
I also tried compiling the following with scalac and then decompiling the class files with JD:
object ClosureExample extends Application {
  def addN(n: Int) = (a: Int) => a + n
  var add5 = addN(5)
  println(add5(20))
}
In the decompiled sources, I see an anonymous subtype of Function1, which ought to be my closure. But the apply() method is empty, and the anonymous class has no fields (which could potentially store the closure variables). I suppose the decompiler didn't manage to get the interesting part out of the class files...
Now to the questions:
- Do you know how the transformation is done exactly?
- Do you know where it is documented?
- Do you have another idea how I could solve the mystery?
Let's take apart a set of examples so we can see how they differ. (If using RC1, compile with -no-specialization to keep things easier to understand.)
class Close {
  var n = 5
  def method(i: Int) = i+n
  def function = (i: Int) => i+5
  def closure = (i: Int) => i+n
  def mixed(m: Int) = (i: Int) => i+m
}
First, let's see what method does:
public int method(int);
Code:
0: iload_1
1: aload_0
2: invokevirtual #17; //Method n:()I
5: iadd
6: ireturn
Pretty straightforward. It's a method. Load the parameter, invoke the getter for n, add, return. Looks just like Java.
How about function? It doesn't actually close over any data, but it is an anonymous function (called Close$$anonfun$function$1). If we ignore any specialization, the constructor and apply are of most interest:
public scala.Function1 function();
Code:
0: new #34; //class Close$$anonfun$function$1
3: dup
4: aload_0
5: invokespecial #35; //Method Close$$anonfun$function$1."<init>":(LClose;)V
8: areturn
public Close$$anonfun$function$1(Close);
Code:
0: aload_0
1: invokespecial #43; //Method scala/runtime/AbstractFunction1."<init>":()V
4: return
public final java.lang.Object apply(java.lang.Object);
Code:
0: aload_0
1: aload_1
2: invokestatic #26; //Method scala/runtime/BoxesRunTime.unboxToInt:(Ljava/lang/Object;)I
5: invokevirtual #28; //Method apply:(I)I
8: invokestatic #32; //Method scala/runtime/BoxesRunTime.boxToInteger:(I)Ljava/lang/Integer;
11: areturn
public final int apply(int);
Code:
0: iload_1
1: iconst_5
2: iadd
3: ireturn
So, you load a "this" pointer and create a new object that takes the enclosing instance as its constructor argument. This is standard for any inner class, really. The function doesn't need to do anything with the outer instance, so it just calls the superclass constructor. Then, when calling apply, the generic apply(Object) does the box/unbox tricks and then calls the apply(int) that does the actual math, which here is just adding 5.
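To make the boxing wrapper a bit more concrete, here is a small sketch. The names ApplyDemo and addFive are made up for illustration, and newer compilers emit different bytecode from what is shown above, so take this only as a demonstration of the observable behavior: the same function object can be called with a primitive Int or, through the erased Function1 interface, with a boxed value.

```scala
object ApplyDemo {
  // A function literal that closes over nothing, like `function` above.
  val addFive: Int => Int = (i: Int) => i + 5

  def main(args: Array[String]): Unit = {
    // f(x) is just sugar for f.apply(x).
    println(addFive(20) == addFive.apply(20))

    // Seen through the erased interface, the argument and the result
    // travel as boxed values; the wrapper unboxes, adds, and boxes again.
    val erased = addFive.asInstanceOf[Any => Any]
    println(erased(20))
  }
}
```

The cast is only there to force the call through the generic entry point; ordinary code never needs it.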
But what if we close over the variable inside Close? The setup is exactly the same, but now the constructor of Close$$anonfun$closure$1 looks like this:
public Close$$anonfun$closure$1(Close);
Code:
0: aload_1
1: ifnonnull 12
4: new #48; //class java/lang/NullPointerException
7: dup
8: invokespecial #50; //Method java/lang/NullPointerException."<init>":()V
11: athrow
12: aload_0
13: aload_1
14: putfield #18; //Field $outer:LClose;
17: aload_0
18: invokespecial #53; //Method scala/runtime/AbstractFunction1."<init>":()V
21: return
That is, it checks to make sure that the input is non-null (i.e. the outer class is non-null) and saves it in a field. Now when it comes time to apply it, after the boxing/unboxing wrapper:
public final int apply(int);
Code:
0: iload_1
1: aload_0
2: getfield #18; //Field $outer:LClose;
5: invokevirtual #24; //Method Close.n:()I
8: iadd
9: ireturn
you see that it uses that field to refer to the parent class, and invokes the getter for n. Add, return, done. So, closures are easy enough: the anonymous function constructor just saves the enclosing class in a private field.
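One visible consequence of storing the outer reference and calling the getter on every apply is that the function sees later changes to n. A minimal sketch, assuming a cut-down copy of the class (Holder and OuterCaptureDemo are illustrative names, not from the example above):

```scala
// Cut-down stand-in for Close with only the pieces needed here.
class Holder {
  var n = 5
  def closure: Int => Int = (i: Int) => i + n
}

object OuterCaptureDemo {
  def main(args: Array[String]): Unit = {
    val h = new Holder
    val f = h.closure   // f keeps a reference to h, not a copy of n
    println(f(1))       // reads n through h at call time
    h.n = 10
    println(f(1))       // picks up the new value of n
  }
}
```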
Now, what if we close over a method argument rather than an internal variable? That's what Close$$anonfun$mixed$1 does. First, look at what the mixed method does:
public scala.Function1 mixed(int);
Code:
0: new #39; //class Close$$anonfun$mixed$1
3: dup
4: aload_0
5: iload_1
6: invokespecial #42; //Method Close$$anonfun$mixed$1."<init>":(LClose;I)V
9: areturn
It loads the parameter m before calling the constructor! So it's no surprise that the constructor looks like this:
public Close$$anonfun$mixed$1(Close, int);
Code:
0: aload_0
1: iload_2
2: putfield #18; //Field m$1:I
5: aload_0
6: invokespecial #43; //Method scala/runtime/AbstractFunction1."<init>":()V
9: return
where that parameter is saved in a private field. No reference to the outer class is kept because we don't need it. And you ought not be surprised by apply either:
public final int apply(int);
Code:
0: iload_1
1: aload_0
2: getfield #18; //Field m$1:I
5: iadd
6: ireturn
Yes, we just load that stored field and do our math.
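Written out by hand, the class generated for mixed would look roughly like the sketch below. MixedDemo and MixedFn are hypothetical names of my own; the real generated class is named by the compiler as shown above, and its exact shape depends on the compiler version.

```scala
object MixedDemo {
  // What the source says.
  def mixed(m: Int): Int => Int = (i: Int) => i + m

  // Hand-written approximation: the captured argument becomes a
  // constructor parameter kept in a private field, like m$1 above.
  final class MixedFn(m: Int) extends (Int => Int) {
    def apply(i: Int): Int = i + m
  }

  def main(args: Array[String]): Unit = {
    val viaLiteral = mixed(5)
    val viaClass = new MixedFn(5)
    println(viaLiteral(20) == viaClass(20))
  }
}
```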
I'm not sure what you were doing to not see this with your example. Objects are a little tricky because they have both MyObject and MyObject$ classes, and the methods get split between the two in a way that may not be intuitive. But apply definitely applies things, and overall the whole system works pretty much the way you'd expect it to (after you sit down and think about it really hard for a really long time).
Unlike Java's anonymous inner classes, which are pseudo-closures and cannot modify the variables that appear to be closed over from their environment, Scala's closures are real, so the closure code directly references the values in the surrounding environment. Such values are compiled differently when they are referenced from a closure in order to make this possible (since there's no way for method code to access locals from any activation frames other than the current one).
In contrast, in Java their values are copied to fields in the inner class, which is why the language requires the original values in the enclosing environment to be final, so they can never diverge.
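As a sketch of what "compiled differently" can mean for a mutable local: as far as I know, scalac moves a captured var into a small heap-allocated holder (scala.runtime.IntRef for an Int), so the method and the closure share one cell. The IntCell class and the other names below are hypothetical stand-ins written by hand to mimic that, not the compiler's actual output.

```scala
object CapturedVarDemo {
  // A closure that modifies a local var of its enclosing method.
  def counter(): () => Int = {
    var count = 0
    () => { count += 1; count }
  }

  // Hypothetical stand-in for the holder the compiler would use.
  final class IntCell(var elem: Int)

  // Hand-written approximation: the local lives in a shared cell,
  // and the closure only captures the (immutable) reference to it.
  def counterByHand(): () => Int = {
    val cell = new IntCell(0)
    () => { cell.elem += 1; cell.elem }
  }

  def main(args: Array[String]): Unit = {
    val next = counter()
    println(next())
    println(next())
  }
}
```

Java's final rule avoids the need for such a cell by forbidding the mutation in the first place.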
Because all the Scala function literal's / closure's references to values in the enclosing environment are in the code of the function literal's apply() method, they don't appear as fields in the actual Function subclass generated for the function literal.
I don't know how you're decompiling, but the details of how you did so probably explain why you're not seeing any code for the body of the apply() method.