How are Scala closures transformed to Java objects?

Posted 2019-03-10 04:17

Question:

I'm currently looking at closure implementations in different languages. When it comes to Scala, however, I'm unable to find any documentation on how a closure is mapped to Java objects.

It is well documented that Scala functions are mapped to FunctionN objects. I assume that the reference to the free variable of the closure must be stored somewhere in that function object (as it is done in C++0x, e.g.).

I also tried compiling the following with scalac and then decompiling the class files with JD:

object ClosureExample extends Application { 
  def addN(n: Int) = (a: Int) => a + n
  var add5 = addN(5)
  println(add5(20))
}

In the decompiled sources, I see an anonymous subtype of Function1, which ought to be my closure. But the apply() method is empty, and the anonymous class has no fields (which could potentially store the closure variables). I suppose the decompiler didn't manage to get the interesting part out of the class files...

Now to the questions:

  • Do you know how the transformation is done exactly?
  • Do you know where it is documented?
  • Do you have another idea how I could solve the mystery?

Answer 1:

Let's take apart a set of examples so we can see how they differ. (If using RC1, compile with -no-specialization to keep things easier to understand.)

class Close {
  var n = 5
  def method(i: Int) = i+n
  def function = (i: Int) => i+5
  def closure = (i: Int) => i+n
  def mixed(m: Int) = (i: Int) => i+m
}

First, let's see what method does:

public int method(int);
  Code:
   0:   iload_1
   1:   aload_0
   2:   invokevirtual   #17; //Method n:()I
   5:   iadd
   6:   ireturn

Pretty straightforward. It's a method. Load the parameter, invoke the getter for n, add, return. Looks just like Java.

How about function? It doesn't actually close any data, but it is an anonymous function (called Close$$anonfun$function$1). If we ignore any specialization, the constructor and apply are of most interest:

public scala.Function1 function();
  Code:
   0:   new #34; //class Close$$anonfun$function$1
   3:   dup
   4:   aload_0
   5:   invokespecial   #35; //Method Close$$anonfun$function$1."<init>":(LClose;)V
   8:   areturn

public Close$$anonfun$function$1(Close);
  Code:
   0:   aload_0
   1:   invokespecial   #43; //Method scala/runtime/AbstractFunction1."<init>":()V
   4:   return

public final java.lang.Object apply(java.lang.Object);
  Code:
   0:   aload_0
   1:   aload_1
   2:   invokestatic    #26; //Method scala/runtime/BoxesRunTime.unboxToInt:(Ljava/lang/Object;)I
   5:   invokevirtual   #28; //Method apply:(I)I
   8:   invokestatic    #32; //Method scala/runtime/BoxesRunTime.boxToInteger:(I)Ljava/lang/Integer;
   11:  areturn

public final int apply(int);
  Code:
   0:   iload_1
   1:   iconst_5
   2:   iadd
   3:   ireturn

So, you load a "this" pointer and create a new object that takes the enclosing class as its argument. This is standard for any inner class, really. The function doesn't need to do anything with the outer class so it just calls the super's constructor. Then, when calling apply, you do the box/unbox tricks and then call the actual math--that is, just add 5.
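To make the two apply methods concrete, here is a rough hand-written stand-in for that generated class. The names AddFive, BridgeDemo and callErased are made up for illustration, and this is only a sketch of the shape of the generated code, not what scalac literally emits.

```scala
// Hand-written stand-in for the generated Close$$anonfun$function$1.
// The name AddFive is hypothetical; the compiler picks its own names.
final class AddFive extends (Int => Int) {
  // Primitive entry point: corresponds to apply(int) in the listing above.
  def apply(i: Int): Int = i + 5
}

object BridgeDemo {
  // Viewing the function through its erased type makes the call go through
  // apply(Object), which unboxes the argument and boxes the result.
  def callErased(f: Int => Int, arg: Any): Any =
    f.asInstanceOf[Any => Any].apply(arg)
}
```

Calling BridgeDemo.callErased(new AddFive, 20) takes the boxed route, while new AddFive().apply(20) calls the primitive method directly; both add 5.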

But what if we close over the variable n inside Close? Setup is exactly the same, but now the constructor Close$$anonfun$closure$1 looks like this:

public Close$$anonfun$closure$1(Close);
  Code:
   0:   aload_1
   1:   ifnonnull   12
   4:   new #48; //class java/lang/NullPointerException
   7:   dup
   8:   invokespecial   #50; //Method java/lang/NullPointerException."<init>":()V
   11:  athrow
   12:  aload_0
   13:  aload_1
   14:  putfield    #18; //Field $outer:LClose;
   17:  aload_0
   18:  invokespecial   #53; //Method scala/runtime/AbstractFunction1."<init>":()V
   21:  return

That is, it checks to make sure that the input is non-null (i.e. the outer class is non-null) and saves it in a field. Now when it comes time to apply it, after the boxing/unboxing wrapper:

public final int apply(int);
  Code:
   0:   iload_1
   1:   aload_0
   2:   getfield    #18; //Field $outer:LClose;
   5:   invokevirtual   #24; //Method Close.n:()I
   8:   iadd
   9:   ireturn

you see that it uses that field to refer to the parent class, and invokes the getter for n. Add, return, done. So, closures are easy enough: the anonymous function constructor just saves the enclosing class in a private field.
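In source form, the same idea can be sketched by hand. ClosureByHand and OuterCaptureDemo are hypothetical names, and the class only approximates the generated Close$$anonfun$closure$1: it keeps a reference to the enclosing instance and reads n through its getter each time apply runs.

```scala
class Close {
  var n = 5
  def closure: Int => Int = (i: Int) => i + n
}

// Hand-written approximation of the generated closure class (hypothetical name).
final class ClosureByHand(outer: Close) extends (Int => Int) {
  // Mirrors the null check in the generated constructor.
  if (outer == null) throw new NullPointerException
  // Reads n through the stored outer reference on every call.
  def apply(i: Int): Int = i + outer.n
}

object OuterCaptureDemo {
  def run(): (Int, Int) = {
    val c = new Close
    val f = c.closure            // compiler-generated closure
    val g = new ClosureByHand(c) // hand-written equivalent
    c.n = 10                     // change the field after both were created
    (f(1), g(1))                 // both read the updated n
  }
}
```

Because only the outer reference is stored, not a copy of n, both functions see the new value of n after the assignment.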

Now, what if we close over a method argument rather than a field of the class? That's what Close$$anonfun$mixed$1 does. First, look at what the mixed method does:

public scala.Function1 mixed(int);
  Code:
   0:   new #39; //class Close$$anonfun$mixed$1
   3:   dup
   4:   aload_0
   5:   iload_1
   6:   invokespecial   #42; //Method Close$$anonfun$mixed$1."<init>":(LClose;I)V
   9:   areturn

It loads the parameter m before calling the constructor! So it's no surprise that the constructor looks like this:

public Close$$anonfun$mixed$1(Close, int);
  Code:
   0:   aload_0
   1:   iload_2
   2:   putfield    #18; //Field m$1:I
   5:   aload_0
   6:   invokespecial   #43; //Method scala/runtime/AbstractFunction1."<init>":()V
   9:   return

where that parameter is saved in a private field. No reference to the outer class is kept because we don't need it. And you ought not be surprised by apply either:

public final int apply(int);
  Code:
   0:   iload_1
   1:   aload_0
   2:   getfield    #18; //Field m$1:I
   5:   iadd
   6:   ireturn

Yes, we just load that stored field and do our math.
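This is also roughly what happens to addN from the question. As a sketch, with AddM, ParamCaptureDemo and addNByHand being made-up names, the captured parameter becomes a constructor argument that is stored in a field of the function object:

```scala
// Hand-written approximation of a closure over a method parameter.
// m plays the role of the m$1 field in the listing above.
final class AddM(m: Int) extends (Int => Int) {
  def apply(i: Int): Int = i + m
}

object ParamCaptureDemo {
  // The function literal from the question.
  def addN(n: Int): Int => Int = (a: Int) => a + n

  // The same thing spelled out: pass the parameter to the constructor.
  def addNByHand(n: Int): Int => Int = new AddM(n)
}
```

Both ParamCaptureDemo.addN(5) and ParamCaptureDemo.addNByHand(5) return a function that adds 5 to its argument.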

I'm not sure what you were doing to not see this with your example--objects are a little tricky because they have both MyObject and MyObject$ classes and the methods get split between the two in a way that may not be intuitive. But apply definitely applies things, and overall the whole system works pretty much the way you'd expect it to (after you sit down and think about it really hard for a really long time).



Answer 2:

Unlike Java's anonymous inner classes, which are pseudo-closures and cannot modify the variables that appear to be closed over in their environment, Scala's closures are real, so the closure code directly references the values in the surrounding environment. Such values are compiled differently when they are referenced from a closure in order to make this possible (since there's no way for method code to access locals from any activation frames other than the current one).

In contrast, in Java their values are copied to fields in the inner class, which is why the language requires the original values in the enclosing environment to be final, so they can never diverge.
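As an illustration of "compiled differently": as far as I know, a local var that is captured by a closure gets moved into a small heap cell such as scala.runtime.IntRef, and both the method and the closure go through that cell. The sketch below writes that rewrite out by hand; LocalVarCapture, counter and counterByHand are hypothetical names.

```scala
object LocalVarCapture {
  // A closure that modifies a local var of its enclosing method.
  def counter(): () => Int = {
    var count = 0
    () => { count += 1; count }
  }

  // Roughly the same thing with the heap cell made explicit.
  // Assumption: this mirrors what the compiler does for captured local vars.
  def counterByHand(): () => Int = {
    val count = new scala.runtime.IntRef(0)
    () => { count.elem += 1; count.elem }
  }
}
```

Each call to counter() or counterByHand() yields a function with its own cell, so successive calls to that function return 1, 2, 3 and so on.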

Because all the Scala function literal's / closure's references to values in the enclosing environment are in the code of the function literal's apply() method, they don't appear as fields in the actual Function subclass generated for the function literal.

I don't know how you're decompiling, but the details of how you did so probably explain why you're not seeing any code for the body of the apply() method.