Scala: arrays and type erasure

Posted 2020-08-14 10:10

Question:

I'd like to write overloaded functions as follows:

case class A[T](t: T)
def f[T](t: T) = println("normal type")
def f[T](a: A[T]) = println("A type")

And the result is as I expected:

f(5)       => normal type
f(A(5))  => A type

So far so good. But the problem is the same thing doesn't work for Arrays:

def f[T](t: T) = println("normal type")
def f[T](a: Array[T]) = println("Array type")

Now the compiler complains:

double definition: method f:[T](t: Array[T])Unit and method f:[T](t: T)Unit at line 14 have same type after erasure: (t: java.lang.Object)Unit

I think the signature of the second function after type erasure should be (a: Array[Object])Unit not (t: Object)Unit, so they shouldn't collide with each other. What am I missing here?

And if I'm doing something wrong, what would be the right way to write f's so that the right one will get called according to the type of the argument?

Answer 1:

This is never an issue in Java, because Java does not support primitive types in generics. Thus, the following code is perfectly legal in Java:

public static <T> void f(T t) { System.out.println("normal type"); }
public static <T> void f(T[] a) { System.out.println("Array type"); }

Scala, on the other hand, supports generics for all types. Although the Scala language has no primitives, the resulting bytecode uses them for types like Int, Float, Char and Boolean. This is what makes the Java code and the Scala code differ. The Java code does not accept an int[] where a T[] is expected, because int is not a java.lang.Object. So Java can erase these method parameter types to Object and Object[]. (That is Ljava/lang/Object; and [Ljava/lang/Object; at the JVM level.)

Your Scala code, by contrast, handles all arrays, including Array[Int], Array[Float], Array[Char], Array[Boolean] and so on. These arrays are (or can be) arrays of primitive types. They cannot be cast to Array[Object] or Array[anything else] at the JVM level. The only common superclass of Array[Int] and Array[Char] is java.lang.Object, which is a more general supertype than you might wish for.
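The claim about primitive arrays can be checked at runtime. The following is a minimal sketch rather than part of the original answer; the object name ArrayRuntimeCheck and its two helpers are made up for illustration.

```scala
// Hypothetical helper object, for illustration only.
object ArrayRuntimeCheck {
  // JVM-level class name of a value, e.g. "[I" for int[].
  def jvmClassName(x: Any): String = x.getClass.getName

  // True only for arrays of references (Object[] and its subtypes on the JVM).
  def isReferenceArray(x: Any): Boolean = x.isInstanceOf[Array[AnyRef]]

  def main(args: Array[String]): Unit = {
    println(jvmClassName(Array(1, 2, 3)))       // [I
    println(jvmClassName(Array("a", "b")))      // [Ljava.lang.String;
    println(isReferenceArray(Array("a", "b")))  // true
    println(isReferenceArray(Array(1, 2, 3)))   // false
  }
}
```

An Array[Int] is an int[] on the JVM, so it is not an Object[], while an Array[String] is.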

To support these statements, here is a version with a less generic method f:

def f[T](t: T) = println("normal type")
def f[T <: AnyRef](a: Array[T]) = println("Array type")

This variant works like the Java code, which means that arrays of primitives are not supported. But this small change is enough to make it compile. The following code, on the other hand, cannot be compiled, again because of type erasure:

def f[T](t: T) = println("normal type")
def f[T <: AnyVal](a: Array[T]) = println("Array type")

Adding @specialized does not solve the problem, because the generic method is still generated:

def f[T](t: T) = println("normal type")
def f[@specialized T <: AnyVal](a: Array[T]) = println("Array type")

I had hoped that @specialized might solve the problem (at least in some cases), but the compiler does not support that at the moment. I doubt it would be a high-priority enhancement for scalac.



Answer 2:

I think the signature of the second function after type erasure should be (a: Array[Object])Unit not (t: Object)Unit, so they shouldn't collide with each other. What am I missing here?

Erasure means precisely that you lose all information about the type parameters of a generic class and keep only the raw type. So the signature of def f[T](a: Array[T]) cannot be def f[T](a: Array[Object]), because that would still carry a type parameter (Object). As a rule of thumb, you just drop the type parameters to get the erased type, which would give us def f[T](a: Array). This would work for all other generic classes, but arrays are special on the JVM, and in particular their erasure is simply Object (there is no raw array type). And thus the signature of f after erasure is indeed def f[T](a: Object).

[Update: I was wrong.] After checking the Java spec, it appears that I was completely wrong here. The spec says:

The erasure of an array type T[] is |T|[]

where |T| is the erasure of T. So arrays are indeed treated specially, but the peculiar thing is that, while the type parameters are removed, the type is marked as being an array of T instead of just T. This means that Array[Int] is still Array[Int] after erasure. But Array[T] is different: T is a type parameter of the generic method f. To be able to treat any kind of array generically, Scala has no choice but to turn Array[T] into Object (and I suppose Java does just the same, by the way). This is because, as I said above, there is no such thing as a raw type Array, so it has to be Object.

I'll try to put it another way. Normally, when compiling a generic method with a parameter of type MyGenericClass[T], the mere fact that the erased type is MyGenericClass makes it possible (at the JVM level) to pass any instantiation of MyGenericClass, such as MyGenericClass[Int] and MyGenericClass[Float], because they are all the same at runtime. However, this is not true for arrays: Array[Int] is completely unrelated to Array[Float], and they do not erase to a common raw Array type. Their least common type is Object, and so this is what is manipulated under the hood when arrays are treated generically (every time the compiler cannot know the element type statically).
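The contrast between generic classes and arrays can be observed by comparing runtime class names. This is only an illustrative sketch; the object ErasedClassNames and its members are hypothetical names, and List stands in for MyGenericClass.

```scala
// Hypothetical names, for illustration only.
object ErasedClassNames {
  // JVM-level class name of a value.
  def jvmClassName(x: Any): String = x.getClass.getName

  // A generic class has one runtime class whatever its element type.
  val listsShareClass: Boolean =
    jvmClassName(List(1)) == jvmClassName(List("a"))

  // Arrays with different primitive element types are distinct JVM classes.
  val arraysShareClass: Boolean =
    jvmClassName(Array(1)) == jvmClassName(Array(1.0f))

  def main(args: Array[String]): Unit = {
    println(listsShareClass)   // true
    println(arraysShareClass)  // false
  }
}
```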

UPDATE 2: v6ak's answer added a useful bit of information: Java does not support primitive types in generics. So in Array[T], T is necessarily (in Java, but not in Scala) a subclass of Object, and thus its erasure to Array[Object] makes perfect sense. In Scala, by contrast, T can for example be the primitive type Int, which is definitely not a subclass of Object (aka AnyRef). To be in the same situation as Java, we can constrain T with an upper bound, and sure enough, it now compiles fine:

def f[T](t: T) = println("normal type")
def f[T<:AnyRef](a: Array[T]) = println("Array type") // no conflict anymore
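One way to see why the bounded version no longer collides is to read the compiled parameter types back through Java reflection. This is a sketch under assumptions: the object name BoundedOverloads and the helper erasedParameterTypes are invented here, and the methods return strings instead of printing so the result is easy to inspect.

```scala
// Hypothetical names, for illustration only.
object BoundedOverloads {
  def f[T](t: T): String = "normal type"
  def f[T <: AnyRef](a: Array[T]): String = "Array type"

  // JVM parameter types of the two compiled f methods, read via reflection.
  def erasedParameterTypes: Set[String] =
    getClass.getDeclaredMethods
      .filter(_.getName == "f")
      .map(_.getParameterTypes.head.getName)
      .toSet

  def main(args: Array[String]): Unit = {
    println(f(5))           // normal type
    println(f(Array("a")))  // Array type
    println(erasedParameterTypes)
  }
}
```

With the AnyRef bound, the array overload should erase to an Object[] parameter ("[Ljava.lang.Object;") and the other one to Object, so the two signatures differ.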

As for how to work around the problem, a common solution is to add a dummy parameter. Because you certainly don't want to pass a dummy value explicitly on each call, you can either give it a default value or use an implicit parameter that the compiler will always find (such as DummyImplicit, defined in Predef):

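def f[T](a: Array[T], dummy: Int = 0) = println("Array type")
// or:
def f[T](a: Array[T])(implicit dummy: DummyImplicit) = println("Array type")
// or (ClassManifest was superseded by ClassTag in Scala 2.10):
def f[T:ClassManifest](a: Array[T]) = println("Array type")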
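Put together, the DummyImplicit variant might look like the following minimal sketch. The object name DummyImplicitOverloads is hypothetical, and the methods return strings rather than printing, purely so the result can be checked.

```scala
// Hypothetical wrapper object, for illustration only.
object DummyImplicitOverloads {
  def f[T](t: T): String = "normal type"

  // The extra implicit parameter list changes the erased signature to
  // (Object, DummyImplicit), so it no longer collides with f(Object).
  def f[T](a: Array[T])(implicit dummy: DummyImplicit): String = "Array type"

  def main(args: Array[String]): Unit = {
    println(f(5))                // normal type
    println(f(Array(1, 2, 3)))   // Array type
    println(f(Array("a", "b")))  // Array type
  }
}
```

Unlike the AnyRef-bounded version, this one also selects the array overload for arrays of primitives.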


Answer 3:

[Scala 2.9] A solution is to use implicit arguments which naturally modify the signature of the methods such that they don't conflict.

case class A()

def f[T](t: T) = println("normal type")
def f[T : Manifest](a: Array[T]) = println("Array type")

f(A())        // normal type
f(Array(A())) // Array type

T : Manifest is syntactic sugar for a second argument list (implicit mf: Manifest[T]).
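The equivalence between a context bound and an explicit implicit parameter list can be sketched as follows. This example is an assumption-laden illustration: it uses ClassTag (the Scala 2.10+ counterpart for runtime class information) instead of Manifest, and the names ContextBoundSugar, sugared and desugared are invented.

```scala
import scala.reflect.ClassTag

// Hypothetical names, for illustration only.
object ContextBoundSugar {
  // Context-bound form: the implicit ClassTag[T] is passed invisibly.
  def sugared[T: ClassTag](a: Array[T]): String =
    implicitly[ClassTag[T]].runtimeClass.getName

  // Equivalent form with the second argument list written out.
  def desugared[T](a: Array[T])(implicit tag: ClassTag[T]): String =
    tag.runtimeClass.getName

  def main(args: Array[String]): Unit = {
    println(sugared(Array(1, 2, 3)))    // int
    println(desugared(Array(1, 2, 3)))  // int
  }
}
```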

Unfortunately, I don't know why Array[T] would be erased to just Object instead of Array[Object].



Answer 4:

To get around type erasure in Scala, you can add an implicit parameter that gives you the Manifest (Scala 2.9.*) or TypeTag (Scala 2.10), and then you can get all the information you want about the types, as in:

def f[T](t: T)(implicit manifest: Manifest[T])

You can then check whether manifest describes an Array, and so on.
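A single method that inspects the implicit type information might look like the sketch below. It is not the answerer's code: it assumes ClassTag in place of Manifest or TypeTag, and the names TagDispatch and describe are hypothetical.

```scala
import scala.reflect.ClassTag

// Hypothetical names, for illustration only.
object TagDispatch {
  // One method instead of two overloads: the ClassTag tells us whether the
  // statically inferred T is an array type.
  def describe[T](t: T)(implicit tag: ClassTag[T]): String =
    if (tag.runtimeClass.isArray) "Array type" else "normal type"

  def main(args: Array[String]): Unit = {
    println(describe(5))               // normal type
    println(describe(Array(1, 2, 3)))  // Array type
    println(describe(Array("a")))      // Array type
  }
}
```

Note that the decision is based on the static type inferred for T, so an array passed as a value of type Any would be reported as "normal type".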