Generic Variance in C# 4.0

Posted 2019-01-17 04:57

Question:

Generic Variance in C# 4.0 has been implemented in such a way that it's possible to write the following without an exception (which is what would happen in C# 3.0):

 List<int> intList = new List<int>();
 List<object> objectList = intList; 

[Example non-functional: See Jon Skeet's answer]

I recently attended a conference where Jon Skeet gave an excellent overview of Generic Variance, but I'm not sure I'm completely getting it. I understand the significance of the in and out keywords when it comes to contravariance and covariance, but I'm curious about what happens behind the scenes.

What does the CLR see when this code is executed? Is it implicitly converting the List<int> to List<object>, or is it simply built in that we can now convert from derived types to parent types?

Out of interest, why wasn't this introduced in previous versions, and what's the main benefit, i.e. real-world usage?

More info on this post for Generic Variance (but question is extremely outdated, looking for real, up-to-date information)

Answer 1:

No, your example wouldn't work for three reasons:

  • Classes (such as List<T>) are invariant; only delegates and interfaces are variant
  • For variance to work, the interface has to only use the type parameter in one direction (in for contravariance, out for covariance)
  • Value types aren't supported as type arguments for variance - so there's no conversion from IEnumerable<int> to IEnumerable<object>, for example

(The code fails to compile in both C# 3.0 and 4.0 - there's no exception.)

So this would work:

IEnumerable<string> strings = new List<string>();
IEnumerable<object> objects = strings;

The CLR just uses the reference, unchanged - no new objects are created. So if you called objects.GetType() you'd still get List<string>.
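To see the "same reference, unchanged" point in action, here is a minimal sketch. The class and method names (ReferenceIdentityDemo, SameReference, RuntimeType) are made up for illustration; only the two assignment lines come from the answer above.

```csharp
using System;
using System.Collections.Generic;

public static class ReferenceIdentityDemo
{
    // True when both variables point at the very same object (no copy was made).
    public static bool SameReference()
    {
        IEnumerable<string> strings = new List<string> { "a", "b" };
        IEnumerable<object> objects = strings; // covariant conversion
        return ReferenceEquals(strings, objects);
    }

    // The runtime type seen through the IEnumerable<object> variable.
    public static Type RuntimeType()
    {
        IEnumerable<string> strings = new List<string>();
        IEnumerable<object> objects = strings;
        return objects.GetType();
    }

    public static void Main()
    {
        Console.WriteLine(SameReference());                        // True
        Console.WriteLine(RuntimeType() == typeof(List<string>));  // True
    }
}
```

Both checks should come out true: the conversion only changes the static type the compiler works with, not the object itself.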

I believe it wasn't introduced earlier because the language designers still had to work out the details of how to expose it - it's been in the CLR since v2.

The benefits are the same as other times where you want to be able to use one type as another. To use the same example I used last Saturday, if you've got something that implements IComparer<Shape> to compare shapes by area, it's crazy that you can't use that to sort a List<Circle> - if it can compare any two shapes, it can certainly compare any two circles. As of C# 4, there'd be a contravariant conversion from IComparer<Shape> to IComparer<Circle> so you could call circles.Sort(areaComparer).
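A rough sketch of that comparer scenario follows. The Shape, Circle and AreaComparer types are assumptions invented for this example (the answer only names them), and it uses newer C# syntax for brevity; the variance conversion itself needs C# 4 / .NET 4 or later.

```csharp
using System;
using System.Collections.Generic;

public abstract class Shape
{
    public abstract double Area { get; }
}

public sealed class Circle : Shape
{
    public double Radius { get; }
    public Circle(double radius) { Radius = radius; }
    public override double Area => Math.PI * Radius * Radius;
}

// Compares any two shapes by area; it knows nothing about circles specifically.
public sealed class AreaComparer : IComparer<Shape>
{
    public int Compare(Shape x, Shape y) => x.Area.CompareTo(y.Area);
}

public static class ComparerDemo
{
    // Sorts circles with a Shape comparer and returns the radii in sorted order.
    public static List<double> SortedRadii()
    {
        var circles = new List<Circle> { new Circle(3), new Circle(1), new Circle(2) };
        IComparer<Shape> areaComparer = new AreaComparer();

        // List<Circle>.Sort wants an IComparer<Circle>; the contravariant
        // conversion from IComparer<Shape> is applied implicitly here.
        circles.Sort(areaComparer);

        return circles.ConvertAll(c => c.Radius);
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(", ", SortedRadii()));
    }
}
```

Since area grows with radius, the circles should end up ordered by radius 1, 2, 3.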



Answer 2:

A few additional thoughts.

What does the CLR see when this code is executed

As Jon and others have correctly noted, we are not doing variance on classes, only interfaces and delegates. So in your example, the CLR sees nothing; that code doesn't compile. If you force it to compile by inserting enough casts, it crashes at runtime with a bad cast exception.
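As a small illustration of "inserting enough casts", the sketch below routes the list through object so the compiler accepts the cast, then catches the failure at runtime. The names ForcedCastDemo and ForcedCastThrows are hypothetical.

```csharp
using System;
using System.Collections.Generic;

public static class ForcedCastDemo
{
    // Returns true if forcing List<int> into List<object> fails at runtime.
    public static bool ForcedCastThrows()
    {
        List<int> intList = new List<int>();
        try
        {
            // Compiles only because of the intermediate cast to object.
            List<object> objectList = (List<object>)(object)intList;
            return objectList == null; // not reached: the cast above throws
        }
        catch (InvalidCastException)
        {
            return true;
        }
    }

    public static void Main()
    {
        Console.WriteLine(ForcedCastThrows()); // True
    }
}
```

The "bad cast exception" here is System.InvalidCastException: a List<int> simply is not a List<object>, whatever the compiler was talked into.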

Now, it's still a reasonable question to ask how variance works behind the scenes when it does work. The answer is: the reason we are restricting this to reference type arguments that parameterize interface and delegate types is so that nothing happens behind the scenes. When you say

object x = "hello";

what happens behind the scenes is the reference to the string is stuck into the variable of type object without modification. The bits that make up a reference to a string are legal bits to be a reference to an object, so nothing needs to happen here. The CLR simply stops thinking of those bits as referring to a string and starts thinking of them as referring to an object.

When you say:

IEnumerator<string> e1 = whatever;
IEnumerator<object> e2 = e1;

Same thing. Nothing happens. The bits that make a ref to a string enumerator are the same as the bits that make a reference to an object enumerator. There is somewhat more magic that comes into play when you do a cast, say:

IEnumerator<string> e1 = whatever;
IEnumerator<object> e2 = (IEnumerator<object>)(object)e1;

Now the CLR must generate a check that e1 actually does implement that interface, and that check has to be smart about recognizing variance.
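Here is one way to watch that runtime check at work, as a sketch rather than anything official. The iterator methods Words and Numbers are invented helpers that stand in for "whatever"; the type test succeeds for the string enumerator and fails for the int one, since value types don't take part in variance.

```csharp
using System;
using System.Collections.Generic;

public static class VariantCastDemo
{
    // Hypothetical sources of enumerators, implemented as iterator methods.
    static IEnumerator<string> Words() { yield return "hello"; }
    static IEnumerator<int> Numbers() { yield return 42; }

    // Casts via object, so the check has to happen at runtime.
    public static object FirstViaObjectEnumerator()
    {
        IEnumerator<string> e1 = Words();
        IEnumerator<object> e2 = (IEnumerator<object>)(object)e1;
        e2.MoveNext();
        return e2.Current; // still the original string reference
    }

    // An IEnumerator<int> is not an IEnumerator<object>.
    public static bool IntEnumeratorIsObjectEnumerator()
    {
        object e1 = Numbers();
        return e1 is IEnumerator<object>;
    }

    public static void Main()
    {
        Console.WriteLine(FirstViaObjectEnumerator());        // hello
        Console.WriteLine(IntEnumeratorIsObjectEnumerator()); // False
    }
}
```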

But the reason we can get away with variant interfaces being just no-op conversions is because regular assignment compatibility is that way. What are you going to use e2 for?

object z = e2.Current;

That returns bits that are a reference to a string. We've already established that those are compatible with object without change.

Why wasn't this introduced earlier? We had other features to do and a limited budget.

What's the principal benefit? That conversions from sequence of string to sequence of object "just work".



Answer 3:

Out of interest, why wasn't this introduced in previous versions

The first versions (1.x) of .NET didn't have generics at all, so generic variance was far off.

It should be noted that in all versions of .NET, there is array covariance. Unfortunately, it's unsafe covariance:

Apple[] apples = new [] { apple1, apple2 };
Fruit[] fruit = apples;
fruit[1] = new Orange(); // Oh snap! ArrayTypeMismatchException at runtime! Can't store an orange in an array of apples!

The co- and contra-variance in C# 4 is safe, and prevents this problem.
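To contrast the two, here is a minimal sketch with stand-in Fruit, Apple and Orange classes (empty placeholders, not from any real library). The array store fails at runtime, while the IEnumerable<Fruit> view offers no member that could put an orange in at all.

```csharp
using System;
using System.Collections.Generic;

public class Fruit { }
public class Apple : Fruit { }
public class Orange : Fruit { }

public static class ArrayVsGenericDemo
{
    // Array covariance: compiles, but the store is checked (and rejected) at runtime.
    public static bool ArrayStoreThrows()
    {
        Apple[] apples = { new Apple(), new Apple() };
        Fruit[] fruit = apples;
        try
        {
            fruit[1] = new Orange();
            return false;
        }
        catch (ArrayTypeMismatchException)
        {
            return true;
        }
    }

    // Generic covariance: IEnumerable<out T> is read-only, so there is nothing to get wrong.
    public static int CountThroughCovariantView()
    {
        IEnumerable<Apple> apples = new List<Apple> { new Apple(), new Apple() };
        IEnumerable<Fruit> fruit = apples;
        int count = 0;
        foreach (Fruit f in fruit) count++;
        return count;
    }

    public static void Main()
    {
        Console.WriteLine(ArrayStoreThrows());          // True
        Console.WriteLine(CountThroughCovariantView()); // 2
    }
}
```

The safety comes from the out annotation: T only ever flows out of IEnumerable<T>, so the compiler can allow the conversion without any runtime store check.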

what's the main benefit, i.e. real-world usage?

Many times in code, you are calling an API that expects an amplified type of Base (e.g. IEnumerable<Base>) but all you've got is an amplified type of Derived (e.g. IEnumerable<Derived>).

In C# 2 and C# 3, you'd need to manually convert to IEnumerable<Base>, even though it should "just work". Co- and contra-variance make it "just work".
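For a side-by-side feel, the sketch below shows one possible manual conversion (LINQ's Cast<T>(), which was available in C# 3) next to the direct call that C# 4 allows. CountItems is a made-up stand-in for "an API that expects IEnumerable<Base>", and Base and Derived are placeholder classes.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Base { }
public class Derived : Base { }

public static class JustWorksDemo
{
    // Hypothetical API that only accepts a sequence of Base.
    public static int CountItems(IEnumerable<Base> items)
    {
        return items.Count();
    }

    // Pre-C# 4 style: convert each element explicitly.
    public static int ManualConversion(List<Derived> source)
    {
        return CountItems(source.Cast<Base>());
    }

    // C# 4 style: IEnumerable<Derived> converts to IEnumerable<Base> implicitly.
    public static int DirectCall(List<Derived> source)
    {
        return CountItems(source);
    }

    public static void Main()
    {
        var items = new List<Derived> { new Derived(), new Derived() };
        Console.WriteLine(ManualConversion(items)); // 2
        Console.WriteLine(DirectCall(items));       // 2
    }
}
```

Note that Cast<Base>() wraps the list in a new lazy sequence, whereas the direct call passes the original list reference straight through.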

p.s. Totally sucks that Skeet's answer is eating all my rep points. Damn you, Skeet! :-) Looks like he's answered this before, though.