As I'm not exactly an expert on programming languages, I'm well aware this may be a stupid question, but as best as I can tell C# handles anonymous methods and closures by making them into instance methods of an anonymous nested class [1], instantiating this class, and then pointing delegates at those instance methods.
It appears that this anonymous class can only ever be instantiated once (or am I wrong about that?), so why not have the anonymous class be static instead?
[1] Actually, it looks like there's one class for closures and one for anonymous methods that don't capture any variables, which I don't entirely understand the rationale for either.
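> I'm well aware this may be a stupid question

It's not.

> C# handles anonymous methods and closures by making them into instance methods of an anonymous nested class, instantiating this class, and then pointing delegates at those instance methods.

C# does that sometimes.

> It appears that this anonymous class can only ever be instantiated once (or am I wrong about that?), so why not have the anonymous class be static instead?

In cases where that would be legal, C# does you one better. It doesn't make a closure class at all. It makes the anonymous function a static function of the current class.

And yes, you are wrong about that. In cases where you can get away with only allocating the delegate once, C# does get away with it.

(This is not strictly speaking entirely true; there are some obscure cases where this optimization is not implemented. But for the most part it is.)

> Actually, it looks like there's one class for closures and one for anonymous methods that don't capture any variables, which I don't entirely understand the rationale for either.

You have put your finger on the thing you don't adequately understand.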
Let's look at some examples:
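A minimal sketch of the simplest case, a lambda that captures nothing. The names `C1` and `M` are assumptions, chosen to line up with the `C3` and `C4<T>` that appear later in this answer.

```csharp
using System;

class C1
{
    // The lambda uses only its own parameters: no locals, no 'this'.
    public Func<int, int, int> M()
    {
        return (x, y) => x + y;
    }
}
```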
This can be generated as
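For a lambda such as `(x, y) => x + y`, which captures nothing, the lowered code might look roughly like the sketch below. The names `Anonymous` and `theFunction` are placeholders, not what the compiler really emits, and the exact shape varies between compiler versions.

```csharp
using System;

class C1
{
    // Cached delegate: the lambda captures nothing, so one delegate can serve every caller.
    private static Func<int, int, int> theFunction;

    // The lambda body, hoisted into an ordinary static method (placeholder name).
    private static int Anonymous(int x, int y)
    {
        return x + y;
    }

    public Func<int, int, int> M()
    {
        // Allocate the delegate on first use only, then reuse it.
        if (theFunction == null)
        {
            theFunction = Anonymous;
        }
        return theFunction;
    }
}
```

Every call to `M`, on any instance of `C1`, hands back the same delegate object.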
No new class needed.
Now consider:
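A sketch of a lambda that uses `this` but no locals. The class name `C2` and the `counter` field are assumptions; the counter is only there so that each instance gets a different `x`.

```csharp
using System;

class C2
{
    static int counter = 0;
    int x = counter++;   // each new instance gets the next value: 0, 1, 2, ...

    public Func<int, int> M()
    {
        // Reads an instance field, so the lambda depends on 'this'.
        return y => this.x + y;
    }
}
```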
Do you see why this cannot be generated with a static function? The static function would need access to this.x but where is the this in a static function? There isn't one.
So this one has to be an instance function:
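For a lambda such as `y => this.x + y`, which needs `this` but no locals, the lowering might look like the following sketch. The names `C2`, `counter`, and `Anonymous` are assumed for illustration; the compiler's real generated names differ.

```csharp
using System;

class C2
{
    static int counter = 0;
    int x = counter++;

    // The lambda body, hoisted into an instance method so that 'this' is available.
    private int Anonymous(int y)
    {
        return this.x + y;
    }

    public Func<int, int> M()
    {
        // The delegate is bound to this particular instance of C2.
        return this.Anonymous;
    }
}
```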
Also, we can no longer cache the delegate in a static field; do you see why?
Exercise: could the delegate be cached in an instance field? If no, then what prevents this from being legal? If yes, what are some arguments against implementing this "optimization"?
Now consider:
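A sketch of a lambda that closes over both `this.x` and a parameter `y`, matching the `C3`, `x`, and `y` referred to in the discussion that follows. The `counter` field is an assumption, used only to give each instance a different `x`.

```csharp
using System;

class C3
{
    static int counter = 0;
    int x = counter++;   // each new instance gets the next value: 0, 1, 2, ...

    public Func<int> M(int y)
    {
        // Captures 'this' (through x) and the parameter y.
        return () => x + y;
    }
}
```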
This cannot be generated as an instance function of C3; do you see why? We need to be able to say:
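A sketch of the calling pattern in question, using the names `b`, `g`, 123, and 789 from the text; the other variable names, the value 456, and the `Demo` wrapper are assumptions. `C3` is written out as the text describes it (an instance field `x` and a lambda closing over `x` and `y`) so the snippet compiles on its own.

```csharp
using System;

class C3
{
    static int counter = 0;
    int x = counter++;

    public Func<int> M(int y)
    {
        return () => x + y;
    }
}

static class Demo
{
    public static int[] Run()
    {
        var a = new C3();   // a.x is 0 if this is the first C3 created
        var b = a.M(123);
        var c = b();        // a.x + 123

        var d = new C3();   // d.x is one more than a.x
        var e = d.M(456);
        var f = e();        // d.x + 456

        var g = a.M(789);   // same instance as b, different y
        var h = g();        // a.x + 789

        return new[] { c, f, h };
    }
}
```

Note that `b` and `g` come from the same `C3` instance yet must remember different values of `y`.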
Now the delegates need to know not just the value of `this.x` but also the value of `y` that was passed in. That has to be stored somewhere, so we store it in a field. But it can't be a field of `C3`, because then how do we tell `b` to use 123 and `g` to use 789 for the value of `y`? They have the same instance of `C3` but two different values for `y`.

Exercise: Now suppose we have `C4<T>` with a generic method `M<U>` where the lambda is closed over variables of types `T` and `U`. Describe the codegen that has to happen now.

Exercise: Now suppose we have `M` return a tuple of delegates, one being `()=>x + y` and the other being `(int newY)=>{ y = newY; }`. Describe the codegen for the two delegates.

Exercise: Now suppose `M(int y)` returns type `Func<int, Func<int, int>>` and we return `a => b => this.x + y + z + a + b`. Describe the codegen.

Exercise: Suppose a lambda closed over both `this` and a local does a `base` non-virtual call. It is illegal to do a `base` call from code inside a type not directly in the type hierarchy of the virtual method, for security reasons. Describe how to generate verifiable code in this case.

Exercise: Put 'em all together. How do you do codegen for multiple nested lambdas with getter and setter lambdas for all locals, parameterized by generic types at the class and method scope, that do `base` calls? Because that's the problem that we actually had to solve.
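As a baseline for the exercises, here is one way the plain `C3` case (a lambda closing over `this.x` and the parameter `y`) could be lowered: a nested closure class with one instance per call to `M`. This is only a sketch; `Closure`, `outerThis`, and `Anonymous` are hypothetical names, and the compiler's actual output is shaped differently in its details.

```csharp
using System;

class C3
{
    static int counter = 0;
    int x = counter++;

    // Hypothetical closure class: holds everything the lambda captured.
    private sealed class Closure
    {
        public C3 outerThis;   // the captured 'this'
        public int y;          // the captured parameter

        // The lambda body, as an instance method of the closure class.
        public int Anonymous()
        {
            return outerThis.x + y;
        }
    }

    public Func<int> M(int y)
    {
        // A fresh closure per call, so two delegates made from the same
        // C3 instance can hold different values of y.
        var closure = new Closure();
        closure.outerThis = this;
        closure.y = y;
        return closure.Anonymous;
    }
}
```

Because the closure object is allocated inside `M`, neither it nor the delegate can be cached in a static field.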