Why the compiler adds an extra parameter for deleg

I was playing with delegates and noticed that when I create a Func<int,int,int> like the example below:

Func<int, int, int> func1 = (x, y) => x * y;

The signature of the compiler-generated method is not what I expected:

As you can see, it takes an object as its first parameter. But when there is a closure:

int z = 10;
Func<int, int, int> func1 = (x, y) => x * y * z;

Everything works as expected:

This is the IL code for the method with extra parameter:

.method private hidebysig static int32  '<Main>b__0'(object A_0,
                                                     int32 x,
                                                     int32 y) cil managed
{
  .custom instance void [mscorlib]System.Runtime.CompilerServices.CompilerGeneratedAttribute::.ctor() = ( 01 00 00 00 ) 
  // Code size       8 (0x8)
  .maxstack  2
  .locals init ([0] int32 V_0)
  IL_0000:  ldarg.1
  IL_0001:  ldarg.2
  IL_0002:  mul
  IL_0003:  stloc.0
  IL_0004:  br.s       IL_0006
  IL_0006:  ldloc.0
  IL_0007:  ret
} // end of method Program::'<Main>b__0'

It seems that the parameter A_0 is not even used. So, what is the purpose of the object parameter in the first case? Why isn't it added when there is a closure?
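The same thing can be checked without a disassembler by asking the delegate what it is bound to. This is only a minimal sketch: `LambdaInspector` and `Describe` are names made up for illustration, and what it prints depends on the compiler version, so no expected output is shown.

```csharp
using System;
using System.Reflection;

static class LambdaInspector
{
    // Summarises the method a delegate points at: its name, whether it is
    // static, how many parameters it declares, and the bound target (if any).
    public static string Describe(Delegate d)
    {
        MethodInfo m = d.Method;
        string target = d.Target == null ? "null" : d.Target.GetType().Name;
        return string.Format("{0}: static={1}, parameters={2}, target={3}",
            m.Name, m.IsStatic, m.GetParameters().Length, target);
    }

    static void Main()
    {
        // Lambda without captured variables (first case in the question).
        Func<int, int, int> noCapture = (x, y) => x * y;

        // Lambda that captures a local (second case in the question).
        int z = 10;
        Func<int, int, int> capture = (x, y) => x * y * z;

        Console.WriteLine(Describe(noCapture));
        Console.WriteLine(Describe(capture));
    }
}
```

A static method reported with three parameters for a `Func<int, int, int>` corresponds to the `(object, int32, int32)` signature shown in the IL above.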

Note: If you have a better idea for the title please feel free to edit.

Note 2: I compiled the first snippet in both Debug and Release modes, and there was no difference. I compiled the second one in Debug mode only, to get the closure behaviour, because in Release mode the compiler optimizes the local variable away.

Note 3: I'm using Visual Studio 2014 CTP.

Edit: This is the generated code for Main in the first case:

.method private hidebysig static void  Main(string[] args) cil managed
{
  .entrypoint
  // Code size       30 (0x1e)
  .maxstack  2
  .locals init ([0] class [mscorlib]System.Func`3<int32,int32,int32> func1)
  IL_0000:  nop
  IL_0001:  ldsfld     class [mscorlib]System.Func`3<int32,int32,int32> ConsoleApplication9.Program::'CS$<>9__CachedAnonymousMethodDelegate1'
  IL_0006:  dup
  IL_0007:  brtrue.s   IL_001c
  IL_0009:  pop
  IL_000a:  ldnull
  IL_000b:  ldftn      int32 ConsoleApplication9.Program::'<Main>b__0'(object,
                                                                       int32,
                                                                       int32)
  IL_0011:  newobj     instance void class [mscorlib]System.Func`3<int32,int32,int32>::.ctor(object,
                                                                                             native int)
  IL_0016:  dup
  IL_0017:  stsfld     class [mscorlib]System.Func`3<int32,int32,int32> ConsoleApplication9.Program::'CS$<>9__CachedAnonymousMethodDelegate1'
  IL_001c:  stloc.0
  IL_001d:  ret
} // end of method Program::Main

Although this may seem highly surprising, a quick search shows that it's for performance reasons.

On a bug report about it, it's pointed out that delegates to methods that have no implicit this are measurably slower than delegates to methods that do have an implicit this, because the former need to do a bit of complicated argument shuffling whenever the delegate is invoked:

Suppose you call func1(1, 2). This looks like (pseudo-code, not CIL)

push func1
push 1
push 2
call Func<,,>::Invoke

When this func1 is known to be bound to a static function taking two int values, it then needs to perform the equivalent of either

push arg.1
push arg.2
call method

or

arg.0 = arg.1
arg.1 = arg.2
jmp method

Whereas when func1 is known to be bound to a static function taking null and two int values, it only needs to perform the equivalent of

arg.0 = null
jmp method

since the environment is already set up perfectly for entering a function taking a reference type and two int values.
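Both delegate shapes can be built by hand to compare them. The following is a rough sketch rather than a rigorous benchmark: `DelegateShapes`, `Mul`, `MulWithDummy` and the iteration count are made up for illustration, and the timings depend on the runtime and JIT, so no numbers are shown. It relies on `Delegate.CreateDelegate`, which produces a delegate closed over a null reference when the static method has one extra leading reference-type parameter, the same shape the `ldnull` / `ldftn` / `newobj` sequence in the question produces.

```csharp
using System;
using System.Diagnostics;
using System.Reflection;

static class DelegateShapes
{
    // Plain static method: its signature matches Func<int, int, int> exactly.
    public static int Mul(int x, int y) { return x * y; }

    // Same body with an unused leading reference parameter, mirroring the
    // '<Main>b__0'(object, int32, int32) shape from the question.
    public static int MulWithDummy(object unused, int x, int y) { return x * y; }

    // Open static delegate: nothing is bound to the first argument slot.
    public static Func<int, int, int> CreateOpen()
    {
        MethodInfo m = typeof(DelegateShapes).GetMethod("Mul");
        return (Func<int, int, int>)Delegate.CreateDelegate(
            typeof(Func<int, int, int>), m);
    }

    // Static delegate closed over a null first argument.
    public static Func<int, int, int> CreateClosedOverNull()
    {
        MethodInfo m = typeof(DelegateShapes).GetMethod("MulWithDummy");
        return (Func<int, int, int>)Delegate.CreateDelegate(
            typeof(Func<int, int, int>), (object)null, m);
    }

    // Times 'iterations' invocations of f and returns elapsed milliseconds.
    static long TimeMs(Func<int, int, int> f, int iterations)
    {
        long sum = 0;
        Stopwatch sw = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++) sum += f(i, 2);
        sw.Stop();
        GC.KeepAlive(sum); // use the result so the loop is not dead code
        return sw.ElapsedMilliseconds;
    }

    static void Main()
    {
        const int iterations = 100000000; // arbitrary, for illustration only
        Func<int, int, int> open = CreateOpen();
        Func<int, int, int> closed = CreateClosedOverNull();

        // Warm up both delegates so JIT compilation is not part of the timing.
        TimeMs(open, 1000);
        TimeMs(closed, 1000);

        Console.WriteLine("open static:      {0} ms", TimeMs(open, iterations));
        Console.WriteLine("closed over null: {0} ms", TimeMs(closed, iterations));
    }
}
```

Both delegates return the same results; only the way the arguments reach the target method differs.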

Yes, it's a micro-optimisation that typically won't matter, but it's one that everyone benefits from, including those in situations where it does matter.
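As for why no extra parameter shows up when there is a closure: a capturing lambda is compiled to an instance method on a compiler-generated class that holds the captured variables, so the closure object itself is the implicit this and the first argument slot is already taken. A hand-written approximation of that shape, where `DisplayClass`, `Lambda` and `Build` are hypothetical stand-ins for the compiler-generated names:

```csharp
using System;

// Hypothetical stand-in for the compiler-generated closure class.
sealed class DisplayClass
{
    // The captured local variable becomes a field.
    public int z;

    // Instance method: 'this' occupies the implicit first argument slot,
    // so no dummy object parameter is needed.
    public int Lambda(int x, int y) { return x * y * z; }
}

static class ClosureSketch
{
    // Roughly what "int z = ...; Func<int, int, int> f = (x, y) => x * y * z;"
    // turns into: allocate the closure, then bind the delegate to it.
    public static Func<int, int, int> Build(int z)
    {
        DisplayClass closure = new DisplayClass { z = z };
        return closure.Lambda;
    }

    static void Main()
    {
        Func<int, int, int> func1 = Build(10);
        Console.WriteLine(func1(2, 3));                  // 60
        Console.WriteLine(func1.Target is DisplayClass); // True
        Console.WriteLine(func1.Method.IsStatic);        // False
    }
}
```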