Converting IL to C# with no syntactic sugar

Published 2019-09-20 18:39

Question:

I'm looking for a program that will show me the lowest level (ie. no syntactic sugar) C# code given IL code.

I tried using .NET Reflector to view a .exe file that contained a simple console app with a foreach loop, hoping to see GetEnumerator(), MoveNext(), Current etc, however it showed it as a foreach loop.

Does such a program exist? Or is it possible to select "no syntactic sugar" in .NET Reflector?

Answer 1:

Current versions of ILSpy have a sizable set of options for enabling/disabling decompiler transformation features:

...
        static void Main(string[] args) {
            foreach (var arg in args) {
                Console.WriteLine(arg);
            }
        }
...

If needed, you could go further than this by stripping out logic in ICSharpCode.Decompiler.IL.Transforms.* and ICSharpCode.Decompiler.CSharp.StatementBuilder. It may be worth opening an issue to ask whether a PR with your changes would be welcome, as most of these "rawness" settings were added relatively recently.


A better example with enumerators

A laconic snippet of code

var numbers = new List<int> { 0, 1, 2 };
foreach (var num in numbers) Console.WriteLine(num);

compiles to IL that decompiles as

System.Collections.Generic.List<int> list = new System.Collections.Generic.List<int>();
list.Add(0);
list.Add(1);
list.Add(2);
System.Collections.Generic.List<int> numbers = list;
System.Collections.Generic.List<int>.Enumerator enumerator = numbers.GetEnumerator();
try
{
    while (enumerator.MoveNext())
    {
        int num = enumerator.Current;
        System.Console.WriteLine(num);
    }
}
finally
{
    ((System.IDisposable)enumerator).Dispose();
}

(as seen with all transformation settings disabled)
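
To check that the lowered form really behaves like the foreach it came from, you can put both side by side. This is only a minimal sketch: the class and method names (ForeachLowering, WithForeach, WithEnumerator) are made up for illustration, and the results are collected into a list instead of being printed so they can be compared.

```csharp
using System;
using System.Collections.Generic;

static class ForeachLowering
{
    // The sugared form, as you would normally write it.
    public static List<string> WithForeach(List<int> numbers)
    {
        var output = new List<string>();
        foreach (var num in numbers)
        {
            output.Add(num.ToString());
        }
        return output;
    }

    // The hand-lowered form, mirroring the decompiled output above:
    // GetEnumerator, MoveNext/Current in a while loop, Dispose in finally.
    public static List<string> WithEnumerator(List<int> numbers)
    {
        var output = new List<string>();
        List<int>.Enumerator enumerator = numbers.GetEnumerator();
        try
        {
            while (enumerator.MoveNext())
            {
                int num = enumerator.Current;
                output.Add(num.ToString());
            }
        }
        finally
        {
            enumerator.Dispose();
        }
        return output;
    }

    static void Main()
    {
        var numbers = new List<int> { 0, 1, 2 };
        Console.WriteLine(string.Join(",", WithForeach(numbers)));    // 0,1,2
        Console.WriteLine(string.Join(",", WithEnumerator(numbers))); // 0,1,2
    }
}
```

Note that List<int>.Enumerator is a struct, which is why the decompiled code declares the concrete enumerator type rather than IEnumerator<int>.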


More on for-loops:

As far as compilation goes, for (a; b; c) d is the same as a; while (b) { d; c; } (save for the placement of the continue label). Decompilers therefore take liberties in deciding what kind of loop it might have been, based on how closely the init-statement, condition, and post-statement relate to each other. You might even write code by hand

var a = 0;
while (a < args.Length) {
    Console.WriteLine(args[a]);
    a++;
}

that will be detected as a for-loop (the two cannot be told apart in IL)

for (int a = 0; a < args.Length; a++)
{
    System.Console.WriteLine(args[a]);
}
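
The "placement of continue-label" caveat is the one place where the two loop forms differ, so here is a small sketch of it. The names (ContinuePlacement, OddsWithFor, OddsWithWhile) are hypothetical. In a for-loop, continue jumps to the post-statement; in the hand-written while-loop a plain continue would skip the increment, so a goto to a label placed before it stands in for the for-loop's continue.

```csharp
using System;
using System.Collections.Generic;

static class ContinuePlacement
{
    // for-loop: `continue` jumps to the post-statement (i++), then to the condition.
    public static List<int> OddsWithFor(int limit)
    {
        var result = new List<int>();
        for (int i = 0; i < limit; i++)
        {
            if (i % 2 == 0) continue;
            result.Add(i);
        }
        return result;
    }

    // Equivalent while-loop: a plain `continue` here would skip i++ and
    // loop forever on i == 0, so the jump targets a label before the increment.
    public static List<int> OddsWithWhile(int limit)
    {
        var result = new List<int>();
        int i = 0;
        while (i < limit)
        {
            if (i % 2 == 0) goto Next;
            result.Add(i);
        Next:
            i++;
        }
        return result;
    }

    static void Main()
    {
        Console.WriteLine(string.Join(",", OddsWithFor(6)));   // 1,3,5
        Console.WriteLine(string.Join(",", OddsWithWhile(6))); // 1,3,5
    }
}
```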


Answer 2:

In a comment on YellowAfterlife's answer, the OP said:

In your screenshot, it's showing a for loop. But behind the scenes a for loop is not used for a foreach, right? It uses a while loop.

When iterating over an array, foreach does not use an enumerator object. It uses an integer index counter variable instead. You know, like a for loop. The IL uses br.s and blt.s (OpCodes.Br_S and OpCodes.Blt_S), which we could say are "goto". Sure, you could write it as a while loop if you insist.
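
Roughly, the array case can be sketched like this. The names (ArrayForeach, JoinWithForeach, JoinLowered) are made up for illustration, and the lowered form is an approximation of what the compiler does, not its exact output: a local copy of the array reference, an int index, and no GetEnumerator call anywhere.

```csharp
using System;

static class ArrayForeach
{
    // The sugared form.
    public static string JoinWithForeach(string[] args)
    {
        string joined = "";
        foreach (var arg in args)
        {
            joined += arg + ";";
        }
        return joined;
    }

    // Approximate lowering for arrays: index-based, no enumerator object.
    public static string JoinLowered(string[] args)
    {
        string joined = "";
        string[] array = args;   // local copy of the array reference
        int index = 0;
        while (index < array.Length)
        {
            string arg = array[index];
            joined += arg + ";";
            index++;
        }
        return joined;
    }

    static void Main()
    {
        var sample = new[] { "a", "b" };
        Console.WriteLine(JoinWithForeach(sample)); // a;b;
        Console.WriteLine(JoinLowered(sample));     // a;b;
    }
}
```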

As a test, I wrote this code:

static void Main(string[] args)
{
    var index = 0;
    while (index < args.Length)
    {
        var arg = args[index];
        Console.WriteLine(arg);
        index++;
    }
}

This is what ILSpy output:

private static void Main(string[] args)
{
    for (int index = 0; index < args.Length; index++)
    {
        Console.WriteLine(args[index]);
    }
}

In fact, in the IL, the check is placed after the loop body, with a jump to it at the start. Remember that a while loop (unlike a do ... while loop) is supposed to check the condition before the first iteration. See the IL:

// for (int i = 0; i < args.Length; i++)
IL_0000: ldc.i4.0
IL_0001: stloc.0
// (no C# code)
IL_0002: br.s IL_0010
// loop start (head: IL_0010)
    // Console.WriteLine(args[i]);
    IL_0004: ldarg.0
    IL_0005: ldloc.0
    IL_0006: ldelem.ref
    IL_0007: call void [mscorlib]System.Console::WriteLine(string)
    // for (int i = 0; i < args.Length; i++)
    IL_000c: ldloc.0
    IL_000d: ldc.i4.1
    IL_000e: add
    IL_000f: stloc.0

    // for (int i = 0; i < args.Length; i++)
    IL_0010: ldloc.0
    IL_0011: ldarg.0
    IL_0012: ldlen
    IL_0013: conv.i4
    IL_0014: blt.s IL_0004
// end loop

// (no C# code)
IL_0016: ret

Notice ldlen, which gets the length of an array.

You can verify this code on SharpLab.

The compiler is optimizing access to the array. So, we could argue that the compiler turned my while loop into a for loop.
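
If you want something closer to the IL than either loop form, C# still has goto, so the branch structure of the listing can be written out by hand. This is only a sketch: GotoLoop and Collect are hypothetical names, and the body adds each element to a list where the original calls Console.WriteLine, so that the result can be inspected. The comments map each statement to the matching instructions in the IL listing.

```csharp
using System;
using System.Collections.Generic;

static class GotoLoop
{
    public static List<string> Collect(string[] args)
    {
        var seen = new List<string>();
        int i = 0;               // ldc.i4.0, stloc.0
        goto Check;              // br.s IL_0010
    Body:
        seen.Add(args[i]);       // ldarg.0, ldloc.0, ldelem.ref, call
        i++;                     // ldloc.0, ldc.i4.1, add, stloc.0
    Check:
        if (i < args.Length)     // ldloc.0, ldarg.0, ldlen, conv.i4
            goto Body;           // blt.s IL_0004
        return seen;             // ret
    }

    static void Main()
    {
        Console.WriteLine(string.Join(",", Collect(new[] { "x", "y", "z" }))); // x,y,z
    }
}
```

Because of the initial jump to Check, the condition is still evaluated before the first iteration, so an empty array never reaches Body.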