How does the .NET runtime move memory?

Posted 2019-04-06 00:10

Question:

It's a well-known fact that the .NET garbage collector doesn't just 'delete' the objects on the heap, but also fights memory fragmentation by compacting memory. From what I understand, memory is basically copied to a new place, and the old place is freed at some point.

My question is: how does this work?

What I'm mostly curious about is the fact that the GC runs in a separate thread, which means that the object we're working on can be moved by the GC while we're executing our code.

Technical details of the question

To illustrate, let me explain my question in more detail:

using System;

class Program
{
    private int foo;

    public static void Main(string[] args)
    {
        var tmp = new Program(); // make an object
        if (args.Length == 2)    // make the outcome depend on a runtime check
        {
            tmp.foo = 12;        // set value ***
        }
        Console.WriteLine(tmp.foo);
    }
}

In this small example, we create an object and set a simple field on it. The point '***' is all that matters for the question: if the object that 'tmp' refers to is moved, the write to 'foo' will go to the wrong address and everything will break.

The garbage collector runs in a separate thread. So as far as I know, 'tmp' can be moved during this instruction and 'foo' can end up with the incorrect value. But somehow, magic happens and it doesn't.

Looking at the disassembly, I noticed that the compiled program really does take the address of 'foo' and move the value '12' into it:

000000ae 48 8B 85 10 01 00 00 mov         rax,qword ptr [rbp+00000110h] 
000000b5 C7 40 08 0C 00 00 00 mov         dword ptr [rax+8],0Ch 

I more or less expected to see an indirect pointer here, which could be updated, but apparently the GC is smarter than that.

Further, I don't see any thread synchronization that checks if the object has been moved. So how does the GC update the state in the executing thread?

So, how does this work? And if the GC doesn't move these objects, what is the 'rule' that defines whether or not to move an object?

Answer 1:

The .NET GC is (at least partially) a "stop-the-world" GC: it stops managed threads before doing its work, does its work, then restarts managed threads.

The "Workstation" GC can be concurrent (so partially not stop-the-world), but note this remark at https://msdn.microsoft.com/library/ee851764.aspx:

When you are using workstation garbage collection with concurrent garbage collection, the reclaimed objects are not compacted, so the heap size can be the same or larger (fragmentation can make it appear to be larger).
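To see which of these modes a given process is actually running in, something like the sketch below may help. It relies only on 'GCSettings.IsServerGC' and 'GCSettings.LatencyMode' from 'System.Runtime'; the class name 'GcModeInfo' and the method 'Describe' are made up for illustration. As far as I know, a latency mode of 'Interactive' means concurrent collection is enabled, while 'Batch' means it is disabled.

```csharp
using System;
using System.Runtime;

// Hypothetical helper, not part of the original answer.
public static class GcModeInfo
{
    // Builds a one-line summary of the GC configuration of the current process.
    public static string Describe()
    {
        string flavor = GCSettings.IsServerGC ? "server" : "workstation";
        return $"GC flavor: {flavor}, latency mode: {GCSettings.LatencyMode}";
    }

    public static void Main()
    {
        Console.WriteLine(Describe());
    }
}
```

The output depends on how the process is configured (for example through the 'gcServer' and 'gcConcurrent' runtime settings), so it is worth running this inside the application you care about rather than in a scratch project.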

Note that in every GC mode, gen0 and gen1 collections are always stop-the-world, so they can move blocks of memory without problems. Only gen2 collections can run in the background, and only for some GC flavors with some configurations (the linked documentation covers this, although the information is scattered around the page). So there is always a "the-world-is-stopped" moment in which the memory that has been freed can be compacted.
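If you want to watch the generations at work, a minimal sketch along these lines can be used. It only calls 'GC.GetGeneration', 'GC.Collect', 'GC.KeepAlive', 'GC.CollectionCount' and 'GC.MaxGeneration'; the class name 'GenerationDemo' and the method 'Observe' are hypothetical. It does not show the move itself, only which generation the runtime reports for a surviving object after each forced collection.

```csharp
using System;

// Hypothetical demo class, not part of the original answer.
public static class GenerationDemo
{
    // Records the generation reported for one object before and after
    // two forced collections.
    public static int[] Observe()
    {
        var obj = new object();
        var seen = new int[3];

        seen[0] = GC.GetGeneration(obj); // right after allocation
        GC.Collect();                    // force a collection of all generations
        seen[1] = GC.GetGeneration(obj);
        GC.Collect();
        seen[2] = GC.GetGeneration(obj);

        GC.KeepAlive(obj);               // keep obj reachable up to this point
        return seen;
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(" -> ", Observe()));
        Console.WriteLine($"Max generation: {GC.MaxGeneration}");
        Console.WriteLine($"Gen0 collections so far: {GC.CollectionCount(0)}");
    }
}
```

The exact numbers can vary with the runtime version and build configuration, so treat the printed sequence as an observation rather than a guarantee.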