Garbage Collection not happening even when needed

2019-01-07 11:37发布

问题:

I made a 64-bit WPF test app. With my app running and with Task Manager open, I watch my system memory usage. I see I'm using 2GB, and I have 6GB available.

In my app, I click an Add button to add a new 1GB byte array to a list. I see my system memory usage increases by 1GB. I click Add a total of 6 times, filling the 6GB of memory I had available when I started.

I click a Remove button 6 times to remove each array from the list. The removed byte arrays should not be referenced by any other object in my control.

When I Remove, I don't see my memory go down. But that's OK with me, because I understand that GC is non-deterministic and all that. I figure the GC WILL collect as needed.

So now with memory looking full, but expecting the GC to collect when needed, I Add again. My PC starts slipping in and out of a disk thrashing coma. Why didn't the GC collect? If that wasn't the time to do it, when is?

As a sanity check, I have a button to force GC. When I push that, I quickly get 6GB back. Doesn't that prove my 6 arrays were not being referenced and COULD have been collected had the GC knew/wanted to?

I've read a lot that says I shouldn't call GC.Collect() but if GC doesn't collect in this situation, what else can I do?

    private ObservableCollection<byte[]> memoryChunks = new ObservableCollection<byte[]>();
    public ObservableCollection<byte[]> MemoryChunks
    {
        get { return this.memoryChunks; }
    }

    private void AddButton_Click(object sender, RoutedEventArgs e)
    {
        // Create a 1 gig chunk of memory and add it to the collection.
        // It should not be garbage collected as long as it's in the collection.

        try
        {
            byte[] chunk = new byte[1024*1024*1024];

            // Looks like I need to populate memory otherwise it doesn't show up in task manager
            for (int i = 0; i < chunk.Length; i++)
            {
                chunk[i] = 100;
            }

            this.memoryChunks.Add(chunk);                
        }
        catch (Exception ex)
        {
            MessageBox.Show(string.Format("Could not create another chunk: {0}{1}", Environment.NewLine, ex.ToString()));
        }
    }

    private void RemoveButton_Click(object sender, RoutedEventArgs e)
    {
        // By removing the chunk from the collection, 
        // I except no object has a reference to it, 
        // so it should be garbage collectable.

        if (memoryChunks.Count > 0)
        {
            memoryChunks.RemoveAt(0);
        }
    }

    private void GCButton_Click(object sender, RoutedEventArgs e)
    {
        GC.Collect();
        GC.WaitForPendingFinalizers();
    }

回答1:

As a sanity check, I have a button to force GC. When I push that, I quickly get 6GB back. Doesn't that prove my 6 arrays were not being referenced and COULD have been collected had the GC knew/wanted to?

You are probably better off asking When does the GC automatically collect "garbage" memory?. Off the top of my head:

  • Most commonly, when generation 0 is full or an object allocation won't fit in the available free space.1
  • Somewhat commonly, when allocating a chunk of memory would cause an OutOfMemoryException, a full GC is triggered to first try and reclaim available memory. If not enough contiguous memory is available after the collection, then an OOM exception will be thrown.

When starting a garbage collection, the GC determines what generations need to be collected (0, 0+1, or all). Each generation has a size determined by the GC (it can change as the application runs). If only generation 0 will exceed its budget, that is the only generation whose garbage will be collected. If the objects that survive generation 0 will cause generation 1 to go over its budget, then generation 1 will also be collected and its surviving objects will be promoted to generation 2 (which is the highest generation in Microsoft's implementation). If the budget for generation 2 is exceeded as well, garbage will be collected, but objects can't be promoted to a higher generation, as one doesn't exist.

So, here lies the important information, in the most common way the GC is started, Generation 2 will only be collected if generation 0 and generation 1 are both full. Also, you need to know that objects over 85,000 bytes are not stored in the normal GC heap with generation 0, 1, and 2. It's actually stored in what is called the Large Object Heap (LOH). The memory in the LOH is only freed during a FULL collection (that is, when generation 2 is collected); never when only generations 0 or 1 are being collected.

Why didn't the GC collect? If that wasn't the time to do it, when is?

It should now be obvious why the GC never happened automatically. You're only creating objects on the LOH (keep in mind that int types, the way you've used them, are allocated on the stack and don't have to be collected). You are never filling up generation 0, so a GC never happens.1

You're also running it in 64-bit mode, which means it's unlikely you'll hit the other case I listed above, where a collection occurs when there's not enough memory in the entire application to allocate a certain object. 64-bit applications have a virtual address space limit of 8TB so it would be a while before you hit this case. It's more likely you'll run out of physical memory and page file space before that happens.

Since a GC hasn't happened, windows starts to allocate memory for your application from the available space in the page file.

I've read a lot that says I shouldn't call GC.Collect() but if GC doesn't collect in this situation, what else can I do?

Call GC.Collect() if this kind of code is what you need to write. Better yet, don't write this kind of code outside of testing.

In conclusion, I have not done justice to the topic of automatic garbage collection in the CLR. I recommend reading about it (it's actually very interesting) via msdn blog posts, or as has already been mentioned, Jeffery Richter's excellent book, CLR Via C#, Chapter 21.


1 I'm making the assumption that you understand that the .NET implementation of the GC is a generational garbage collector. In the simplest terms, it means that newly created objects are in a lower numbered generation, i.e. generation 0. As garbage collections are run and it's found that an object that is in a certain generation has a GC root (not "garbage"), it will be promoted to the next generation up. This is a performance improvement since GC can take a long time and hurt performance. The idea is that objects in higher generations generally have a longer life and will be around in the application longer, so it doesn't need to check that generation for garbage as much as the lower generations. You can read more in this wikipedia article. You'll notice it's also called a ephemeral GC.

2 If you don't believe me, after you remove one of the chunks, have a function that creates a whole bunch of random strings or objects (I'd recommend against arrays of primitives for this test) and you'll see after you reach a certain amount of space, a full GC will occur, freeing that memory you had allocated in the LOH.



回答2:

This is going on the LOH (Large Object Heap). These are only cleared when a Generation 2 collection is performed. As Hans just said in his comment, you'll need more RAM if this is real code.

For giggles you can call GC.GetGeneration(chunk) to see that it will return 2.

Please see CLR via C#, 3rd edition by Jeffrey Richter (page 588).



回答3:

For real code that needs to allocate huge amount of data and than release it consider manually calling GC (GC.Collect best practice question ) when you are done with huge object.

You can also shift objects from LOH to normal heap by making you allocations in smaller (less than 80K) chunks.