.NET Garbage Collector mystery

2019-03-08 07:51发布

In my job we had a problem with OutOfMemoryExceptions. I've written a simple piece of code to mimic some behavior, and I've ended up with the following mystery. Look at this simple code which blows up when it runs out of memory.

class Program
{
    private static void Main()
    {
        List<byte[]> list = new List<byte[]>(200000);
        int iter = 0;

        try
        {
            for (;;iter++)
            {
                list.Add(new byte[10000]);
            }
        }
        catch (OutOfMemoryException)
        {
            Console.WriteLine("Iterations: " + iter);
        }
    }
}

On my machine it ended up with

Iterations: 148008

Then I added a GC.Collect call to the loop after each thousand iterations:

            //...
            for (;;iter++)
            {
                list.Add(new byte[10000]);

                if (iter % 1000 == 0)
                    GC.Collect();
            }
            //...

And surprise:

Iterations: 172048

When I called GC.Collect after each 10 iterations, I even got 193716 cycles. There are two strange things:

  1. How can a manual call to GC.Collect have such a severe impact (up to 30% more allocated)?

  2. What the hell can GC collect, when there're no "lost" references (I've even preset the List's capacity)?

4条回答
SAY GOODBYE
2楼-- · 2019-03-08 08:14

I had similar problem in .NET with the difference my byte[] had random sizes.

i tried two ways:

  • write an own heap manager (alloc memory with one large buffer and just adjust pointers)

  • use a memory mapped file (in my opinion the better solution)

If possible you can try out .NET 4.5 http://blogs.msdn.com/b/dotnet/archive/2012/07/20/the-net-framework-4-5-includes-new-garbage-collector-enhancements-for-client-and-server-apps.aspx

查看更多
我只想做你的唯一
3楼-- · 2019-03-08 08:15

A part of the garbage collection process is the compacting phase. During this phase, blocks of allocated memory are moved around to reduce fragementation. When memory is allocated, it isn't always allocated right after the last chunk of allocated memory left off. So you are able to squeeze a bit more in because the garbage collector is making more room by making better use of the available space.

I am trying to run some tests, but my machine can't handle them. Give this a try, it will tell the GC to pin down the objects in memory so they aren't moved around

byte[] b = new byte[10000];
GCHandle.Alloc(b, GCHandleType.Pinned);
list.Add(b);

As for your comment, when the GC moves things around, it isn't wiping anything out, it is just making better use of all memory space. Lets try and over simplify this. When you allocate your byte array the first time, lets say it gets inserted in memory from spot 0 to 10000. The next time you allocate the byte array, it isn't guarenteed to start at 10001, it may start at 10500. So now you have 499 bytes that aren't being used, and won't be used by your application. So when the GC does compacting, it will move the 10500 array to 10001 to be able to use that extra 499 bytes. And again, this is way over simplified.

查看更多
神经病院院长
4楼-- · 2019-03-08 08:15

Depending on the CLR you're using, there may be some Large Object Heap issues involved.

Have a look at this article, which explains the issues with large block allocations (and the list with 200000 items is a large block for sure, the other may or may not be, some arrays seem to be put into LOH when they reach 8k, others after 85k).

http://www.simple-talk.com/dotnet/.net-framework/the-dangers-of-the-large-object-heap/

查看更多
beautiful°
5楼-- · 2019-03-08 08:16

The CLR occasionally places arrays on the LOH. If you ever look at a memory dump thru WinDbg, you will see that there are arrays that are under 85,000 bytes. It is undocumented behavior - but that's just the way it works.

You are getting the OutOfMemoryErrors because you are fragmenting the LOH Heap and the LOH Heap is never compacted.

With regard to your question of:

2) What the hell can GC collect, when there're no "lost" references (I've even preset the List's capacity)?

There are overwritten references to the new byte[10000] that you pass to add to the list. A local variable is compiled and assigned to the new byte[10000]. For every iteration in the loop you create a new byte[] with a pre-defined size of 10000 and it is assigned to the local variable. Any previous value for the variable is overwritten and that memory is eligible for collection the next time the GC runs for the generation the variable lives in (in this case, possibly the LOH).

查看更多
登录 后发表回答