I am playing around with the garbage collector in C# (or rather the CLR?) trying to better understand memory management in C#.
I made a small sample program that reads three larger files into a byte[] buffer. I wanted to see if
- I actually need to do anything in order to handle memory efficiently
- it has any impact when setting the byte[] to null after the end of the current iteration
- and finally if it would help when forcing a garbage collection via GC.Collect()
Disclaimer: I measured memory consumption with Windows Task Manager and rounded it. I tried several times, but overall it remained about the same.
Here is my simple sample program:
static void Main(string[] args)
{
    Loop();
}

private static void Loop()
{
    var list = new List<string>
    {
        @"C:\Users\Public\Music\Sample Music\Amanda.wma", // Size: 4.75 MB
        @"C:\Users\Public\Music\Sample Music\Despertar.wma", // Size: 5.92 MB
        @"C:\Users\Public\Music\Sample Music\Distance.wma", // Size: 6.31 MB
    };
    Console.WriteLine("before loop");
    Console.ReadLine();
    foreach (string pathname in list)
    {
        // ... code here ...
        Console.WriteLine("in loop");
        Console.ReadLine();
    }
    Console.WriteLine(GC.CollectionCount(1));
    Console.WriteLine("end loop");
    Console.ReadLine();
}
For each test, I only changed the contents of the foreach loop. Then I ran the program; at each Console.ReadLine() I stopped and checked the memory usage of the process in Windows Task Manager. I took notes of the used memory and then continued the program with Return (I know about breakpoints ;) ). Just after the end of the loop, I wrote GC.CollectionCount(1) to the console in order to see how often the GC jumped in, if at all.
Results
Test 1:
foreach ( ... )
{
    byte[] buffer = File.ReadAllBytes(pathname);
    Console.WriteLine ...
}
Result (memory used):
before loop: 9.000 K
1. iteration: 13.000 K
2. iteration: 19.000 K
3. iteration: 25.000 K
after loop: 25.000 K
GC.CollectionCount(1): 2
Test 2:
foreach ( ... )
{
    byte[] buffer = File.ReadAllBytes(pathname);
    buffer = null;
    Console.WriteLine ...
}
Result (memory used):
before loop: 9.000 K
1. iteration: 13.000 K
2. iteration: 14.000 K
3. iteration: 15.000 K
after loop: 15.000 K
GC.CollectionCount(1): 2
Test 3:
foreach ( ... )
{
    byte[] buffer = File.ReadAllBytes(pathname);
    buffer = null;
    GC.Collect();
    Console.WriteLine ...
}
Result (memory used):
before loop: 9.000 K
1. iteration: 8.500 K
2. iteration: 8.600 K
3. iteration: 8.600 K
after loop: 8.600 K
GC.CollectionCount(1): 3
What I don't understand:
- In Test 1, the memory increases with each iteration. Therefore I guess that the memory is NOT freed at the end of the loop. But the GC still says it collected 2 times (GC.CollectionCount). How so?
- In Test 2, it obviously helps that buffer is set to null. The memory is lower than in Test 1. But why does GC.CollectionCount output 2 and not 3? And why is the memory usage not as low as in Test 3?
- Test 3 uses the least memory. I would say it is so because the reference to the memory is removed (buffer is set to null) and therefore, when the garbage collector is called via GC.Collect(), it can free the memory. Seems pretty clear.
If anyone with more experience could shed some light on some of the points above, it would really help me. Pretty interesting topic imho.
Since you are reading entire WMA files into an array, I'd say those array objects are being allocated in the Large Object Heap, which is where the CLR puts objects of roughly 85,000 bytes or more. This is a separate heap that's managed in a more malloc-type way (because compacting garbage collection isn't efficient at dealing with large objects).
Space in the Large Object Heap is collected according to different rules and it doesn't count towards the main generation count. That'll be why you're not seeing a difference in the number of collections between tests 1 and 2 even though the memory is being re-used (all that's being collected there is the Array object, not the underlying bytes). In Test 3 you are forcing a collection each time round the loop. The Large Object Heap is included in that, so the memory usage of the process does not increase.
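If you want to check whether a buffer really ended up in the Large Object Heap, one option is GC.GetGeneration. On the Microsoft CLR, a large array is reported as the highest generation right after allocation, while a small one normally starts in generation 0. A minimal sketch, assuming the default LOH threshold; the class name LohCheck and the two sizes are only made up for illustration:

```csharp
using System;

static class LohCheck
{
    // Allocates a byte[] of the given size and returns the generation
    // the GC reports for it immediately afterwards.
    public static int GenerationOfNewArray(int sizeInBytes)
    {
        byte[] buffer = new byte[sizeInBytes];
        return GC.GetGeneration(buffer);
    }

    static void Main()
    {
        // Small array: goes to the normal (small object) heap.
        Console.WriteLine("1 KB array: generation " + GenerationOfNewArray(1024));

        // About the size of the WMA files in the question: expected in the LOH.
        Console.WriteLine("5 MB array: generation " + GenerationOfNewArray(5 * 1024 * 1024));

        Console.WriteLine("highest generation: " + GC.MaxGeneration);
    }
}
```

If the second number equals GC.MaxGeneration, the buffer went to the LOH and is only reclaimed together with a full collection, which would fit what Test 3 shows.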
Here is a link that may be useful to you:
http://msdn.microsoft.com/en-us/magazine/ee309515.aspx
-Joe Yu
TaskManager is not the best tool for this. Use the CLR Profiler or, for something simple, use WriteLine to show GC.GetTotalMemory().
The main purpose of the GC is allocating and de-allocating large numbers of small objects. If you want to study it, write something that creates a lot of (smallish) strings or so. Make sure you know what a 'Generational GC' means.
Your current experiment is exercising the Large Object Heap (LOH) which has a whole other set of rules and concerns.
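A rough sketch of such a small-object experiment could look like the following. The class name SmallObjectChurn, the string length and the loop count are arbitrary assumptions; GC.GetTotalMemory(false) and GC.CollectionCount are the real framework calls:

```csharp
using System;

static class SmallObjectChurn
{
    // Allocates many short-lived strings and returns how many
    // generation 0 collections happened while doing so.
    public static int ChurnAndCountGen0(int allocations)
    {
        int before = GC.CollectionCount(0);
        for (int i = 0; i < allocations; i++)
        {
            // 100 chars is a bit over 200 bytes; garbage as soon as the iteration ends.
            string s = new string('x', 100);
            if (s.Length != 100) throw new InvalidOperationException();
        }
        return GC.CollectionCount(0) - before;
    }

    static void Main()
    {
        Console.WriteLine("managed heap before: " + GC.GetTotalMemory(false) + " bytes");

        int gen0 = ChurnAndCountGen0(5000000);

        Console.WriteLine("gen 0 collections during loop: " + gen0);
        Console.WriteLine("collections gen 0/1/2 so far: "
            + GC.CollectionCount(0) + "/" + GC.CollectionCount(1) + "/" + GC.CollectionCount(2));
        Console.WriteLine("managed heap after: " + GC.GetTotalMemory(false) + " bytes");
    }
}
```

The exact counts depend on the machine and the GC settings, but you should see generation 0 collected far more often than generations 1 and 2, which is the generational behaviour worth studying.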
The memory usage you're viewing via the task manager is for the process. Remember the CLR manages memory on behalf of your application, so you will typically not see the usage of the GC heap reflected directly in the process memory usage.
Allocating and freeing memory is not free, so obviously the CLR will try to optimize this to reduce the cost. Thus, when objects are collected from the heap, you may or may not see memory released to the OS as well.
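To see the difference between the two views, you could print the managed heap size next to the process-level numbers at each step. This is only a sketch: the class name HeapVersusProcess and the 6 MB buffer are assumptions for illustration, and which process counter Task Manager shows depends on the Windows version and the selected column:

```csharp
using System;
using System.Diagnostics;

static class HeapVersusProcess
{
    // Prints what the GC thinks is allocated next to what the OS
    // has handed to the process as a whole.
    public static void Report(string label)
    {
        using (Process me = Process.GetCurrentProcess())
        {
            Console.WriteLine(label
                + ": managed heap = " + GC.GetTotalMemory(false) / 1024 + " K"
                + ", working set = " + me.WorkingSet64 / 1024 + " K"
                + ", private bytes = " + me.PrivateMemorySize64 / 1024 + " K");
        }
    }

    static void Main()
    {
        Report("start");

        byte[] buffer = new byte[6 * 1024 * 1024];
        buffer[0] = 1; // touch the array so it is actually used
        Report("allocated");

        buffer = null;
        GC.Collect();
        Report("collected");
    }
}
```

The managed heap figure should drop after the collection; whether the working set follows immediately is up to the CLR and the OS.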