I'm currently working on a medical image processing project that needs a huge amount of memory. Is there anything I can do to avoid heap fragmentation and to speed up access to image data that has already been loaded into memory?
The application has been written in C++ and runs on Windows XP.
EDIT: The application does some preprocessing on the image data, like reformatting, calculating look-up tables, extracting sub-images of interest ... The application needs about 2 GB of RAM during processing, of which about 1.5 GB may be used for the image data.
If you are going to be performing operations on a large image matrix, you might want to consider a technique called "tiling". The idea is to lay the image out in memory so that a contiguous block of bytes holds not a single scan line, but a square region of the 2D image. The rationale is that most operations touch pixels that are near each other in 2D, not just along one scan line, so with tiles those neighbouring pixels land on the same pages.
This is not going to reduce your memory use, but may have a huge impact on page swapping and performance.
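As a rough sketch (the TiledImage name and the 64-pixel tile size are just illustrative choices, not from any particular library):

    #include <cstddef>
    #include <vector>

    // Tiled layout: pixels are stored tile by tile, so a TILE x TILE
    // square of the image occupies one contiguous run of bytes.
    const int TILE = 64; // 64x64 2-byte pixels = 8 KB, i.e. two 4 KB pages

    class TiledImage {
    public:
        TiledImage(int width, int height)
            : tilesPerRow_((width + TILE - 1) / TILE),
              data_(std::size_t(tilesPerRow_) *
                    ((height + TILE - 1) / TILE) * TILE * TILE)
        {}

        // Neighbouring pixels in 2D usually fall in the same tile,
        // hence on the same page(s).
        unsigned short& at(int x, int y) {
            int tx = x / TILE, ox = x % TILE; // tile column, offset in tile
            int ty = y / TILE, oy = y % TILE; // tile row, offset in tile
            std::size_t tile = std::size_t(ty) * tilesPerRow_ + tx;
            return data_[tile * TILE * TILE + std::size_t(oy) * TILE + ox];
        }

    private:
        int tilesPerRow_;
        std::vector<unsigned short> data_;
    };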
Without much more information about the problem, one thing you can do is avoid allocation churn by reusing allocations instead of following an allocate, operate, free pattern. An allocator such as dlmalloc also handles fragmentation better than the Win32 heap.
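For instance, a scratch buffer that keeps its capacity between calls avoids the churn entirely (computeLut here is a made-up example):

    #include <vector>

    // Reuse one scratch buffer across calls instead of allocating and
    // freeing it every time: clear() sets the size to 0 but keeps the
    // capacity, so after the first call resize() never touches the heap.
    void computeLut(std::vector<float>& scratch) {
        scratch.clear();
        scratch.resize(65536);
        // ... fill the look-up table in scratch ...
    }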
If you can isolate exactly those places where you're likely to allocate large blocks, you can (on Windows) directly call VirtualAlloc instead of going through the memory manager. This will avoid fragmentation within the normal memory manager.
This is an easy solution and it doesn't require you to use a custom memory manager.
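A minimal sketch of what that could look like (the wrapper names are mine; the flags are the usual ones for a committed, read/write block):

    #include <windows.h>

    // Large pixel buffers come straight from the OS as whole pages and
    // never touch the CRT heap, so they cannot fragment it.
    void* AllocImageBuffer(SIZE_T bytes)
    {
        return VirtualAlloc(NULL, bytes,
                            MEM_RESERVE | MEM_COMMIT, PAGE_READWRITE);
    }

    void FreeImageBuffer(void* p)
    {
        if (p)
            VirtualFree(p, 0, MEM_RELEASE); // size must be 0 with MEM_RELEASE
    }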
If you are doing medical image processing, it is likely that you are allocating big blocks at a time (512x512 images at 2 bytes per pixel). Fragmentation will bite you if smaller objects get allocated in between the image buffer allocations.
Writing a custom allocator is not necessarily hard for this particular use case. You can use the standard C++ allocator for your Image object, but for the pixel buffer you can use custom allocation that is all managed within your Image object. Here's a quick and dirty outline:
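(Sketch only; the PixelBufferPool name and the assumption that all pixel buffers share one fixed size are illustrative.)

    #include <cstddef>
    #include <vector>

    // Freed pixel buffers go onto a free list instead of back to the heap,
    // so image allocations never get interleaved with small objects.
    class PixelBufferPool {
    public:
        explicit PixelBufferPool(std::size_t bufferBytes)
            : bufferBytes_(bufferBytes) {}

        ~PixelBufferPool() {
            for (std::size_t i = 0; i < free_.size(); ++i)
                delete[] free_[i];
        }

        unsigned char* acquire() {
            if (free_.empty())
                return new unsigned char[bufferBytes_];
            unsigned char* p = free_.back();
            free_.pop_back();
            return p;
        }

        void release(unsigned char* p) { free_.push_back(p); }

    private:
        std::size_t bufferBytes_;
        std::vector<unsigned char*> free_;
    };

    // The Image object itself is ordinary heap data; only its pixel
    // buffer is managed by the pool.
    class Image {
    public:
        explicit Image(PixelBufferPool& pool)
            : pool_(pool), pixels_(pool.acquire()) {}
        ~Image() { pool_.release(pixels_); }

        unsigned char* pixels() { return pixels_; }

    private:
        Image(const Image&);            // non-copyable for simplicity
        Image& operator=(const Image&);
        PixelBufferPool& pool_;
        unsigned char* pixels_;
    };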
This is just one simple idea with lots of room for variation. The main trick is to avoid freeing and reallocating the image pixel buffers.
You might need to implement manual memory management. Is the image data long-lived? If not, you can use the pattern used by the Apache web server: allocate large amounts of memory and wrap it into memory pools. Pass those pools as the last argument to functions, so they can use the pool to satisfy their need for temporary memory. Once the call chain is finished, all the memory in the pool should no longer be in use, so you can scrub the memory area and use it again. Allocations are fast, since they only mean adding a value to a pointer. Deallocation is really fast, since you free very large blocks of memory at once.
If your application is multithreaded, you might need to store the pool in thread local storage, to avoid cross-thread communication overhead.
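A minimal sketch of such a pool, assuming temporaries are small relative to the pool and nothing stored in it needs a destructor:

    #include <cstddef>

    // Apache-style pool: one big upfront block, allocations are a pointer
    // bump, and the whole block is recycled at once when the call chain
    // is done.
    class Pool {
    public:
        explicit Pool(std::size_t bytes)
            : base_(new char[bytes]), size_(bytes), used_(0) {}
        ~Pool() { delete[] base_; }

        void* alloc(std::size_t n) {
            n = (n + 7) & ~std::size_t(7);   // keep 8-byte alignment
            if (used_ + n > size_) return 0; // pool exhausted
            void* p = base_ + used_;
            used_ += n;
            return p;
        }

        void clear() { used_ = 0; } // "frees" every allocation at once

    private:
        char* base_;
        std::size_t size_, used_;
    };

    // Pass the pool down the call chain for temporaries:
    void processSlice(/* ..., */ Pool& scratch) {
        float* lut = static_cast<float*>(scratch.alloc(256 * sizeof(float)));
        // ... use lut; no free needed, the caller clears the pool ...
    }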
I guess you're using something unmanaged, because on managed platforms the system (the garbage collector) takes care of fragmentation.
For C/C++ you can use some allocator other than the default one (there were already some threads about allocators on Stack Overflow).
Also, you can create your own data storage. For example, in the project I'm currently working on, we have a custom storage (pool) for bitmaps: we store them in a large contiguous chunk of memory, because we have a lot of them, and we keep track of the pool's fragmentation and defragment it when the fragmentation gets too big.
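A rough sketch of the idea (the handle-based API is an illustration, not our actual code):

    #include <cstddef>
    #include <cstring>
    #include <vector>

    // Compacting bitmap pool: callers hold handles rather than raw
    // pointers, so the pool is free to move bitmaps when defragmenting.
    class BitmapPool {
    public:
        explicit BitmapPool(std::size_t bytes) : arena_(bytes), top_(0) {}

        // Returns a stable handle, or -1 if the pool is full.
        int alloc(std::size_t n) {
            if (top_ + n > arena_.size()) return -1;
            Entry e = { top_, n, true };
            blocks_.push_back(e);
            top_ += n;
            return int(blocks_.size()) - 1;
        }

        void free(int h)           { blocks_[h].live = false; }
        unsigned char* data(int h) { return &arena_[blocks_[h].offset]; }

        // Slide live bitmaps down over the holes left by freed ones.
        // Raw pointers from data() are invalidated; handles stay valid.
        void defragment() {
            std::size_t write = 0;
            for (std::size_t i = 0; i < blocks_.size(); ++i) {
                if (!blocks_[i].live) continue;
                if (blocks_[i].offset != write)
                    std::memmove(&arena_[write], &arena_[blocks_[i].offset],
                                 blocks_[i].size);
                blocks_[i].offset = write;
                write += blocks_[i].size;
            }
            top_ = write; // free space is contiguous at the end again
        }

    private:
        struct Entry { std::size_t offset, size; bool live; };
        std::vector<unsigned char> arena_;
        std::size_t top_;
        std::vector<Entry> blocks_; // dead entries kept so handles stay stable
    };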