The case:
I wrote a minimax algorithm in Java, then I ported the code in C++, mantaining the same classes and methods. Then I valgrinded the C++ code until all memory leaks were fixed by putting delete
functions in destructors (when possible). When all leaks were fixed, I tested the algorithm on a tic-tac-toe game, which code was then ported identically in C++. I was sure that the C++ version would have been more performant, however I was surprised seeing that 100 (not random) instances of the game were solved in 125 seconds by C++ version while the Java version solved them in 30 seconds!
Then, using System Monitor I checked the memory usage of the two programs: the Java code memory usage at the end of the test increased roughly 20% while with the C++ version the memory only increased of 5%.
Then, at my eyes, it's clear that this delete
policy is memory saving but kills time performance, which is not what I want, at least in this case. Is there another deletion policy to better design softwares which requires small runtime but allow high memory usage?
malloc and delete have to do more work as
- memory is not allocates in a multi-threaded way
- memory is not allocated continuously in memory, but for regions of free memory.
delete
is performed in the current thread.
- for 64-bit applications you may find the memory alignment is 16 bytes instead of 8 bytes, resulting in more padding per allocation.
Possible solutions in C++
- don't use the heap so much, allocating on the stack is much faster than Java's or C++'s heap, and it is multi-threaded.
- allocate blocks of objects at once if possible e.g. an array of objects is one allocation in C++ instead of N+1 in Java.
- use a multi-threaded allocator. C++ supports multiple allocators.
- use an arena allocation pattern. If you have a large data structure with lots of nodes, you can allocate blocks of N node at a time and when you free such nodes, build a linked list of free node. You need to do this with the nodes themselves. In this approach the whole data structure/arena is deallocated at once.
In short, if you do a literal translation of Java code to C++, it may well be slower as this doesn't take advantage of all the optimisations C++ allows but Java doesn't.
BTW Java 6+ uses 32-bit references for up to 32 GB of memory. If you build a 64-bit C++ application, see if using 32-bit pointers/references/indexes is an option for you.
If you are constantly building a tree (with a lot of small
dynamically allocated nodes) and tearing it down, you are in the
most favorable case for garbage collection. You might want to
use the Boehm collector with C++. (IIRC, Boehm's benchmark,
designed to show that garbage collection can be faster than
manual allocation, does something very similar to this: builds
large trees, then tears them down.)
Alternatively, you could look into some sort of memory pool.
This is not a beginner's technique in C++, but it isn't that
difficult either. Or, rather than deleting the nodes, you put
them onto a free list, to recycle later. (When allocating, you
look first to the free list, and only call new
if it is
empty.) Or combine the two.