I have a Twisted server under load. When the server is under load, memory usage increases, and it is never reclaimed (even when there are no more clients). Next time it goes into high load, memory usage increases again. Here's a snapshot of the situation at that point:
- RSS memory is 400 MB (should be 200MB with usual max number of clients).
- gc.garbage is empty, so there are no uncollectable objects.
- Using objgraph.py shows no obvious candidates for leaks (no notable difference between a normal, healthy process and a leaking process).
- Using pympler shows a few tens of MB (only) used by Python objects (mostly dict, list, str and other native containers).
Valgrind with leak-check=full enabled doesn't show any major leaks (only couple of MBs 'definitively lost') - so C extensions are not the culprit. The total memory also doesn't add up with the 400MB+ shown by top:
==23072== HEAP SUMMARY:
==23072== in use at exit: 65,650,760 bytes in 463,153 blocks
==23072== total heap usage: 124,269,475 allocs, 123,806,322 frees, 32,660,215,602 bytes allocated
The only explanation I can find is that some objects are not tracked by the garbage collector, so that they are not shown by objgraph and pympler, yet use an enormous amount of RAM.
What other tools or solutions do I have? Would compiling the Python interpreter in debug mode help, by using sys.getobjects?