I have a small multithreaded script running in django and over time its starts using more and more memory. Leaving it for a full day eats about 6GB of RAM and I start to swap.
Following http://www.lshift.net/blog/2008/11/14/tracing-python-memory-leaks I see this as the most common types (with only 800M of memory used):
(Pdb) objgraph.show_most_common_types(limit=20)
dict 43065
tuple 28274
function 7335
list 6157
NavigableString 3479
instance 2454
cell 1256
weakref 974
wrapper_descriptor 836
builtin_function_or_method 766
type 742
getset_descriptor 562
module 423
method_descriptor 373
classobj 256
instancemethod 255
member_descriptor 218
property 185
Comment 183
__proxy__ 155
which doesn't show anything weird. What should I do now to help debug the memory problems?
Update: Trying some things people are recommending. I ran the program overnight, and when I work up, 50% * 8G == 4G of RAM used.
(Pdb) from pympler import muppy
(Pdb) muppy.print_summary()
types | # objects | total size
========================================== | =========== | ============
unicode | 210997 | 97.64 MB
list | 1547 | 88.29 MB
dict | 41630 | 13.21 MB
set | 50 | 8.02 MB
str | 109360 | 7.11 MB
tuple | 27898 | 2.29 MB
code | 6907 | 1.16 MB
type | 760 | 653.12 KB
weakref | 1014 | 87.14 KB
int | 3552 | 83.25 KB
function (__wrapper__) | 702 | 82.27 KB
wrapper_descriptor | 998 | 77.97 KB
cell | 1357 | 74.21 KB
<class 'pympler.asizeof.asizeof._Claskey | 1113 | 69.56 KB
function (__init__) | 574 | 67.27 KB
That doesn't sum to 4G, nor really give me any big data structured to go fix. The unicode is from a set() of "done" nodes, and the list's look like just random weakref
s.
I didn't use guppy since it required a C extension and I didn't have root so it was going to be a pain to build.
None of the objectI was using have a __del__
method, and looking through the libraries, it doesn't look like django nor the python-mysqldb do either. Any other ideas?
See http://opensourcehacker.com/2008/03/07/debugging-django-memory-leak-with-trackrefs-and-guppy/ . Short answer: if you're running django but not in a web-request-based format, you need to manually run
db.reset_queries()
(and of course have DEBUG=False, as others have mentioned). Django automatically doesreset_queries()
after a web request, but in your format, that never happens.