Python: Memory leak debugging

2019-01-21 07:02发布

I have a small multithreaded script running in django and over time its starts using more and more memory. Leaving it for a full day eats about 6GB of RAM and I start to swap.

Following http://www.lshift.net/blog/2008/11/14/tracing-python-memory-leaks I see this as the most common types (with only 800M of memory used):

(Pdb)  objgraph.show_most_common_types(limit=20)
dict                       43065
tuple                      28274
function                   7335
list                       6157
NavigableString            3479
instance                   2454
cell                       1256
weakref                    974
wrapper_descriptor         836
builtin_function_or_method 766
type                       742
getset_descriptor          562
module                     423
method_descriptor          373
classobj                   256
instancemethod             255
member_descriptor          218
property                   185
Comment                    183
__proxy__                  155

which doesn't show anything weird. What should I do now to help debug the memory problems?

Update: Trying some things people are recommending. I ran the program overnight, and when I work up, 50% * 8G == 4G of RAM used.

(Pdb) from pympler import muppy
(Pdb) muppy.print_summary()
                                     types |   # objects |   total size
========================================== | =========== | ============
                                   unicode |      210997 |     97.64 MB
                                      list |        1547 |     88.29 MB
                                      dict |       41630 |     13.21 MB
                                       set |          50 |      8.02 MB
                                       str |      109360 |      7.11 MB
                                     tuple |       27898 |      2.29 MB
                                      code |        6907 |      1.16 MB
                                      type |         760 |    653.12 KB
                                   weakref |        1014 |     87.14 KB
                                       int |        3552 |     83.25 KB
                    function (__wrapper__) |         702 |     82.27 KB
                        wrapper_descriptor |         998 |     77.97 KB
                                      cell |        1357 |     74.21 KB
  <class 'pympler.asizeof.asizeof._Claskey |        1113 |     69.56 KB
                       function (__init__) |         574 |     67.27 KB

That doesn't sum to 4G, nor really give me any big data structured to go fix. The unicode is from a set() of "done" nodes, and the list's look like just random weakrefs.

I didn't use guppy since it required a C extension and I didn't have root so it was going to be a pain to build.

None of the objectI was using have a __del__ method, and looking through the libraries, it doesn't look like django nor the python-mysqldb do either. Any other ideas?

7条回答
Fickle 薄情
2楼-- · 2019-01-21 07:49

See http://opensourcehacker.com/2008/03/07/debugging-django-memory-leak-with-trackrefs-and-guppy/ . Short answer: if you're running django but not in a web-request-based format, you need to manually run db.reset_queries() (and of course have DEBUG=False, as others have mentioned). Django automatically does reset_queries() after a web request, but in your format, that never happens.

查看更多
登录 后发表回答