First off, DEBUG = False in settings.py, so no, connections['default'].queries is not growing and growing until it uses up all of memory.
Let's start with the fact that I've loaded the User table (django.contrib.auth.models.User) with 10000 users (each named 'test#', where # is a number between 1 and 10000).
Here is the view:
from django.contrib.auth.models import User
from django.http import HttpResponse
import time

def leak(request):
    print "loading users"
    users = []
    users += list(User.objects.all())
    users += list(User.objects.all())
    users += list(User.objects.all())
    users += list(User.objects.all())
    users += list(User.objects.all())
    users += list(User.objects.all())
    users += list(User.objects.all())
    users += list(User.objects.all())
    users += list(User.objects.all())
    users += list(User.objects.all())
    users += list(User.objects.all())
    users += list(User.objects.all())
    users += list(User.objects.all())
    users += list(User.objects.all())
    users += list(User.objects.all())
    users += list(User.objects.all())
    users += list(User.objects.all())
    print "sleeping"
    time.sleep(10)
    return HttpResponse('')
I've attached the view above to the /leak/ URL and started the development server (with DEBUG=False; I've also tested this outside the development server, and the behavior is the same, so it is not specific to runserver).
After running:
% curl http://localhost:8000/leak/
The runserver process's memory grows to around the size shown in the ps aux output below and then stays at that level.
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
dlamotte 25694 11.5 34.8 861384 705668 pts/3 Sl+ 19:11 2:52 /home/dlamotte/tmp/django-mem-leak/env/bin/python ./manage.py runserver
Running the curl command above again does not seem to grow the instance's memory usage any further (which I would have expected from a true memory leak?), so it must be re-using the memory. However, it still feels wrong that the memory is never released back to the system (although I understand that it may be better for performance that Python does NOT release the memory).
Following this, I naively tried to see whether Python would release large chunks of memory that it had allocated, so I ran the following in a Python session:
>>> a = ''
>>> a += 'aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa' * 10000000
>>> del a
The memory is allocated on the a += ... line as expected, but when del a happens, the memory is released. Why is the behavior different for Django querysets? Is it something that Django intends to do? Is there a way to change this behavior?
I've literally spent 2 days debugging this behavior with no idea where to go next (I've learned to use guppy AND objgraph, neither of which points to anything interesting that I can figure out).
UPDATE: This could simply be Python's memory management at work and have nothing to do with Django (as suggested on the django-users mailing list), but I'd like to confirm that by somehow replicating it in Python outside of Django.
UPDATE: Using Python version 2.6.5
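One way the replication outside of Django might be attempted is sketched below. The string test above makes a single huge allocation, whereas the view creates many small objects, so this sketch builds a large list of small instances, deletes it, and reads the process's resident set size at each step. FakeUser, rss_kb and measure are hypothetical names made up for illustration, not Django code, and the /proc read assumes Linux. Whether the final number drops back toward the first one will likely depend on the Python version and the platform's C allocator, so no expected output is claimed.

```python
import gc
import os


class FakeUser(object):
    """Hypothetical stand-in for a model instance: a small object with a few attributes."""

    def __init__(self, i):
        self.id = i
        self.username = 'test%d' % i
        self.email = 'test%d@example.com' % i


def rss_kb():
    """Return the current resident set size in kB, or None if /proc is unavailable (non-Linux)."""
    path = '/proc/self/status'
    if not os.path.exists(path):
        return None
    with open(path) as f:
        for line in f:
            if line.startswith('VmRSS:'):
                return int(line.split()[1])
    return None


def measure(count=170000):
    """Allocate `count` small objects, free them, and return RSS (before, allocated, after)."""
    before = rss_kb()
    users = [FakeUser(i) for i in range(count)]
    allocated = rss_kb()
    del users
    gc.collect()  # rule out reference cycles as the reason memory stays in use
    after = rss_kb()
    return before, allocated, after


if __name__ == '__main__':
    print('before=%s kB, allocated=%s kB, after del=%s kB' % measure())
```

The default of 170000 mirrors the 17 x 10000 instances the view builds. If the "after del" figure stays close to the "allocated" figure here as well, that would suggest the behavior comes from the interpreter's handling of many small objects rather than from Django.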