I'm using Django 1.7.7.
I'm wondering if anyone has experienced this. This is my query:
events = Event.objects.filter(
Q(date__gt=my_date) | Q(date__isnull=True)
).filter(type__in=[...]).order_by('date')
When I try to then paginate it
p = Paginator(events, 10)
p.count # Gives 91
event_ids = []
for i in xrange(1, p.count / 10 + 2):
event_ids += [i.id for i in p.page(i)]
print len(event_ids) # Still 91
print len(set(event_ids)) # 75
I noticed that if I removed the .order_by
, I don't get any duplicates. I then tried just .order_by
with Event.objects.all().order_by('date')
which gave no duplicates.
Finally, I tried this:
events = Event.objects.filter(
Q(date__gt=my_date) | Q(date__isnull=True)
).order_by('date')
p = Paginator(events, 10)
events.count() # Gives 131
p.count # Gives 131
event_ids = []
for i in xrange(1, p.count / 10 + 2):
event_ids += [i.id for i in p.page(i)]
len(event_ids) # Gives 131
len(set(event_ids)) # Gives 118
... and there are duplicates. Can anyone explain what's going on?
I dug into the Django source (https://github.com/django/django/blob/master/django/core/paginator.py#L46-L55) and it seems to be something to do with how Django slices the object_list
.
Any help is appreciated. Thanks.
Edit: distinct()
has no affect on the duplicates. There aren't any duplicates in the database and I don't think the query introduces any duplicates ([e for e in events.iterator()]
doesn't produce any duplicates). It's just when the Paginator is slicing.
Edit2: Here's a more complete example
In [1]: from django.core.paginator import Paginator
In [2]: from datetime import datetime, timedelta
In [3]: my_date = timezone.now()
In [4]: 1 events = Event.objects.filter(
2 Q(date__gt=my_date) | Q(date__isnull=True)
3 ).order_by('date')
In [5]: events.count()
Out[5]: 134
In [6]: p = Paginator(events, 10)
In [7]: p.count
Out[7]: 134
In [8]: event_ids = []
In [9]: 1 for i in xrange(1, p.num_pages + 1):
2 event_ids += [j.id for j in p.page(i)]
In [10]: len(event_ids)
Out[10]: 134
In [11]: len(set(event_ids))
Out[11]: 115