Consider simple Django models Event and Participant:
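from django.db import models

class Event(models.Model):
    title = models.CharField(max_length=100)

class Participant(models.Model):
    # on_delete is mandatory from Django 2.0 onwards
    event = models.ForeignKey(Event, on_delete=models.CASCADE, db_index=True)
    is_paid = models.BooleanField(default=False, db_index=True)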
It's easy to annotate the events query with the total number of participants:
events = Event.objects.all().annotate(participants=models.Count('participant'))
How can I annotate with the count of participants filtered by is_paid=True?
I need to query all events regardless of the number of participants, i.e. I don't need to filter by the annotated result. If there are 0 participants, that's OK; I just need 0 in the annotated value.
The example from the documentation doesn't work here, because it excludes objects from the query instead of annotating them with 0.
Update. Django 1.8 has a new conditional expressions feature, so now we can do it like this:
events = Event.objects.all().annotate(paid_participants=models.Sum(
    models.Case(
        models.When(participant__is_paid=True, then=1),
        default=0,
        output_field=models.IntegerField()
    )))
Update 2. Django 2.0 has a new conditional aggregation feature; see the accepted answer below.
I just discovered that Django 1.8 has a new conditional expressions feature, so now we can do it with Sum over Case/When, as in the first update to the question above.
UPDATE
The sub-query approach which I mentioned is now supported in Django 1.11 via Subquery expressions.
I prefer this over aggregation (sum+case), because it should be faster and easier to optimize (with proper indexing).
For older versions, the same can be achieved using .extra.
Conditional aggregation in Django 2.0 allows you to further reduce the amount of faff this has been in the past. This will also use Postgres' filter logic, which is somewhat faster than a sum-case (I've seen numbers like 20-30% bandied around).
Anyway, in your case, we're looking at something as simple as:
There's a separate section in the docs about filtering on annotations. It's the same stuff as conditional aggregation but more like my example above. Either way, this is a lot healthier than the gnarly subqueries I was doing before.
I would suggest using the .values method of your Participant queryset instead.
In short, what you want to do is given by:
A complete example is as follows:
Create 2 Events:
Add Participants to them:
Group all Participants by their event field:
Here distinct is needed:
What .values and .distinct are doing here is creating two buckets of Participants grouped by their event field. Note that those buckets contain Participants.
You can then annotate those buckets, as they contain the set of original Participants. Here we want to count the number of Participants; this is simply done by counting the ids of the elements in those buckets (since those are Participants):
Finally, you want only Participants with is_paid being True; you may just add a filter in front of the previous expression, and this yields the expression shown above:
The only drawback is that you have to retrieve the Events afterwards, as you only have their ids from the method above.