django annotate and count: how to filter the ones

2019-03-09 03:34发布

问题:

Given a queryset, I add the count of related objects (ModelA) with the following:

qs = User.objets.all()
qs.annotate(modela__count=models.Count('modela'))

However, is there a way to count the ModelA that only meet a criteria? For example, count the ModelA where deleted_at is null?

I have tried two solutions which do not properly work.

1) As @knbk suggested, use filter before you annotate.

qs = User.objects.all().filter(modela__deleted_at__isnull=True).annotate(modela__count=models.Count('modela', distinct=True))

Here is the simplified version of the query generated by django:

SELECT COUNT(DISTINCT "modela"."id") AS "modela__count", "users".*
FROM "users"
LEFT OUTER JOIN "modela" ON ( "users"."id" = "modela"."user_id" ) 
WHERE "modela"."deleted_at" IS NULL 
GROUP BY "users"."id"

The problem comes from the WHERE clause. Indeed, there is a LEFT JOIN but the later WHERE conditions forced it to be a plain JOIN. I need to pull the conditions up into the JOIN clause to make it work as intended.

So, instead of

LEFT OUTER JOIN "modela" ON ( "users"."id" = "modela"."user_id" ) 
WHERE "modela"."deleted_at" IS NULL

I need the following which works when I execute it directly in plain SQL.

LEFT OUTER JOIN "modela" ON ( "users"."id" = "modela"."user_id" ) 
AND "modela"."deleted_at" IS NULL

How can I change the queryset to get this without doing a raw query?

2) As others suggested, I could use a conditional aggregation.

I tried the following:

qs = User.objects.all().annotate(modela__count=models.Count(Case(When(modela__deleted_at__isnull=True, then=1))))

which turns into the following SQL query:

SELECT COUNT(CASE WHEN "modela"."deleted_at" IS NULL THEN 1 ELSE NULL END) AS "modela__count", "users".*
FROM "users" LEFT OUTER JOIN "modela" ON ( "users"."id" = "modela"."user_id" )
GROUP BY "users"."id"

By doing that, I get all the users (so the LEFT JOIN works properly) but I get "1" (instead of 0) for modela__count for all the users who don't have any ModelA at all. Why do I get 1 and not 0 if there is nothing to count? How can that be changed?

回答1:

In a LEFT JOIN, every field of modela could be NULL because of the absence of corresponding row. So

modela.deleted_at IS NULL

...is not only true for the matching rows, but also true for those users whose have no corresponding modela rows.

I think the right SQL should be:

SELECT COUNT(
    CASE
      WHEN
        `modela`.`user_id` IS NOT NULL  -- Make sure modela rows exist
        AND `modela`.`deleted_at` IS NULL
        THEN 1
      ELSE NULL
    END
  ) AS `modela__count`,
  `users`.*
FROM `users`
LEFT OUTER JOIN `modela`
  ON ( `users`.`id` = `modela`.`user_id` )
GROUP BY `users`.`id`

In Django 1.8 this should be:

from django.db import models
qs = User.objects.all().annotate(
    modela_count=models.Count(
        models.Case(
            models.When(
                modela__user_id__isnull=False,
                modela__deleted_at__isnull=True,
                then=1,
            )
        )
    )
)

Notice:

@YAmikep discovered that a bug in Django 1.8.0 makes the generated SQL have an INNER JOIN instead of a LEFT JOIN, so you will lose rows without corresponding foreign key relationship. Use Django 1.8.2 or above version to fix that.



回答2:

In Django 1.8 I believe this can be achieved with conditional aggregation . However for previous versions I would do it with .extra

ModelA.objects.extra(select={
    'account_count': 'SELECT COUNT(*) FROM account WHERE modela.account_id = account.id AND account.some_prop IS NOT NULL'
})


回答3:

You can simply filter before you annotate:

from django.db.models import Q, Count

qs = ModelA.objects.filter(account__prop1__isnull=False).annotate(account_count=Count('account'))