Django Q Queries & on the same field?

So here are my models:

class Event(models.Model):
    user = models.ForeignKey(User, blank=True, null=True, db_index=True)
    name = models.CharField(max_length = 200, db_index=True)
    platform = models.CharField(choices = (("ios", "ios"), ("android", "android")), max_length=50)

class User(AbstractUser):
    email = models.CharField(max_length=50, null=False, blank=False, unique=True)

Event is like an analytics event, so it's very possible that I could have multiple events for one user, some with platform=ios and some with platform=android, if a user has logged in on multiple devices. I want to query to see how many users have both ios and android devices. So I wrote a query like this:

User.objects.filter(Q(event__platform="ios") & Q(event__platform="android")).count()

Which returns 0 results. I know this isn't correct. I then thought I would try to just query for iOS users:

User.objects.filter(Q(event__platform="ios")).count()

Which returned 6,717,622 results, which is unexpected because I only have 39,294 users. I'm guessing it's not counting the Users, but counting the Event instances, which seems like incorrect behavior to me. Does anyone have any insights into this problem?

Tags: sql django django-queryset django-q

2 Answers

叛逆

Reply #2 · 2019-07-19 15:53

You can use annotations instead:

from django.db.models import Count

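User.objects.annotate(platform_count=Count('event__platform', distinct=True)).filter(platform_count=2)

This keeps only users whose events cover two distinct platforms. Assuming platform only ever takes the two values listed in choices, those are exactly the users with both ios and android events. (Counting 'event' alone would instead match any user with exactly two events, regardless of platform.)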
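The difference between counting events and counting distinct platforms is easy to check outside Django. The following is only a sketch with the standard sqlite3 module and made-up toy rows (the table layout and user ids are assumptions, not the real schema); the GROUP BY ... HAVING query is roughly what an annotation with Count('event__platform', distinct=True) would boil down to.

```python
import sqlite3

# Hypothetical toy data, not the real tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE event (id INTEGER PRIMARY KEY, user_id INTEGER, platform TEXT);
    INSERT INTO event (user_id, platform) VALUES
        (1, 'ios'), (1, 'android'),   -- user 1: both platforms
        (2, 'ios'), (2, 'ios'),       -- user 2: two events, one platform
        (3, 'android');               -- user 3: a single event
""")

# Keep users whose events span exactly two distinct platforms.
rows = conn.execute("""
    SELECT user_id FROM event
    GROUP BY user_id
    HAVING COUNT(DISTINCT platform) = 2
    ORDER BY user_id
""").fetchall()
both_platform_users = [r[0] for r in rows]
print(both_platform_users)
```

User 2 has two events but is not selected, because both of them are on the same platform.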

You can also use chained filters:

User.objects.filter(event__platform='android').filter(event__platform='ios')

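The first filter selects all users with an android event, and the second narrows that set to users who also have an ios event. This works because, for multi-valued relations, each chained filter() call gets its own join to Event, whereas conditions inside a single filter() call must all match the same Event row (which is why the original query returned 0). Add .distinct() before .count(), since the joins return one row per matching pair of events rather than one row per user.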
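To see why the single filter() returns 0, why the iOS-only count is larger than the number of users, and why chaining helps, here is a minimal sketch using the standard sqlite3 module. The tables and rows are hypothetical stand-ins, and the SQL is a simplified approximation of what the ORM generates, not its exact output.

```python
import sqlite3

# Hypothetical toy data: user 1 has two ios events and one android event,
# user 2 has a single ios event.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE user (id INTEGER PRIMARY KEY, email TEXT);
    CREATE TABLE event (id INTEGER PRIMARY KEY, user_id INTEGER, platform TEXT);
    INSERT INTO user (id, email) VALUES (1, 'a@example.com'), (2, 'b@example.com');
    INSERT INTO event (user_id, platform) VALUES
        (1, 'ios'), (1, 'ios'), (1, 'android'), (2, 'ios');
""")

def scalar(sql):
    """Run a query and return the first column of the first row."""
    return conn.execute(sql).fetchone()[0]

# One join, both conditions on the same event row: can never be true.
single_join = scalar("""
    SELECT COUNT(*) FROM user
    JOIN event e ON e.user_id = user.id
    WHERE e.platform = 'ios' AND e.platform = 'android'
""")

# One join, one condition: one result row per matching event, not per user.
ios_rows = scalar("""
    SELECT COUNT(*) FROM user
    JOIN event e ON e.user_id = user.id
    WHERE e.platform = 'ios'
""")

# Two joins (the shape chained filters produce), de-duplicated per user.
both = scalar("""
    SELECT COUNT(DISTINCT user.id) FROM user
    JOIN event e1 ON e1.user_id = user.id
    JOIN event e2 ON e2.user_id = user.id
    WHERE e1.platform = 'android' AND e2.platform = 'ios'
""")

print(single_join, ios_rows, both)
```

In ORM terms, the de-duplication corresponds to calling .distinct() before .count(), for example User.objects.filter(event__platform="ios").distinct().count().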


仙女界的扛把子

Reply #3 · 2019-07-19 16:03

This answer applies generally to any queryset with two or more conditions on related child objects.

Solution: A simple solution with two subqueries is possible, even without any join:

base_subq = Event.objects.values('user_id').order_by().distinct()
user_qs = User.objects.filter(
    Q(pk__in=base_subq.filter(platform="android")) &
    Q(pk__in=base_subq.filter(platform="ios"))
)

The empty .order_by() call is important if the Event model has a default ordering: ordering fields are added to the selected columns, which would change what DISTINCT applies to (see the notes on distinct() in the docs).

Notes:

Verify that only one SQL query is executed (simplified by removing the "app_" prefix):

>>> print(str(user_qs.query))
SELECT user.id, user.email FROM user WHERE (
    user.id IN (SELECT DISTINCT U0.user_id FROM event U0 WHERE U0.platform = 'android')
    AND
    user.id IN (SELECT DISTINCT U0.user_id FROM event U0 WHERE U0.platform = 'ios')
)

Q() is used because the same keyword argument (pk__in) cannot be repeated in a single filter() call; chained filters, .filter(...).filter(...), could be used instead. (The order of the filter conditions does not matter, because the database's query optimizer decides the execution order.)
The temporary variable base_subq is just an "alias" queryset that avoids repeating the shared part of the expression; it is never evaluated on its own.
A single join between User (parent) and Event (child) would not be a problem, and a solution with one subquery is also possible, but joining Event with Event (a join on a repeated child object, or on two child objects) should in any case be avoided by using a subquery. Two subqueries also read well, because they show the symmetry of the two filter conditions.

Another solution, with two nested subqueries: this asymmetric solution can be faster if we know that one subquery (which we put innermost) has a much more restrictive filter than the other subquery, which would return a huge set of results (for example, if the number of Android users were huge).

ios_user_ids = (Event.objects.filter(platform="ios")
                .values('user_id').order_by().distinct())
user_ids = (Event.objects.filter(platform="android", user_id__in=ios_user_ids)
            .values('user_id').order_by().distinct())
user_qs = User.objects.filter(pk__in=user_ids)

Verify how it is compiled to SQL (simplified again by removing the app_ prefix and the double quotes):

>>> print(str(user_qs.query))
SELECT user.id, user.email FROM user 
WHERE user.id IN (
    SELECT DISTINCT V0.user_id FROM event V0
    WHERE V0.platform = 'android' AND V0.user_id IN (
        SELECT DISTINCT U0.user_id FROM event U0
        WHERE U0.platform = 'ios'
    )
)
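As a sanity check that the symmetric and the nested forms select the same users, both SQL shapes can be run against toy data. This is only a sketch with the standard sqlite3 module; the rows are invented, and the event with a NULL user_id stands in for an anonymous event (the user field is nullable in the question's model).

```python
import sqlite3

# Hypothetical toy data: only user 1 has both android and ios events.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE user (id INTEGER PRIMARY KEY, email TEXT);
    CREATE TABLE event (id INTEGER PRIMARY KEY, user_id INTEGER, platform TEXT);
    INSERT INTO user (id, email) VALUES
        (1, 'a@example.com'), (2, 'b@example.com'), (3, 'c@example.com');
    INSERT INTO event (user_id, platform) VALUES
        (1, 'ios'), (1, 'android'), (1, 'android'),
        (2, 'android'),
        (3, 'ios'),
        (NULL, 'ios');
""")

# Two independent subqueries combined with AND.
symmetric = """
    SELECT user.id FROM user WHERE
        user.id IN (SELECT DISTINCT user_id FROM event WHERE platform = 'android')
        AND
        user.id IN (SELECT DISTINCT user_id FROM event WHERE platform = 'ios')
    ORDER BY user.id
"""

# One subquery nested inside the other.
nested = """
    SELECT user.id FROM user WHERE user.id IN (
        SELECT DISTINCT user_id FROM event
        WHERE platform = 'android' AND user_id IN (
            SELECT DISTINCT user_id FROM event WHERE platform = 'ios'
        )
    )
    ORDER BY user.id
"""

symmetric_ids = [r[0] for r in conn.execute(symmetric)]
nested_ids = [r[0] for r in conn.execute(nested)]
print(symmetric_ids, nested_ids)
```

Which form is faster depends on the data and the database's planner, so it is worth comparing the query plans on the real tables before choosing.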

(These solutions also work in old Django versions, e.g. 1.8. A dedicated Subquery() expression has existed since Django 1.11 for more complicated cases, but it is not needed for this simple question.)
