Django filter queryset __in for *every* item in li

2019-01-01 15:23发布

问题:

Let\'s say I have the following models

class Photo(models.Model):
    tags = models.ManyToManyField(Tag)

class Tag(models.Model):
    name = models.CharField(max_length=50)

In a view I have a list with active filters called categories. I want to filter Photo objects which have all tags present in categories.

I tried:

Photo.objects.filter(tags__name__in=categories)

But this matches any item in categories, not all items.

So if categories would be [\'holiday\', \'summer\'] I want Photo\'s with both a holiday and summer tag.

Can this be achieved?

回答1:

Summary:

One option is, as suggested by jpic and sgallen in the comments, to add .filter() for each category. Each additional filter adds more joins, which should not be a problem for small set of categories.

There is the aggregation approach. This query would be shorter and perhaps quicker for a large set of categories.

You also have the option of using custom queries.


Some examples

Test setup:

class Photo(models.Model):
    tags = models.ManyToManyField(\'Tag\')

class Tag(models.Model):
    name = models.CharField(max_length=50)

    def __unicode__(self):
        return self.name

In [2]: t1 = Tag.objects.create(name=\'holiday\')
In [3]: t2 = Tag.objects.create(name=\'summer\')
In [4]: p = Photo.objects.create()
In [5]: p.tags.add(t1)
In [6]: p.tags.add(t2)
In [7]: p.tags.all()
Out[7]: [<Tag: holiday>, <Tag: summer>]

Using chained filters approach:

In [8]: Photo.objects.filter(tags=t1).filter(tags=t2)
Out[8]: [<Photo: Photo object>]

Resulting query:

In [17]: print Photo.objects.filter(tags=t1).filter(tags=t2).query
SELECT \"test_photo\".\"id\"
FROM \"test_photo\"
INNER JOIN \"test_photo_tags\" ON (\"test_photo\".\"id\" = \"test_photo_tags\".\"photo_id\")
INNER JOIN \"test_photo_tags\" T4 ON (\"test_photo\".\"id\" = T4.\"photo_id\")
WHERE (\"test_photo_tags\".\"tag_id\" = 3  AND T4.\"tag_id\" = 4 )

Note that each filter adds more JOINS to the query.

Using annotation approach:

In [29]: from django.db.models import Count
In [30]: Photo.objects.filter(tags__in=[t1, t2]).annotate(num_tags=Count(\'tags\')).filter(num_tags=2)
Out[30]: [<Photo: Photo object>]

Resulting query:

In [32]: print Photo.objects.filter(tags__in=[t1, t2]).annotate(num_tags=Count(\'tags\')).filter(num_tags=2).query
SELECT \"test_photo\".\"id\", COUNT(\"test_photo_tags\".\"tag_id\") AS \"num_tags\"
FROM \"test_photo\"
LEFT OUTER JOIN \"test_photo_tags\" ON (\"test_photo\".\"id\" = \"test_photo_tags\".\"photo_id\")
WHERE (\"test_photo_tags\".\"tag_id\" IN (3, 4))
GROUP BY \"test_photo\".\"id\", \"test_photo\".\"id\"
HAVING COUNT(\"test_photo_tags\".\"tag_id\") = 2

ANDed Q objects would not work:

In [9]: from django.db.models import Q
In [10]: Photo.objects.filter(Q(tags__name=\'holiday\') & Q(tags__name=\'summer\'))
Out[10]: []
In [11]: from operator import and_
In [12]: Photo.objects.filter(reduce(and_, [Q(tags__name=\'holiday\'), Q(tags__name=\'summer\')]))
Out[12]: []

Resulting query:

In [25]: print Photo.objects.filter(Q(tags__name=\'holiday\') & Q(tags__name=\'summer\')).query
SELECT \"test_photo\".\"id\"
FROM \"test_photo\"
INNER JOIN \"test_photo_tags\" ON (\"test_photo\".\"id\" = \"test_photo_tags\".\"photo_id\")
INNER JOIN \"test_tag\" ON (\"test_photo_tags\".\"tag_id\" = \"test_tag\".\"id\")
WHERE (\"test_tag\".\"name\" = holiday  AND \"test_tag\".\"name\" = summer )


回答2:

Another approach that works, although PostgreSQL only, is using django.contrib.postgres.fields.ArrayField:

Example copied from docs:

>>> Post.objects.create(name=\'First post\', tags=[\'thoughts\', \'django\'])
>>> Post.objects.create(name=\'Second post\', tags=[\'thoughts\'])
>>> Post.objects.create(name=\'Third post\', tags=[\'tutorial\', \'django\'])

>>> Post.objects.filter(tags__contains=[\'thoughts\'])
<QuerySet [<Post: First post>, <Post: Second post>]>

>>> Post.objects.filter(tags__contains=[\'django\'])
<QuerySet [<Post: First post>, <Post: Third post>]>

>>> Post.objects.filter(tags__contains=[\'django\', \'thoughts\'])
<QuerySet [<Post: First post>]>

ArrayField has some more powerful features such as overlap and index transforms.



回答3:

This also can be done by dynamic query generation using Django ORM and some Python magic :)

from operator import and_
from django.db.models import Q

categories = [\'holiday\', \'summer\']
res = Photo.filter(reduce(and_, [Q(tags__name=c) for c in categories]))

The idea is to generate appropriate Q objects for each category and then combine them using AND operator into one QuerySet. E.g. for your example it\'d be equal to

res = Photo.filter(Q(tags__name=\'holiday\') & Q(tags__name=\'summer\'))


回答4:

I use a little function that iterates filters over a list for a given operator an a column name :

def exclusive_in (cls,column,operator,value_list):         
    myfilter = column + \'__\' + operator
    query = cls.objects
    for value in value_list:
        query=query.filter(**{myfilter:value})
    return query  

and this function can be called like that:

exclusive_in(Photo,\'tags__name\',\'iexact\',[\'holiday\',\'summer\'])

it also work with any class and more tags in the list; operators can be anyone like \'iexact\',\'in\',\'contains\',\'ne\',...



回答5:

If we want to do it dynamically, followed the example:

tag_ids = [t1.id, t2.id]
qs = Photo.objects.all()

for tag_id in tag_ids:
    qs = qs.filter(tag__id=tag_id)    

print qs