Is there a way to query for subset or superset containment with many-to-many fields?
Suppose each Person has a list of birds they want to see, and each Aviary houses a list of birds. How can I make a query to find, for a given Person instance, which Aviaries have every bird on the person's list? And similarly, for a given Person instance, how do I find which Aviaries have only birds on the person's list (but not necessarily all of them).
Here are my Django 1.5 models:
class Bird(models.Model):
name = models.CharField(max_length=255, unique=True)
class Aviary(models.Model):
name = models.CharField(max_length=255, unique=True)
birds = models.ManyToManyField(Bird)
class Person(models.Model):
name = models.CharField(max_length=255, unique=True)
birds_to_see = models.ManyToManyField(Bird)
I know how I would find the aviaries that have at least one of a person's birds, but I don't see how I would adapt that here. (See, for instance:
django queryset for many-to-many field)
If there is a query that does what I want, I am also interested to know if/why it's preferable to doing this more "manually." For example, I could loop over aviaries, extract each aviary's bird list, and see if the person's birds_to_see is a subset or superset of the aviary's bird list:
def find_aviaries(self):
person_birds = set(self.birds_to_see.all())
found_aviaries = []
for aviary in Aviary.objects.all():
aviary_birds = set(aviary.birds.all())
if person_birds.issubset(aviary_birds):
found_aviaries.append(aviary)
return found_aviaries
Any help is appreciated!
A nice solution exists for Django >= 2.0. It is possible to annotate Aviaries by the number of matching Birds and to filter Aviaries that match at least one Bird or a required number.
from django.db.models import Count
...
person_birds = set(self.birds_to_see.all())
aviaries = (
Aviary.objects
.annotate(bird_match_count=Count('birds', filter=Q(birds__in=person_birds)))
.filter(bird_match_count__gt=0)
)
It is then trivial to filter a new queryset by bird_match_count=len(person_birds)
or to filter the original queryset in Python or to sort it by bird_match_count.
Django < 2.0 would require to reference the intermediary model AviaryBirds
and will be more verbose.
Verified by reading the SQL:
>>> print(aviaries.query)
SELECT aviary.id, aviary.name,
COUNT(CASE WHEN aviary_birds.bird_id IN (1,..) THEN aviary_birds.bird_id ELSE NULL END)
AS bird_match_count
FROM aviary LEFT OUTER JOIN aviary_birds ON (aviary.id = aviary_birds.aviary_id)
GROUP BY aviary.id, aviary.name
HAVING
COUNT(CASE WHEN (aviary_birds.bird_id IN (1,..)) THEN aviary_birds.bird_id ELSE NULL END)
> 0
Using a Postgres Subquery Array construct, you can annotate on the ids, and then filter accordingly:
birds = Aviary.birds.through.objects.filter(
aviary=OuterRef('pk')
).values('bird')
aviaries = Aviary.objects.annotate(
bird_ids=SubqueryArray(birds)
).filter(bird_ids__contains=target_bird_ids)
You can also use __contained_by
to go the other way (or __overlap
if you just want any match).
All you need then is a suitable SubqueryArray
class:
class SubqueryArray(django.db.models.expressions.Subquery):
template = 'ARRAY(%(subquery)s)'
output_field = ArrayField(base_field=models.CharField())
You may need to adjust the output field on that, depending upon what your PK field is.