I have these three models:
class Track(models.Model):
title = models.TextField()
artist = models.TextField()
class Tag(models.Model):
name = models.CharField(max_length=50)
class TrackHasTag(models.Model):
track = models.ForeignKey('Track', on_delete=models.CASCADE)
tag = models.ForeignKey('Tag', on_delete=models.PROTECT)
And I want to retrieve all Tracks that are not tagged with a specific tag. This gets me what I want: Track.objects.exclude(trackhastag__tag_id='1').only('id')
but it's very slow when the tables grow. This is what I get when printing .query
of the queryset:
SELECT "track"."id"
FROM "track"
WHERE NOT ( "track"."id" IN (SELECT U1."track_id" AS Col1
FROM "trackhastag" U1
WHERE U1."tag_id" = 1) )
I would like Django to send this query instead:
SELECT "track"."id"
FROM "track"
LEFT OUTER JOIN "trackhastag"
ON "track"."id" = "trackhastag"."track_id"
AND "trackhastag"."tag_id" = 1
WHERE "trackhastag"."id" IS NULL;
But haven't found a way to do so. Using a Raw Query is not really an option as I have to filter the resulting queryset very often.
The cleanest workaround I have found is to create a view in the database and a model TrackHasTagFoo
with managed = False
that I use to query like: Track.objects.filter(trackhastagfoo__isnull=True)
. I don't think this is an elegant nor sustainable solution as it involves adding Raw SQL to my migrations to mantain said view.
This is just one example of a situation where we need to do this kind of left join with an extra condition, but the truth is that we are facing this problem in more parts of our application.
Thanks a lot!
As mentioned in Django #29555 you can use FilteredRelation
for this purpose since Django 2.0.
Track.objects.annotate(
has_tag=FilteredRelation(
'trackhastag', condition=Q(trackhastag__tag=1)
),
).filter(
has_tag__isnull=True,
)
Using filters is better than exclude... because wit exclude they will get the entire query first and only than excluding the itens you dont want, while filter get only what you want Like you said Track.objects.filter(trackhastagfoo__isnull=True) is better than Exclude one.
Suggestion: You trying to manually do one ManyToMany Relations, as Mohammad said, why you dont try use ManyToManyField? is more easy to use
Maybe this answer your question: Django Left Outer Join
Enric, why you did not use many to many relation
class Track(models.Model):
title = models.TextField()
artist = models.TextField()
tags = models.ManyToManyField(Tag)
class Tag(models.Model):
name = models.CharField(max_length=50)
And for your question
Track.objects.filter(~Q(tags__id=1))
What about queryset extras? They do not break ORM and can be further filtered (vs RawSQL)
from django.db.models import Q
Track.objects.filter(
# work around to force left outer join
Q(trackhastag__isnull=True) | Q(trackhastag__isnull=False)
).extra(
# where parameters are “AND”ed to any other search criteria
# thus we need to account for NULL
where=[
'"app_trackhastag"."id" <> %s or "app_trackhastag"."id" is NULL'
],
params=[1],
)
produces this somewhat convoluted query:
SELECT "app_track"."id", "app_track"."title", "app_track"."artist"
FROM "app_track"
LEFT OUTER JOIN "app_trackhastag"
ON ("app_track"."id" = "app_trackhastag"."track_id")
WHERE (
("app_trackhastag"."id" IS NULL OR "app_trackhastag"."id" IS NOT NULL) AND
("app_trackhastag"."id" <> 1 or "app_trackhastag"."id" is NULL)
)
Rationale
Step 1
One straight forward way to have a left outer join with queryset is the following:
Track.objects.filter(trackhastag__isnull=True)
which gives:
SELECT "app_track"."id", "app_track"."title", "app_track"."artist"
FROM "app_track"
LEFT OUTER JOIN "app_trackhastag"
ON ("app_track"."id" = "app_trackhastag"."track_id")
WHERE "app_trackhastag"."id" IS NULL
Step 2
Realize that once step 1 is done (we have a left outer join), we can leverage
queryset's extra:
Track.objects.filter(
trackhastag__isnull=True
).extra(
where=['"app_trackhastag"."id" <> %s'],
params=[1],
)
which gives:
SELECT "app_track"."id", "app_track"."title", "app_track"."artist"
FROM "app_track"
LEFT OUTER JOIN "app_trackhastag"
ON ("app_track"."id" = "app_trackhastag"."track_id")
WHERE (
"app_trackhastag"."id" IS NULL AND
("app_trackhastag"."id" <> 1)
)
Step 3
Playing around extra
limitations (All where parameters are “AND”ed to any other search criteria) to come up with final solution above.