How to left outer join with extra condition in Dja

Posted 2020-05-26 16:11

Question:

I have these three models:

class Track(models.Model):
    title = models.TextField()
    artist = models.TextField()

class Tag(models.Model):
    name = models.CharField(max_length=50)

class TrackHasTag(models.Model):
    track = models.ForeignKey('Track', on_delete=models.CASCADE)
    tag = models.ForeignKey('Tag', on_delete=models.PROTECT)

And I want to retrieve all Tracks that are not tagged with a specific tag. This gets me what I want: Track.objects.exclude(trackhastag__tag_id='1').only('id') but it's very slow when the tables grow. This is what I get when printing .query of the queryset:

SELECT "track"."id" 
FROM   "track" 
WHERE  NOT ( "track"."id" IN (SELECT U1."track_id" AS Col1 
                              FROM   "trackhastag" U1 
                              WHERE  U1."tag_id" = 1) )

I would like Django to send this query instead:

SELECT "track"."id" 
FROM   "track" 
       LEFT OUTER JOIN "trackhastag" 
                    ON "track"."id" = "trackhastag"."track_id" 
                       AND "trackhastag"."tag_id" = 1 
WHERE  "trackhastag"."id" IS NULL; 

But I haven't found a way to do so. Using a raw query is not really an option, as I have to filter the resulting queryset very often.
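
To make the comparison concrete, here is a minimal sketch that runs both SQL statements from the question against an in-memory SQLite database and checks that they return the same tracks. The sample rows are hypothetical, the table and column names follow the question, and SQLite is used only for convenience; it says nothing about the relative speed of the two forms on a production database.

```python
import sqlite3

# Hypothetical sample data; table and column names follow the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE track (id INTEGER PRIMARY KEY, title TEXT, artist TEXT);
    CREATE TABLE trackhastag (
        id INTEGER PRIMARY KEY,
        track_id INTEGER NOT NULL REFERENCES track(id),
        tag_id INTEGER NOT NULL
    );
    INSERT INTO track (id, title, artist) VALUES
        (1, 'a', 'x'), (2, 'b', 'y'), (3, 'c', 'z');
    -- track 1 has tags 1 and 2, track 2 has tag 2, track 3 has no tags
    INSERT INTO trackhastag (track_id, tag_id) VALUES (1, 1), (1, 2), (2, 2);
""")

# The query Django currently generates for exclude().
NOT_IN_SQL = """
    SELECT track.id FROM track
    WHERE NOT (track.id IN (SELECT U1.track_id
                            FROM trackhastag U1
                            WHERE U1.tag_id = 1))
    ORDER BY track.id
"""

# The desired query: the tag condition lives in the ON clause.
LEFT_JOIN_SQL = """
    SELECT track.id FROM track
    LEFT OUTER JOIN trackhastag
        ON track.id = trackhastag.track_id AND trackhastag.tag_id = 1
    WHERE trackhastag.id IS NULL
    ORDER BY track.id
"""

not_in_ids = [row[0] for row in conn.execute(NOT_IN_SQL)]
left_join_ids = [row[0] for row in conn.execute(LEFT_JOIN_SQL)]
print(not_in_ids, left_join_ids)
```

Because the tag condition sits in the ON clause, a track that carries other tags but not tag 1 still produces a NULL-extended row and is kept.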

The cleanest workaround I have found is to create a view in the database and a model TrackHasTagFoo with managed = False that I query like this: Track.objects.filter(trackhastagfoo__isnull=True). I don't think this is an elegant or sustainable solution, as it involves adding raw SQL to my migrations to maintain said view.

This is just one example of a situation where we need to do this kind of left join with an extra condition, but the truth is that we are facing this problem in more parts of our application.

Thanks a lot!

Answer 1:

As mentioned in Django #29555 you can use FilteredRelation for this purpose since Django 2.0.

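from django.db.models import FilteredRelation, Q

# The condition ends up in the ON clause of the LEFT OUTER JOIN.
Track.objects.annotate(
    has_tag=FilteredRelation(
        'trackhastag', condition=Q(trackhastag__tag=1)
    ),
).filter(
    has_tag__isnull=True,
)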


Answer 2:

Using filter is better than exclude, because with exclude the entire query is evaluated first and only then are the items you don't want removed, while filter gets only what you want. As you said, Track.objects.filter(trackhastagfoo__isnull=True) is better than the exclude version.

Suggestion: you are implementing a many-to-many relation by hand. As Mohammad said, why not use ManyToManyField? It is easier to use.

Maybe this answers your question: Django Left Outer Join



Answer 3:

Enric, why didn't you use a many-to-many relation?

class Track(models.Model):
    title = models.TextField()
    artist = models.TextField()
    # 'Tag' is referenced by name because the class is defined below.
    tags = models.ManyToManyField('Tag')

class Tag(models.Model):
    name = models.CharField(max_length=50)

And for your question

Track.objects.filter(~Q(tags__id=1))


Answer 4:

What about the queryset's extra()? It does not break the ORM, and the result can be further filtered (unlike RawSQL).

from django.db.models import Q

Track.objects.filter(
    # workaround to force a left outer join
    Q(trackhastag__isnull=True) | Q(trackhastag__isnull=False)
).extra(
    # where parameters are "AND"ed to any other search criteria,
    # thus we need to account for NULL
    where=[
        '"app_trackhastag"."id" <> %s or "app_trackhastag"."id" is NULL'
    ],
    params=[1],
)

produces this somewhat convoluted query:

SELECT "app_track"."id", "app_track"."title", "app_track"."artist"
FROM "app_track"
LEFT OUTER JOIN "app_trackhastag"
ON ("app_track"."id" = "app_trackhastag"."track_id")
WHERE (
    ("app_trackhastag"."id" IS NULL OR "app_trackhastag"."id" IS NOT NULL) AND
    ("app_trackhastag"."id" <> 1 or "app_trackhastag"."id" is NULL)
)

Rationale

Step 1

One straightforward way to get a left outer join with a queryset is the following:

Track.objects.filter(trackhastag__isnull=True)

which gives:

SELECT "app_track"."id", "app_track"."title", "app_track"."artist"
FROM "app_track"
LEFT OUTER JOIN "app_trackhastag"
ON ("app_track"."id" = "app_trackhastag"."track_id")
WHERE "app_trackhastag"."id" IS NULL

Step 2

Realize that once step 1 is done (we have a left outer join), we can leverage queryset's extra:

Track.objects.filter(
    trackhastag__isnull=True
).extra(
    where=['"app_trackhastag"."id" <> %s'],
    params=[1],
)

which gives:

SELECT "app_track"."id", "app_track"."title", "app_track"."artist"
FROM "app_track"
LEFT OUTER JOIN "app_trackhastag"
ON ("app_track"."id" = "app_trackhastag"."track_id")
WHERE (
    "app_trackhastag"."id" IS NULL AND
    ("app_trackhastag"."id" <> 1)
)

Step 3

Work around the limitation of extra (all where parameters are "AND"ed to any other search criteria) to come up with the final solution above.
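
The need for Step 3 comes from SQL's three-valued logic: comparing NULL with <> yields NULL rather than true, so the Step 2 WHERE clause can never match a row. The following is a minimal sketch of that behaviour using an in-memory SQLite database; the sample rows are hypothetical, and the WHERE clauses are copied from the queries shown above.

```python
import sqlite3

# Hypothetical sample data; table names follow the generated SQL above.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE app_track (id INTEGER PRIMARY KEY, title TEXT, artist TEXT);
    CREATE TABLE app_trackhastag (
        id INTEGER PRIMARY KEY,
        track_id INTEGER NOT NULL,
        tag_id INTEGER NOT NULL
    );
    -- track 1 has one trackhastag row, track 2 has none
    INSERT INTO app_track (id, title, artist) VALUES (1, 'a', 'x'), (2, 'b', 'y');
    INSERT INTO app_trackhastag (id, track_id, tag_id) VALUES (1, 1, 1);
""")

BASE = """
    SELECT "app_track"."id" FROM "app_track"
    LEFT OUTER JOIN "app_trackhastag"
    ON ("app_track"."id" = "app_trackhastag"."track_id")
    WHERE {where}
    ORDER BY "app_track"."id"
"""

# WHERE clause produced in Step 2.
STEP2_WHERE = (
    '"app_trackhastag"."id" IS NULL AND '
    '("app_trackhastag"."id" <> 1)'
)

# WHERE clause of the final query.
STEP3_WHERE = (
    '("app_trackhastag"."id" IS NULL OR "app_trackhastag"."id" IS NOT NULL) AND '
    '("app_trackhastag"."id" <> 1 or "app_trackhastag"."id" is NULL)'
)

# NULL <> 1 evaluates to NULL, which sqlite3 returns as None.
null_comparison = conn.execute("SELECT NULL <> 1").fetchone()[0]

step2_ids = [row[0] for row in conn.execute(BASE.format(where=STEP2_WHERE))]
step3_ids = [row[0] for row in conn.execute(BASE.format(where=STEP3_WHERE))]
print(null_comparison, step2_ids, step3_ids)
```

With the Step 2 clause no row survives, while the explicit "is NULL" alternative in the final clause lets the track without a trackhastag row through.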