SQL where joined set must contain all values but m

2020-01-27 08:16发布

问题:

I have three tables offers, sports and the join table offers_sports.

class Offer < ActiveRecord::Base
  has_and_belongs_to_many :sports
end

class Sport < ActiveRecord::Base
  has_and_belongs_to_many :offers
end

I want to select offers that include a given array of sport names. They must contain all of the sports but may have more.

Lets say I have these three offers:

light:
  - "Yoga"
  - "Bodyboarding"
medium:
  - "Yoga"
  - "Bodyboarding"
  - "Surfing"
all:
  - "Yoga"
  - "Bodyboarding"
  - "Surfing"
  - "Parasailing"
  - "Skydiving"

Given the array ["Bodyboarding", "Surfing"] I would want to get medium and all but not light.

I have tried something along the lines of this answer but I get zero rows in the result:

Offer.joins(:sports)
     .where(sports: { name: ["Bodyboarding", "Surfing"] })
     .group("sports.name")
     .having("COUNT(distinct sports.name) = 2")

Translated to SQL:

SELECT "offers".* 
FROM "offers" 
INNER JOIN "offers_sports" ON "offers_sports"."offer_id" = "offers"."id"     
INNER JOIN "sports" ON "sports"."id" = "offers_sports"."sport_id" 
  WHERE "sports"."name" IN ('Bodyboarding', 'Surfing') 
GROUP BY sports.name 
HAVING COUNT(distinct sports.name) = 2;

An ActiveRecord answer would be nice but I'll settle for just SQL, preferably Postgres compatible.

Data:

offers
======================
id | name
----------------------
1  | light
2  | medium
3  | all
4  | extreme

sports
======================
id | name
----------------------
1  | "Yoga"
2  | "Bodyboarding"
3  | "Surfing"
4  | "Parasailing"
5  | "Skydiving"

offers_sports
======================
offer_id | sport_id
----------------------
1        | 1
1        | 2
2        | 1
2        | 2
2        | 3
3        | 1
3        | 2
3        | 3
3        | 4
3        | 5
4        | 3
4        | 4
4        | 5

回答1:

Group by offer.id, not by sports.name (or sports.id):

SELECT o.*
FROM   sports        s
JOIN   offers_sports os ON os.sport_id = s.id
JOIN   offers        o  ON os.offer_id = o.id
WHERE  s.name IN ('Bodyboarding', 'Surfing') 
GROUP  BY o.id  -- !!
HAVING count(*) = 2;

Assuming the typical implementation:

  • offer.id and sports.id are defined as primary key.
  • sports.name is defined unique.
  • (sport_id, offer_id) in offers_sports is defined unique (or PK).

You don't need DISTINCT in the count. And count(*) is even a bit cheaper, yet.

Related answer with an arsenal of possible techniques:

  • How to filter SQL results in a has-many-through relation

Added by @max (the OP) - this is the above query rolled into ActiveRecord:

class Offer < ActiveRecord::Base
  has_and_belongs_to_many :sports
  def self.includes_sports(*sport_names)
    joins(:sports)
      .where(sports: { name: sport_names })
      .group('offers.id')
      .having("count(*) = ?", sport_names.size)
  end
end


回答2:

One way to do it is using arrays and the array_agg aggregate function.

SELECT "offers".*, array_agg("sports"."name") as spnames 
FROM "offers" 
INNER JOIN "offers_sports" ON "offers_sports"."offer_id" = "offers"."id"     
INNER JOIN "sports" ON "sports"."id" = "offers_sports"."sport_id" 
GROUP BY "offers"."id" HAVING array_agg("sports"."name")::text[] @> ARRAY['Bodyboarding','Surfing']::text[];

returns:

 id |  name  |                      spnames                      
----+--------+---------------------------------------------------
  2 | medium | {Yoga,Bodyboarding,Surfing}
  3 | all    | {Yoga,Bodyboarding,Surfing,Parasailing,Skydiving}
(2 rows)

The @> operator means that the array on the left must contain all the elements from the one on the right, but may contain more. The spnames column is just for show, but you can remove it safely.

There are two things you must be very mindful of with this.

  1. Even with Postgres 9.4 (I haven't tried 9.5 yet) type conversion for comparing arrays is sloppy and often errors out, telling you it can't find a way to convert them to comparable values, so as you can see in the example I've manually cast both sides using ::text[].

  2. I have no idea what the level of support for array parameters is Ruby, nor the RoR framework, so you may end-up having to manually escape the strings (if input by user) and form the array using the ARRAY[] syntax.