I have three tables offers
, sports
and the join table offers_sports
.
class Offer < ActiveRecord::Base
has_and_belongs_to_many :sports
end
class Sport < ActiveRecord::Base
has_and_belongs_to_many :offers
end
I want to select offers that include a given array of sport names. They must contain all of the sports
but may have more.
Lets say I have these three offers:
light:
- "Yoga"
- "Bodyboarding"
medium:
- "Yoga"
- "Bodyboarding"
- "Surfing"
all:
- "Yoga"
- "Bodyboarding"
- "Surfing"
- "Parasailing"
- "Skydiving"
Given the array ["Bodyboarding", "Surfing"]
I would want to get medium
and all
but not light
.
I have tried something along the lines of this answer but I get zero rows in the result:
Offer.joins(:sports)
.where(sports: { name: ["Bodyboarding", "Surfing"] })
.group("sports.name")
.having("COUNT(distinct sports.name) = 2")
Translated to SQL:
SELECT "offers".*
FROM "offers"
INNER JOIN "offers_sports" ON "offers_sports"."offer_id" = "offers"."id"
INNER JOIN "sports" ON "sports"."id" = "offers_sports"."sport_id"
WHERE "sports"."name" IN ('Bodyboarding', 'Surfing')
GROUP BY sports.name
HAVING COUNT(distinct sports.name) = 2;
An ActiveRecord answer would be nice but I'll settle for just SQL, preferably Postgres compatible.
Data:
offers
======================
id | name
----------------------
1 | light
2 | medium
3 | all
4 | extreme
sports
======================
id | name
----------------------
1 | "Yoga"
2 | "Bodyboarding"
3 | "Surfing"
4 | "Parasailing"
5 | "Skydiving"
offers_sports
======================
offer_id | sport_id
----------------------
1 | 1
1 | 2
2 | 1
2 | 2
2 | 3
3 | 1
3 | 2
3 | 3
3 | 4
3 | 5
4 | 3
4 | 4
4 | 5
Group by offer.id
, not by sports.name
(or sports.id
):
SELECT o.*
FROM sports s
JOIN offers_sports os ON os.sport_id = s.id
JOIN offers o ON os.offer_id = o.id
WHERE s.name IN ('Bodyboarding', 'Surfing')
GROUP BY o.id -- !!
HAVING count(*) = 2;
Assuming the typical implementation:
offer.id
and sports.id
are defined as primary key.
sports.name
is defined unique.
(sport_id, offer_id)
in offers_sports
is defined unique (or PK).
You don't need DISTINCT
in the count. And count(*)
is even a bit cheaper, yet.
Related answer with an arsenal of possible techniques:
- How to filter SQL results in a has-many-through relation
Added by @max (the OP) - this is the above query rolled into ActiveRecord:
class Offer < ActiveRecord::Base
has_and_belongs_to_many :sports
def self.includes_sports(*sport_names)
joins(:sports)
.where(sports: { name: sport_names })
.group('offers.id')
.having("count(*) = ?", sport_names.size)
end
end
One way to do it is using arrays and the array_agg
aggregate function.
SELECT "offers".*, array_agg("sports"."name") as spnames
FROM "offers"
INNER JOIN "offers_sports" ON "offers_sports"."offer_id" = "offers"."id"
INNER JOIN "sports" ON "sports"."id" = "offers_sports"."sport_id"
GROUP BY "offers"."id" HAVING array_agg("sports"."name")::text[] @> ARRAY['Bodyboarding','Surfing']::text[];
returns:
id | name | spnames
----+--------+---------------------------------------------------
2 | medium | {Yoga,Bodyboarding,Surfing}
3 | all | {Yoga,Bodyboarding,Surfing,Parasailing,Skydiving}
(2 rows)
The @>
operator means that the array on the left must contain all the elements from the one on the right, but may contain more. The spnames
column is just for show, but you can remove it safely.
There are two things you must be very mindful of with this.
Even with Postgres 9.4 (I haven't tried 9.5 yet) type conversion for comparing arrays is sloppy and often errors out, telling you it can't find a way to convert them to comparable values, so as you can see in the example I've manually cast both sides using ::text[]
.
I have no idea what the level of support for array parameters is Ruby, nor the RoR framework, so you may end-up having to manually escape the strings (if input by user) and form the array using the ARRAY[]
syntax.