SQL query - Joining a many-to-many relationship, f

2019-09-14 23:20发布

I find myself in a bit of an unworkable situation with a SQL query and I'm hoping that I'm missing something or might learn something new. The structure of the DB2 database I'm working with isn't exactly built for this sort of query, but I'm tasked with this...

Let's say we have Table People and Table Groups. Groups can contain multiple people, and one person can be part of multiple groups. Yeah, it's already messy. In any case, there are a couple of intermediary tables linking the two. The problem is that I need to start with a list of groups, get all of the people in those groups, and then get all of the groups with which the people are affiliated, which would be a superset of the initial group set. This would mean starting with groups, joining down to the people, and then going BACK and joining to the groups again. I need information from both tables in the result set, too, so that rules out a number of techniques.

I have to join this with a number of other tables for additional information and the query is getting enormous, cumbersome, and slow. I'm wondering if there's some way that I could start with People, join it to Groups, and then specify that if a person has one group that is in the supplied set of groups (which is done via a subquery), then ALL groups for that person should be returned. I don't know of a way to make this happen, but I'm thinking (hoping) that there's a relatively clean way to make this happen in SQL.

A quick and dirty example:

SELECT ...
FROM GROUPS g
  JOIN LINKING_A a 
     ON g.GROUPID = a.GROUPID
        AND GROUPID IN (subquery)
  JOIN LINKING_B b 
     ON a.GROUPLIST = b.GROUPLIST
  JOIN PEOPLE p 
     ON b.PERSONID = p.PERSONID
    --This gets me all people affiliated with groups, 
    -- but now I need all groups affiliated with those people...
  JOIN LINKING_B b2 
     ON p.PERSONID = b2.PERSONID
  JOIN LINKING_A a2 
     ON b2.GROUPLIST = a.GROUPLIST
  JOIN GROUPS g2
     ON a2.GROUPID = g.GROUPID

And then I can return information from p and g2 in the result set. You can see where I'm having trouble. That's a lot of joining on some large tables, not to mention a number of other joins that are performed in this query as well. I need to be able to query by joining PEOPLE to GROUPS, then specify that if any person has an associated group that is in the subquery, it should return ALL groups affiliated with that entry in PEOPLE. I'm thinking that GROUP BY might be just the thing, but I haven't used that one enough to really know. So if Bill is part of group A, B, and C, and our subquery returns a set containing Group A, the result set should include Bill along with groups A, B, and C.

3条回答
时光不老,我们不散
2楼-- · 2019-09-15 00:08

I'm not clear why you have both Linking_A and Linking_B. Generally all you should need to represent a many-to-many relationship between two master tables is a single association table with GroupID and PersonId.

I often recommend using "common table expressions" [CTE's] in order to help you break a problem up into chunks that can be easier to understand. CTE's are specified using a WITH clause, which can contain several CTE's before starting the main SELECT query.

I'm going to assume that the list of groups you want to start with is specified by your subquery, so that will be the 1st CTE. The next one selects people who belong to those groups. The final part of the query then selects groups those people belong to, and returns the columns from both master tables.

WITH g1 as
(subquery)
, p1 as
(SELECT p.*
   from g1
   join Linking a1  on g1.groupID=a1.groupID
   join People  p   on p.personID=a1.personID )
SELECT p1.*, g2.*
  from p1
  join Linking a2   on p2.personID=a2.personID
  join Groups  g2   on  g2.groupID=a2.groupID
查看更多
▲ chillily
3楼-- · 2019-09-15 00:20

The following is a shorter way to get all the groups that people in the supplied group list are in. Does this help?

Select g.*
From Linking_B b
   Join Linking_B b2
      On b2.PersonId = b.PersonId
   Join Group g
      On g.GroupId = b2.GroupId
Where b.Groupid in (SubQuery)
查看更多
对你真心纯属浪费
4楼-- · 2019-09-15 00:24

I think I'd build the list of people you want to pull records for first, then use that to query out all the groups for those people. This will work across any number of link tables with the appropriate joins added:

with persons_wanted as
(
     --figure out which people are in a group you want to include
     select p.person_key
     from person p
     join link l1
     on p.person_key = l1.person_key
     join groups g
     on l1.group_key = g.group_key
     where g.group name in ('GROUP_I_WANT_PEOPLE_FROM', 'THIS_ONE_TOO')
     group by p.person_key --we only want each person_key once
)
--now pull all the groups for the list of people in at least one group we want
select p.name as person_name, g.name as group_name, ...
from person p
join link l1
on p.person_key = l1.person_key
join groups g
on l1.group_key = g.group_key
where p.person_key in (select person_key from persons_wanted);
查看更多
登录 后发表回答