At least one X but no Ys Query

2020-03-04 07:39发布

问题:

I come across this pattern occasionally and I haven't found a terribly satisfactory way to solve it.

Say I have a employee table and an review table. Each employee can have more than one review. I want to find all the employees who have at least one "good" review but no "bad" reviews.

I haven't figured out how to make subselects work without knowing the employee ID before hand and I haven't figured out the right combination of joins to make this happen.

Is there a way to do this WITHOUT stored procedures, functions or bringing the data server side? I've gotten it to work with those but I'm sure there's another way.

回答1:

Since you haven't posted your DB Structure, I made some assumptions and simplifications (regarding the rating column, which probably is number and not a character field). Adjust accordingly.

Solution 1: Using Joins

select distinct e.EmployeeId, e.Name
from employee e
left join reviews r1 on e.EmployeeId = r1.EmployeeId and r1.rating = 'good'
left join reviews r2 on e.EmployeeId = r2.EmployeeId and r1.rating = 'bad'
where r1.ReviewId is not null --meaning there's at least one
and r2.ReviewId is null --meaning there's no bad review

Solution 2: Grouping By and Filtering with Conditional Count

select e.EmployeeId, max(e.Name) Name
from employee e
left join reviews r on e.EmployeeId = r.EmployeeId
group by e.EmployeeId
having count(case r.rating when 'good' then 1 else null end) > 0
and  count(case r.rating when 'bad' then 1 else null end) = 0

Both solutions are SQL ANSI compatible, which means both work with any RDBMS flavor that fully support SQL ANSI standards (which is true for most RDBMS).

As pointed out by @onedaywhen, the code will not work in MS Access (have not tested, I'm trusting in his expertise on the subject).

But I have one saying on this (which might make some people upset): I hardly consider MS Access a RDBMS. I have worked with it in the past. Once you move on (Oracle, SQL Server, Firebird, PostGreSQL, MySQL, you name it), you do not ever want to come back. Seriously.



回答2:

The question -- return rows on side A based on nonexistence of a match in B -- (employees with No "Bad" reviews) describes an "anti-semi join". There are numerous ways to accomplish this kind of query, at least 5 I've discovered in MS Sql 2005 and above.

I know this solution works in MSSQL 2000 and above, and is the most efficient out of the 5 ways I've tried in MS Sql 2005 and 2008. I am not sure if it will work in MySQL, but it should, as it reflects a rather common set operation.

Note, the IN clause gives the subquery access to the employee table in the outer scope.

SELECT EE.*
FROM   employee EE
WHERE
    EE.EmpKey IN (
      SELECT RR.EmpKey
      FROM   review RR
      WHERE  RR.EmpKey = EE.EmpKey
        AND  RR.ScoreCategory = 'good'
    ) 
  AND
    EE.EmpKey NOT IN (
      SELECT  RR.EmpKey 
      FROM    review RR
      WHERE   RR.EmpKey = EE.EmpKey
        AND   RR.ScoreCategory = 'bad'
    )


回答3:

It's possible. The particular syntax depends on how you store 'good' and 'bad' reviews.

Suppose you had a classification column in review that had values 'good' and 'bad'.

Then you could do:

SELECT employee.*
FROM employee
JOIN review 
 ON employee.id=review.employee_id
GROUP BY employee.id
HAVING SUM(IF(classification='good',1,0))>0 -- count up #good reviews, > 0
   AND SUM(IF(classification='bad',1,0))=0  -- count up #bad  reviews, = 0.


回答4:

SELECT ???? FROM employee,review
WHERE employees.id = review.id
GROUP BY employees.id
HAVING SUM(IF(review='good',1,0)) > 1 AND SUM(IF(review='bad',1,0)) = 0