I come across this pattern occasionally and I haven't found a terribly satisfactory way to solve it.
Say I have a employee
table and an review
table. Each employee can have more than one review. I want to find all the employee
s who have at least one "good" review but no "bad" reviews.
I haven't figured out how to make subselects work without knowing the employee ID before hand and I haven't figured out the right combination of joins to make this happen.
Is there a way to do this WITHOUT stored procedures, functions or bringing the data server side? I've gotten it to work with those but I'm sure there's another way.
Since you haven't posted your DB Structure, I made some assumptions and simplifications (regarding the rating
column, which probably is number and not a character field). Adjust accordingly.
Solution 1: Using Joins
select distinct e.EmployeeId, e.Name
from employee e
left join reviews r1 on e.EmployeeId = r1.EmployeeId and r1.rating = 'good'
left join reviews r2 on e.EmployeeId = r2.EmployeeId and r1.rating = 'bad'
where r1.ReviewId is not null --meaning there's at least one
and r2.ReviewId is null --meaning there's no bad review
Solution 2: Grouping By and Filtering with Conditional Count
select e.EmployeeId, max(e.Name) Name
from employee e
left join reviews r on e.EmployeeId = r.EmployeeId
group by e.EmployeeId
having count(case r.rating when 'good' then 1 else null end) > 0
and count(case r.rating when 'bad' then 1 else null end) = 0
Both solutions are SQL ANSI compatible, which means both work with any RDBMS flavor that fully support SQL ANSI standards (which is true for most RDBMS).
As pointed out by @onedaywhen, the code will not work in MS Access (have not tested, I'm trusting in his expertise on the subject).
But I have one saying on this (which might make some people upset): I hardly consider MS Access a RDBMS. I have worked with it in the past. Once you move on (Oracle, SQL Server, Firebird, PostGreSQL, MySQL, you name it), you do not ever want to come back. Seriously.
The question -- return rows on side A based on nonexistence of a match in B -- (employees with No "Bad" reviews) describes an "anti-semi join". There are numerous ways to accomplish this kind of query, at least 5 I've discovered in MS Sql 2005 and above.
I know this solution works in MSSQL 2000 and above, and is the most efficient out of the 5 ways I've tried in MS Sql 2005 and 2008. I am not sure if it will work in MySQL, but it should, as it reflects a rather common set operation.
Note, the IN
clause gives the subquery access to the employee
table in the outer scope.
SELECT EE.*
FROM employee EE
WHERE
EE.EmpKey IN (
SELECT RR.EmpKey
FROM review RR
WHERE RR.EmpKey = EE.EmpKey
AND RR.ScoreCategory = 'good'
)
AND
EE.EmpKey NOT IN (
SELECT RR.EmpKey
FROM review RR
WHERE RR.EmpKey = EE.EmpKey
AND RR.ScoreCategory = 'bad'
)
It's possible. The particular syntax depends on how you store 'good' and 'bad' reviews.
Suppose you had a classification
column in review
that had values 'good' and 'bad'.
Then you could do:
SELECT employee.*
FROM employee
JOIN review
ON employee.id=review.employee_id
GROUP BY employee.id
HAVING SUM(IF(classification='good',1,0))>0 -- count up #good reviews, > 0
AND SUM(IF(classification='bad',1,0))=0 -- count up #bad reviews, = 0.
SELECT ???? FROM employee,review
WHERE employees.id = review.id
GROUP BY employees.id
HAVING SUM(IF(review='good',1,0)) > 1 AND SUM(IF(review='bad',1,0)) = 0