可以将文章内容翻译成中文,广告屏蔽插件可能会导致该功能失效(如失效,请关闭广告屏蔽插件后再试):
问题:
I was just reading about EXCEPT and INTERSECT in the MSDN Library and came across this example of how to use INTERSECT:
USE AdventureWorks2008R2 GO
SELECT ProductID
FROM Production.Product
INTERSECT
SELECT ProductID
FROM Production.WorkOrder ;
--Result: 238 Rows (products that have work orders)
Maybe I'm old-fashioned, but I typically would use the following code to achieve the same result:
SELECT P.ProductID
FROM Production.Product P
INNER JOIN Production.WorkOrder W ON W.ProductID = P.ProductID
Am I missing something, or is INTERSECT the same as INNER JOIN? Is there a performance benefit to using one over the other?
Same question for EXCEPT. How is this:
USE AdventureWorks2008R2;
GO
SELECT ProductID
FROM Production.Product
EXCEPT
SELECT ProductID
FROM Production.WorkOrder ;
--Result: 266 Rows (products without work orders)
different from this:
SELECT P.ProductID
FROM Production.Product P
LEFT JOIN Production.WorkOrder W ON W.ProductID = P.ProductID
WHERE W.ProductID IS NULL
?
回答1:
I'm going to focus on EXCEPT
just because I'm more familiar with it. Also, as a disclaimer, my examples will be in Sqlite, since I'm on a Linux box. However, both Sqlite and SQL Server should support the functionality.
Both INTERSECT
and EXCEPT
are set operators, stemming from the underlying ideas in relational algebra. They operate on distinct values, being set operators.
Your example is simplistic. I'll give a counterexample, using a Sqlite version of the Northwind sample database.
Let's say that you want to get the CustomerIDs of all customers who made an order with EmployeeID of 5, but NOT those who also made an order with EmployeeID of 6. This is simple and natural with an EXCEPT
.
SELECT CustomerID FROM orders
WHERE EmployeeID = 5
EXCEPT
SELECT CustomerID FROM orders
WHERE EmployeeID = 6
This returns 14 rows on my version of Northwind.
Suppose you decide to rewrite this using JOIN
s. Maybe something like this?
SELECT o1.CustomerID
FROM orders o1 INNER JOIN orders o2 ON o1.CustomerID = o2.CustomerID
WHERE o1.EmployeeID = 5 AND o2.EmployeeID != 6
Whoops, 525 rows. Maybe add a DISTINCT
?
SELECT DISTINCT o1.CustomerID
FROM orders o1 INNER JOIN orders o2 ON o1.CustomerID = o2.CustomerID
WHERE o1.EmployeeID = 5 AND o2.EmployeeID != 6
Now it's 28 rows, still much more than what we were getting with EXCEPT
. The reason is that this isn't removing CustomerIDs that have made an order with 6. Rather, it returns all CustomerIDs that have an order with 5 and some EmployeeID other than 6, whether or not they also have an order with EmployeeID 6.
In short, EXCEPT
and INTERSECT
are set operators that compare two queries, returning unique tuples, and certainly have their use.
回答2:
- INTERSECT and EXCEPT are semi-joins
- JOIN is equi-join
So when you join 2 tables that match, say, 5 rows and 3 rows
- JOIN gives 15 rows
- INTERSECT gives 3 rows
EXCEPT is similar to OUTER JOIN for the same reason
While we're on about semi-joins, then mostly
- INTERSECT gives the same results as EXISTS
- EXCEPT gives the same results as NOT EXISTS
The "mostly" comes because both INTERSECT and EXCEPT
- treat NULL differently: For a fully worked example, see this by Paul White aka SQL Kiwi
- apply DISTINCT
Edit, Quick demo of all this
DECLARE @t1 TABLE (t1col INT);
INSERT @t1 VALUES (1), (2), (2), (3), (3), (5), (5);
DECLARE @t2 TABLE (t2col INT);
INSERT @t2 VALUES (1), (2), (3), (4);
SELECT 'INNER JOIN', * FROM @t1 t1 JOIN @t2 t2 ON t1.t1col = t2.t2col -- same both ways
SELECT 't1 INTERSECT t2', * FROM @t1 INTERSECT SELECT 't1 INTERSECT t2', * FROM @t2;
SELECT 't2 INTERSECT t1', * FROM @t2 INTERSECT SELECT 't2 INTERSECT t1', * FROM @t1;
SELECT 't1 EXISTS t2', * FROM @t1 t1
WHERE EXISTS (SELECT * FROM @t2 t2 WHERE t1.t1col = t2.t2col);
SELECT 't2 EXISTS t1', * FROM @t2 t2
WHERE EXISTS (SELECT * FROM @t1 t1 WHERE t1.t1col = t2.t2col);
SELECT 't1 LEFT JOIN t2, IS NULL', * FROM @t1 t1 LEFT JOIN @t2 t2 ON t1.t1col = t2.t2col WHERE t2.t2col IS NULL
SELECT 't2 LEFT JOIN t1, IS NULL', * FROM @t2 t2 LEFT JOIN @t1 t1 ON t1.t1col = t2.t2col WHERE t1.t1col IS NULL
SELECT 't1 EXCEPT t2', * FROM @t1 EXCEPT SELECT 't1 EXCEPT t2', * FROM @t2;
SELECT 't2 EXCEPT t1', * FROM @t2 EXCEPT SELECT 't2 EXCEPT t1', * FROM @t1;
SELECT 't1 NOT EXISTS t2', * FROM @t1 t1
WHERE NOT EXISTS (SELECT * FROM @t2 t2 WHERE t1.t1col = t2.t2col);
SELECT 't2 NOT EXISTS t1', * FROM @t2 t2
WHERE NOT EXISTS (SELECT * FROM @t1 t1 WHERE t1.t1col = t2.t2col);
Update: Feb 2013. Added extra column to describe the operation
回答3:
Your examples of your "equivalent" queries are wrong - the query with INTERSECT
not always return the same result as INNER JOIN
and the same for EXCEPT
and LEFT JOIN
.
Look at particular example about INTERSECT:
DECLARE @t TABLE(t INT NOT NULL)
DECLARE @x TABLE(x INT NOT NULL)
INSERT @t
VALUES (1), (2), (3)
INSERT @x VALUES(1), (1), (1)
SELECT t FROM @t
INTERSECT SELECT x FROM @x
SELECT t FROM @t
INNER JOIN @x ON x = t
INTERSECT
is more like (but not the same) as IN
clause:
SELECT t FROM @t
WHERE t IN (select x FROM @x)
or as EXISTS
SELECT t FROM @t
WHERE EXISTS (select * FROM @x WHERE x = t)
The same examples you may adapt to EXCEPT
clause.
回答4:
In my opinion EXCEPT
and INTERSECT
are used to do the same things as the JOIN command, but it simpler with tables that don't have Primary Keys, for example:
with INTERSECT
:
SELECT FIRSTNAME,
LASTNAME,
ADDRESSLINE1,
CITY,
STATEPROVINCECODE,
POSTALCODE
FROM MANAGER
EXCEPT
SELECT FIRSTNAME,
LASTNAME,
ADDRESSLINE1,
CITY,
STATEPROVINCECODE,
POSTALCODE
FROM CUSTOMER
And to have the same results with JOIN, you must do:
SELECT M.FIRSTNAME,
M.LASTNAME,
M.ADDRESSLINE1,
M.CITY,
M.STATEPROVINCECODE,
M.POSTALCODE
FROM MANAGER M
WHERE NOT EXISTS (SELECT *
FROM CUSTOMER C
WHERE M.FIRSTNAME = C.FIRSTNAME
AND M.LASTNAME = C.LASTNAME
AND M.ADDRESSLINE1 = C.ADDRESSLINE1
AND M.CITY = C.CITY
AND M.POSTALCODE = C.POSTALCODE)
GROUP BY M.FIRSTNAME,M.LASTNAME,M.ADDRESSLINE1,M.CITY,
M.STATEPROVINCECODE,M.POSTALCODE
More information here.