I'll use a concrete, but hypothetical, example.
Each Order normally has only one line item:
Orders:
OrderGUID OrderNumber
========= ============
{FFB2...} STL-7442-1
{3EC6...} MPT-9931-8A
LineItems:
LineItemGUID Order ID Quantity Description
============ ======== ======== =================================
{098FBE3...} 1 7 prefabulated amulite
{1609B09...} 2 32 spurving bearing
But occasionally there will be an order with two line items:
LineItemID Order ID Quantity Description
========== ======== ======== =================================
{A58A1...} 6,784,329 5 pentametric fan
{0E9BC...} 6,784,329 5 differential girdlespring
Normally when showing the orders to the user:
SELECT Orders.OrderNumber, LineItems.Quantity, LineItems.Description
FROM Orders
INNER JOIN LineItems
ON Orders.OrderID = LineItems.OrderID
I want to show the single item on the order. But with this occasional order containing two (or more) items, the orders would appear be duplicated:
OrderNumber Quantity Description
=========== ======== ====================
STL-7442-1 7 prefabulated amulite
MPT-9931-8A 32 spurving bearing
KSG-0619-81 5 panametric fan
KSG-0619-81 5 differential girdlespring
What I really want is to have SQL Server just pick one, as it will be good enough:
OrderNumber Quantity Description
=========== ======== ====================
STL-7442-1 7 prefabulated amulite
MPT-9931-8A 32 differential girdlespring
KSG-0619-81 5 panametric fan
If I get adventurous, I might show the user, an ellipsis to indicate that there's more than one:
OrderNumber Quantity Description
=========== ======== ====================
STL-7442-1 7 prefabulated amulite
MPT-9931-8A 32 differential girdlespring
KSG-0619-81 5 panametric fan, ...
So the question is how to either
- eliminate "duplicate" rows
- only join to one of the rows, to avoid duplication
First attempt
My first naive attempt was to only join to the "TOP 1" line items:
SELECT Orders.OrderNumber, LineItems.Quantity, LineItems.Description
FROM Orders
INNER JOIN (
SELECT TOP 1 LineItems.Quantity, LineItems.Description
FROM LineItems
WHERE LineItems.OrderID = Orders.OrderID) LineItems2
ON 1=1
But that gives the error:
The column or prefix 'Orders' does not match with a table name or alias name used in the query.
Presumably because the inner select doesn't see the outer table.
,Another aproach using common table expression:
or, in the end maybe you would like to show all rows joined?
comma separated version here:
From SQL Server 2012 and onwards I think this will do the trick:
Tried the cross, works nicely, but takes slightly longer. Adjusted line columns to have max and added group which kept speed and dropped the extra record.
Here's the adjusted query:
Correlated sub queries are sub queries that depend on the outer query. It’s like a for loop in SQL. The sub-query will run once for each row in the outer query:
I solve a similar problem by using LEFT JOIN and GROUP BY Orders.OrderNumber. Is there a reason not to do it this way?
I'll answer your answer question with an answer in your own question:
Joining the two together on OrderNumber gives:
Where we wanted it to to return only one row:
This is why I use GROUP BY Orders.OrderNumber which only returns one row per OrderNumber.
EDIT: nevermind, Quassnoi has a better answer.
For SQL2K, something like this: