SQL Server: How to Join to first row

2019-01-02 21:21发布

I'll use a concrete, but hypothetical, example.

Each Order normally has only one line item:

Orders:

OrderGUID   OrderNumber
=========   ============
{FFB2...}   STL-7442-1      
{3EC6...}   MPT-9931-8A

LineItems:

LineItemGUID   Order ID Quantity   Description
============   ======== ========   =================================
{098FBE3...}   1        7          prefabulated amulite
{1609B09...}   2        32         spurving bearing

But occasionally there will be an order with two line items:

LineItemID   Order ID    Quantity   Description
==========   ========    ========   =================================
{A58A1...}   6,784,329   5          pentametric fan
{0E9BC...}   6,784,329   5          differential girdlespring 

Normally when showing the orders to the user:

SELECT Orders.OrderNumber, LineItems.Quantity, LineItems.Description
FROM Orders
    INNER JOIN LineItems 
    ON Orders.OrderID = LineItems.OrderID

I want to show the single item on the order. But with this occasional order containing two (or more) items, the orders would appear be duplicated:

OrderNumber   Quantity   Description
===========   ========   ====================
STL-7442-1    7          prefabulated amulite
MPT-9931-8A   32         spurving bearing
KSG-0619-81   5          panametric fan
KSG-0619-81   5          differential girdlespring

What I really want is to have SQL Server just pick one, as it will be good enough:

OrderNumber   Quantity   Description
===========   ========   ====================
STL-7442-1    7          prefabulated amulite
MPT-9931-8A   32         differential girdlespring
KSG-0619-81   5          panametric fan

If I get adventurous, I might show the user, an ellipsis to indicate that there's more than one:

OrderNumber   Quantity   Description
===========   ========   ====================
STL-7442-1    7          prefabulated amulite
MPT-9931-8A   32         differential girdlespring
KSG-0619-81   5          panametric fan, ...

So the question is how to either

  • eliminate "duplicate" rows
  • only join to one of the rows, to avoid duplication

First attempt

My first naive attempt was to only join to the "TOP 1" line items:

SELECT Orders.OrderNumber, LineItems.Quantity, LineItems.Description
FROM Orders
    INNER JOIN (
       SELECT TOP 1 LineItems.Quantity, LineItems.Description
       FROM LineItems
       WHERE LineItems.OrderID = Orders.OrderID) LineItems2
    ON 1=1

But that gives the error:

The column or prefix 'Orders' does not match with a table name or alias name used in the query.

Presumably because the inner select doesn't see the outer table.

11条回答
SAY GOODBYE
2楼-- · 2019-01-02 21:56

,Another aproach using common table expression:

with firstOnly as (
    select Orders.OrderNumber, LineItems.Quantity, LineItems.Description, ROW_NUMBER() over (partiton by Orders.OrderID order by Orders.OrderID) lp
    FROM Orders
        join LineItems on Orders.OrderID = LineItems.OrderID
) select *
  from firstOnly
  where lp = 1

or, in the end maybe you would like to show all rows joined?

comma separated version here:

  select *
  from Orders o
    cross apply (
        select CAST((select l.Description + ','
        from LineItems l
        where l.OrderID = s.OrderID
        for xml path('')) as nvarchar(max)) l
    ) lines
查看更多
不美不萌又怎样
3楼-- · 2019-01-02 21:57

From SQL Server 2012 and onwards I think this will do the trick:

SELECT DISTINCT
    o.OrderNumber ,
    FIRST_VALUE(li.Quantity) OVER ( PARTITION BY o.OrderNumber ORDER BY li.Description ) AS Quantity ,
    FIRST_VALUE(li.Description) OVER ( PARTITION BY o.OrderNumber ORDER BY li.Description ) AS Description
FROM    Orders AS o
    INNER JOIN LineItems AS li ON o.OrderID = li.OrderID
查看更多
姐就是有狂的资本
4楼-- · 2019-01-02 22:00

Tried the cross, works nicely, but takes slightly longer. Adjusted line columns to have max and added group which kept speed and dropped the extra record.

Here's the adjusted query:

SELECT Orders.OrderNumber, max(LineItems.Quantity), max(LineItems.Description)
FROM Orders
    INNER JOIN LineItems 
    ON Orders.OrderID = LineItems.OrderID
Group by Orders.OrderNumber
查看更多
Animai°情兽
5楼-- · 2019-01-02 22:02

Correlated sub queries are sub queries that depend on the outer query. It’s like a for loop in SQL. The sub-query will run once for each row in the outer query:

select * from users join widgets on widgets.id = (
    select id from widgets
    where widgets.user_id = users.id
    order by created_at desc
    limit 1
)
查看更多
ら.Afraid
6楼-- · 2019-01-02 22:03

I solve a similar problem by using LEFT JOIN and GROUP BY Orders.OrderNumber. Is there a reason not to do it this way?

SELECT Orders.OrderNumber, LineItems.Quantity, LineItems.Description
FROM Orders
    LEFT JOIN LineItems 
    ON Orders.OrderID = LineItems.OrderID
GROUP BY Orders.OrderNumber

I'll answer your answer question with an answer in your own question:

Orders             LineItems
+-------------+    +---------+----------+---------------+
| OrderNumber |    | OrderID | Quantity | Description   |
+-------------+    +---------+----------+---------------+
| 22586       |    | 22586   | 17       | Trunion       |
+-------------+    | 22586   | 3        | Girdle Spring |
                   +---------+----------+---------------+

Joining the two together on OrderNumber gives:

OrderNumber  Quantity  Description
-----------  --------  -------------
22586        17        Trunion
22586        3         Girdle Spring

2 row(s) affected

Where we wanted it to to return only one row:

OrderNumber  Quantity  Description
-----------  --------  -------------
22586        17        Trunion

1 row(s) affected

This is why I use GROUP BY Orders.OrderNumber which only returns one row per OrderNumber.

查看更多
小情绪 Triste *
7楼-- · 2019-01-02 22:05

EDIT: nevermind, Quassnoi has a better answer.

For SQL2K, something like this:

SELECT 
  Orders.OrderNumber
, LineItems.Quantity
, LineItems.Description
FROM (  
  SELECT 
    Orders.OrderID
  , Orders.OrderNumber
  , FirstLineItemID = (
      SELECT TOP 1 LineItemID
      FROM LineItems
      WHERE LineItems.OrderID = Orders.OrderID
      ORDER BY LineItemID -- or whatever else
      )
  FROM Orders
  ) Orders
JOIN LineItems 
  ON LineItems.OrderID = Orders.OrderID 
 AND LineItems.LineItemID = Orders.FirstLineItemID
查看更多
登录 后发表回答