Fetch the row which has the Max value for a column

Table:

UserId, Value, Date.

I want to get the UserId, Value for the max(Date) for each UserId. That is, the Value for each UserId that has the latest date. Is there a way to do this simply in SQL? (Preferably Oracle)

Update: Apologies for any ambiguity: I need to get ALL the UserIds. But for each UserId, only that row where that user has the latest date.

Tags: sql oracle greatest-n-per-group

30 answers

牵手、夕阳

#2 · 2018-12-30 23:33

This should be as simple as:

SELECT UserId, Value
FROM Users u
WHERE Date = (SELECT MAX(Date) FROM Users WHERE UserID = u.UserID)
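
For anyone who wants to try the correlated subquery without an Oracle instance, here is a minimal sketch using Python's built-in sqlite3 module. SQLite is only a stand-in for Oracle here, and the sample rows and the helper name latest_per_user are made up for illustration. The "Date" column is quoted because the word is reserved in several dialects.

```python
import sqlite3

def latest_per_user(rows):
    """Return (UserId, Value) for each user's row(s) with the latest Date."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical schema matching the question: UserId, Value, Date.
    conn.execute('CREATE TABLE Users (UserId INTEGER, Value TEXT, "Date" TEXT)')
    conn.executemany("INSERT INTO Users VALUES (?, ?, ?)", rows)
    result = conn.execute(
        """
        SELECT UserId, Value
        FROM Users u
        WHERE "Date" = (SELECT MAX("Date") FROM Users WHERE UserId = u.UserId)
        ORDER BY UserId, Value
        """
    ).fetchall()
    conn.close()
    return result

# Made-up sample data; ISO date strings sort correctly as text.
sample = [
    (1, "a", "2018-01-01"),
    (1, "b", "2018-03-01"),
    (2, "c", "2018-02-01"),
    (2, "d", "2018-01-15"),
]
print(latest_per_user(sample))  # [(1, 'b'), (2, 'c')]
```

Note that if a user has two rows sharing the same latest date, this query returns both of them.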


路过你的时光

#3 · 2018-12-30 23:33

SELECT UserId, MAX(Date) OVER (PARTITION BY UserId) AS value FROM users;


无与为乐者.

#4 · 2018-12-30 23:35

I don't have Oracle to test it, but the most efficient solution is to use analytic queries. It should look something like this:

SELECT DISTINCT
    UserId
  , MaxValue
FROM (
    SELECT UserId
      , FIRST_VALUE (Value) OVER (
          PARTITION BY UserId
          ORDER BY Date DESC
        ) MaxValue
    FROM SomeTable
  )

I suspect that you can get rid of the outer query and put distinct on the inner, but I'm not sure. In the meantime I know this one works.

If you want to learn about analytic queries, I'd suggest reading http://www.orafaq.com/node/55 and ~~http://www.akadia.com/services/ora_analytic_functions.html~~. Here is the short summary.

Under the hood, analytic queries sort the whole dataset and then process it sequentially. As they process it, they partition the dataset according to certain criteria, and for each row they look at some window (by default, from the first row in the partition to the current row, which is also the most efficient choice) and can compute values using a number of analytic functions (the list of which is very similar to the aggregate functions).

In this case, here is what the inner query does. The whole dataset is sorted by UserId, then by Date DESC, and processed in one pass. For each row you return the UserId and the first Value seen for that UserId (since dates are sorted DESC, that's the Value on the max date). This gives you your answer with duplicated rows. Then the outer DISTINCT squashes duplicates.
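
The pass described above can be tried out with a small sketch in Python's built-in sqlite3 module. This assumes a SQLite build of 3.25 or newer (the first version with window functions) and uses FIRST_VALUE, the standard analytic function for "first value in the window". SQLite is a stand-in for Oracle, and the sample rows and the helper name value_at_max_date are invented for illustration.

```python
import sqlite3

def value_at_max_date(rows):
    """Return (UserId, Value-on-latest-Date) pairs via an analytic query."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical table mirroring the answer's SomeTable.
    conn.execute('CREATE TABLE SomeTable (UserId INTEGER, Value TEXT, "Date" TEXT)')
    conn.executemany("INSERT INTO SomeTable VALUES (?, ?, ?)", rows)
    result = conn.execute(
        """
        SELECT DISTINCT UserId, MaxValue
        FROM (
            SELECT UserId,
                   FIRST_VALUE(Value) OVER (
                       PARTITION BY UserId
                       ORDER BY "Date" DESC
                   ) AS MaxValue
            FROM SomeTable
        )
        ORDER BY UserId
        """
    ).fetchall()
    conn.close()
    return result

# Made-up sample data; ISO date strings sort correctly as text.
sample = [
    (1, "a", "2018-01-01"),
    (1, "b", "2018-03-01"),
    (2, "c", "2018-02-01"),
    (2, "d", "2018-01-15"),
]
print(value_at_max_date(sample))  # [(1, 'b'), (2, 'c')]
```

The inner query yields one row per input row (so user 1 appears twice with 'b'), and the outer DISTINCT collapses those repeats.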

This is not a particularly spectacular example of analytic queries. For a much bigger win, consider taking a table of financial receipts and calculating, for each user and receipt, a running total of what they paid. Analytic queries solve that efficiently; other solutions are less efficient, which is why they are part of the 2003 SQL standard. (Postgres lacked them when this was written; it has supported window functions since version 8.4.)
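
As a rough illustration of the running-total case, here is a sketch using Python's built-in sqlite3 module (SQLite 3.25 or newer is assumed for window function support). The Receipts table, its columns, the amounts and the helper name running_totals are all hypothetical; ordering by ReceiptId is an assumption standing in for whatever defines receipt order in a real schema.

```python
import sqlite3

def running_totals(receipts):
    """Return (UserId, ReceiptId, Amount, RunningTotal) per receipt."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical receipts table; amounts are whole numbers to keep sums exact.
    conn.execute(
        "CREATE TABLE Receipts (UserId INTEGER, ReceiptId INTEGER, Amount INTEGER)"
    )
    conn.executemany("INSERT INTO Receipts VALUES (?, ?, ?)", receipts)
    result = conn.execute(
        """
        SELECT UserId, ReceiptId, Amount,
               SUM(Amount) OVER (
                   PARTITION BY UserId
                   ORDER BY ReceiptId
               ) AS RunningTotal
        FROM Receipts
        ORDER BY UserId, ReceiptId
        """
    ).fetchall()
    conn.close()
    return result

# Made-up sample data.
receipts = [(1, 1, 10), (1, 2, 5), (2, 3, 7), (1, 4, 20), (2, 5, 3)]
for row in running_totals(receipts):
    print(row)
```

The default window (start of partition up to the current row) is what turns SUM into a running total here.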


若你有天会懂

#5 · 2018-12-30 23:35

The answer here is Oracle-only. Here's a slightly more sophisticated answer in plain SQL:

Who has the best overall homework result (maximum sum of homework points)?

SELECT FIRST, LAST, SUM(POINTS) AS TOTAL
FROM STUDENTS S, RESULTS R
WHERE S.SID = R.SID AND R.CAT = 'H'
GROUP BY S.SID, FIRST, LAST
HAVING SUM(POINTS) >= ALL (SELECT SUM(POINTS)
                           FROM RESULTS
                           WHERE CAT = 'H'
                           GROUP BY SID)

And a more difficult example, which needs some explanation that I don't have time for at the moment:

Give the book (ISBN and title) that is most popular in 2008, i.e., which is borrowed most often in 2008.

SELECT X.ISBN, X.title, X.loans
FROM (SELECT Book.ISBN, Book.title, COUNT(Loan.dateTimeOut) AS loans
      FROM CatalogEntry Book
      LEFT JOIN BookOnShelf Copy
        ON Book.bookId = Copy.bookId
      LEFT JOIN (SELECT * FROM Loan WHERE YEAR(Loan.dateTimeOut) = 2008) Loan
        ON Copy.copyId = Loan.copyId
      GROUP BY Book.ISBN, Book.title) X
WHERE X.loans >= ALL (SELECT COUNT(Loan.dateTimeOut) AS loans
                      FROM CatalogEntry Book
                      LEFT JOIN BookOnShelf Copy
                        ON Book.bookId = Copy.bookId
                      LEFT JOIN (SELECT * FROM Loan WHERE YEAR(Loan.dateTimeOut) = 2008) Loan
                        ON Copy.copyId = Loan.copyId
                      GROUP BY Book.ISBN, Book.title);

Hope this helps (anyone).. :)

Regards, Guus


有味是清欢

#6 · 2018-12-30 23:39

I see many people use subqueries or else vendor-specific features to do this, but I often do this kind of query without subqueries in the following way. It uses plain, standard SQL so it should work in any brand of RDBMS.

SELECT t1.*
FROM mytable t1
  LEFT OUTER JOIN mytable t2
    ON (t1.UserId = t2.UserId AND t1."Date" < t2."Date")
WHERE t2.UserId IS NULL;

In other words: fetch the row from t1 where no other row exists with the same UserId and a greater Date.

(I put the identifier "Date" in delimiters because it's an SQL reserved word.)

If two rows for the same UserId share the greatest "Date", both of them are returned. Tables usually have an auto-increment (sequence) key, e.g. id, which can serve as a tie-breaker. To avoid such duplicates, the following can be used:

SELECT t1.*
FROM mytable t1
  LEFT OUTER JOIN mytable t2
    ON t1.UserId = t2.UserId AND ((t1."Date" < t2."Date") 
         OR (t1."Date" = t2."Date" AND t1.id < t2.id))
WHERE t2.UserId IS NULL;
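
To see the difference between the two join conditions, here is a small sketch with Python's built-in sqlite3 module (SQLite standing in for whichever RDBMS you use). The sample rows, the id values and the helper name latest_ids are made up; user 1 deliberately has two rows on the same latest date.

```python
import sqlite3

# Join condition on "Date" only: ties on the latest date all survive.
BASIC_JOIN = """
SELECT t1.id
FROM mytable t1
  LEFT OUTER JOIN mytable t2
    ON (t1.UserId = t2.UserId AND t1."Date" < t2."Date")
WHERE t2.UserId IS NULL
ORDER BY t1.id
"""

# Join condition with id as a tie-breaker: one row per UserId.
TIE_BREAK_JOIN = """
SELECT t1.id
FROM mytable t1
  LEFT OUTER JOIN mytable t2
    ON t1.UserId = t2.UserId AND ((t1."Date" < t2."Date")
         OR (t1."Date" = t2."Date" AND t1.id < t2.id))
WHERE t2.UserId IS NULL
ORDER BY t1.id
"""

def latest_ids(sql, rows):
    """Load rows into a fresh in-memory table and return the selected ids."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        'CREATE TABLE mytable (id INTEGER, UserId INTEGER, Value TEXT, "Date" TEXT)'
    )
    conn.executemany("INSERT INTO mytable VALUES (?, ?, ?, ?)", rows)
    ids = [row[0] for row in conn.execute(sql)]
    conn.close()
    return ids

# Made-up sample data: ids 2 and 3 tie on user 1's latest date.
rows = [
    (1, 1, "a", "2018-01-01"),
    (2, 1, "b", "2018-03-01"),
    (3, 1, "c", "2018-03-01"),
    (4, 2, "d", "2018-02-01"),
]
print(latest_ids(BASIC_JOIN, rows))      # [2, 3, 4]
print(latest_ids(TIE_BREAK_JOIN, rows))  # [3, 4]
```

With the tie-breaker, the row with the higher id wins among rows sharing the latest date.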

Re comment from @Farhan:

Here's a more detailed explanation:

An outer join attempts to join t1 with t2. By default, all results of t1 are returned, and if there is a match in t2, it is also returned. If there is no match in t2 for a given row of t1, then the query still returns the row of t1, and uses NULL as a placeholder for all of t2's columns. That's just how outer joins work in general.

The trick in this query is to design the join's matching condition such that t2 must match the same userid, and a greater date. The idea being if a row exists in t2 that has a greater date, then the row in t1 it's compared against can't be the greatest date for that userid. But if there is no match -- i.e. if no row exists in t2 with a greater date than the row in t1 -- we know that the row in t1 was the row with the greatest date for the given userid.

In those cases (when there's no match), the columns of t2 will be NULL -- even the columns specified in the join condition. So that's why we use WHERE t2.UserId IS NULL, because we're searching for the cases where no row was found with a greater date for the given userid.
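
To make the NULL placeholder visible, this sketch runs the outer join without the final WHERE filter, using Python's built-in sqlite3 module. The sample rows and the helper name outer_join_rows are invented for illustration; SQLite reports NULL as Python's None.

```python
import sqlite3

def outer_join_rows(rows):
    """Return (UserId, t1 date, t2 date) for the unfiltered outer join."""
    conn = sqlite3.connect(":memory:")
    conn.execute('CREATE TABLE mytable (UserId INTEGER, Value TEXT, "Date" TEXT)')
    conn.executemany("INSERT INTO mytable VALUES (?, ?, ?)", rows)
    result = conn.execute(
        """
        SELECT t1.UserId, t1."Date", t2."Date"
        FROM mytable t1
          LEFT OUTER JOIN mytable t2
            ON (t1.UserId = t2.UserId AND t1."Date" < t2."Date")
        ORDER BY t1.UserId, t1."Date"
        """
    ).fetchall()
    conn.close()
    return result

# Made-up sample data.
rows = [
    (1, "a", "2018-01-01"),
    (1, "b", "2018-03-01"),
    (2, "c", "2018-02-01"),
]
joined = outer_join_rows(rows)
for row in joined:
    print(row)

# Rows with no later match (t2 side is None) are each user's latest rows,
# which is what WHERE t2.UserId IS NULL keeps in the full query.
latest = [(user_id, date) for user_id, date, later in joined if later is None]
print(latest)
```

The row for user 1 on 2018-01-01 finds a later match and is dropped by the filter, while the other two rows carry None on the t2 side and are kept.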


回忆，回不去的记忆

#7 · 2018-12-30 23:39

SELECT userid, MAX(value) KEEP (DENSE_RANK FIRST ORDER BY date DESC)
  FROM table
  GROUP BY userid
