Join table on itself - performance

I would like some help with the following join. I have one table (with about 20 million rows) that consists of:

MemberId (Primary Key) | Id (Primary Key) | TransactionDate | Balance

I would like to get the latest Balance for every customer in one query. I know I could do something like the following (written from memory), but it is terribly slow.

SELECT * 
FROM money 
WHERE money.Id = (SELECT MAX(Id) 
                  FROM money AS m 
                  WHERE m.MemberId = money.MemberId)

Are there any other (faster/smarter) options?

Tags: mysql self-join

3 Answers

冷血范

#2 · 2020-06-28 12:03

In all the optimization tutorials and screencasts I've sat through, joins are favoured over subqueries. A correlated subquery like yours may be executed once for every row of the outer query, whereas the derived table in a join is computed only once.

SELECT * 
FROM money m
INNER JOIN (
    SELECT memberId, MAX(id) AS maxid
    FROM money
    GROUP BY memberId
) mmax ON mmax.maxid = m.id AND mmax.memberId = m.memberId
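To check that the join returns the same rows as the correlated subquery from the question, here is a minimal sketch. It uses Python's built-in sqlite3 as a stand-in for MySQL, and the four sample rows are made up for illustration, so it says nothing about timing on 20 million rows; it only confirms the two queries agree.

```python
import sqlite3

# In-memory SQLite stands in for MySQL; the rows below are invented sample data.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE money (
        MemberId INTEGER,
        Id INTEGER,
        TransactionDate TEXT,
        Balance REAL,
        PRIMARY KEY (MemberId, Id)
    )
""")
conn.executemany(
    "INSERT INTO money VALUES (?, ?, ?, ?)",
    [
        (1, 1, "2020-01-01", 100.0),
        (1, 2, "2020-02-01", 150.0),
        (2, 3, "2020-01-15", 20.0),
        (2, 4, "2020-03-01", 5.0),
    ],
)

# The query from the question: the subquery depends on the outer row.
correlated = """
    SELECT MemberId, Id, Balance
    FROM money
    WHERE Id = (SELECT MAX(Id) FROM money AS m WHERE m.MemberId = money.MemberId)
    ORDER BY MemberId
"""

# The join from this answer: the grouped derived table is built once.
derived_join = """
    SELECT m.MemberId, m.Id, m.Balance
    FROM money m
    INNER JOIN (
        SELECT MemberId, MAX(Id) AS maxid
        FROM money
        GROUP BY MemberId
    ) mmax ON mmax.maxid = m.Id AND mmax.MemberId = m.MemberId
    ORDER BY m.MemberId
"""

rows_correlated = conn.execute(correlated).fetchall()
rows_join = conn.execute(derived_join).fetchall()
print(rows_join)  # [(1, 2, 150.0), (2, 4, 5.0)]
```

On the real table, running EXPLAIN on both statements in MySQL is the way to see which plan the optimizer actually picks.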


手持菜刀，她持情操

#3 · 2020-06-28 12:16

Another option is to look for NULL values in a left join:

SELECT m1.*
  FROM money m1
  LEFT JOIN money m2 ON m2.memberId = m1.memberId AND m2.id > m1.id
 WHERE m2.memberId IS NULL

But of course Umbrella's answer is better.
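A possible reason the left join loses out on big tables: before the IS NULL filter is applied, every row is paired with all later rows of the same member. The sketch below (Python's sqlite3 standing in for MySQL, with invented sample rows) runs the query and counts those intermediate pairs.

```python
import sqlite3

# In-memory SQLite stands in for MySQL; the rows below are invented sample data.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE money (
        MemberId INTEGER,
        Id INTEGER,
        TransactionDate TEXT,
        Balance REAL,
        PRIMARY KEY (MemberId, Id)
    )
""")
conn.executemany(
    "INSERT INTO money VALUES (?, ?, ?, ?)",
    [
        (1, 1, "2020-01-01", 100.0),
        (1, 2, "2020-02-01", 150.0),
        (1, 3, "2020-03-01", 175.0),
        (2, 4, "2020-01-15", 20.0),
        (2, 5, "2020-03-01", 5.0),
    ],
)

# Keep only rows for which no later row of the same member exists.
latest = conn.execute("""
    SELECT m1.MemberId, m1.Id, m1.Balance
    FROM money m1
    LEFT JOIN money m2 ON m2.MemberId = m1.MemberId AND m2.Id > m1.Id
    WHERE m2.MemberId IS NULL
    ORDER BY m1.MemberId
""").fetchall()

# Same join without the filter: how many rows it produces before filtering.
pairs = conn.execute("""
    SELECT COUNT(*)
    FROM money m1
    LEFT JOIN money m2 ON m2.MemberId = m1.MemberId AND m2.Id > m1.Id
""").fetchone()[0]

print(latest)  # [(1, 3, 175.0), (2, 5, 5.0)]
print(pairs)   # 6 joined rows for 5 table rows
```

The number of pairs grows roughly with the square of the transactions per member, so members with long histories make this form expensive.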


ら.Afraid

#4 · 2020-06-28 12:17

JOINing is not the best way to go about this. Consider using a GROUP BY clause to sift out the last transaction for each member, like this:

~~SELECT MemberId, MAX(Id), TransactionDate, Balance FROM money GROUP BY MemberId~~

UPDATE

As PKK pointed out, Balance (and TransactionDate) would be picked from an arbitrary row of each group, not necessarily the one with MAX(Id). It looks like you'll have to perform some sort of join after all. Consider this option:

SELECT MemberId, Id, TransactionDate, Balance FROM money WHERE Id IN (
    SELECT MAX(Id) FROM money GROUP BY MemberId
)
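One caveat, sketched below: the IN version matches on Id alone, so it relies on Id being unique across the whole table. The question lists MemberId and Id together as the primary key and does not say whether Id is unique on its own, so treat this as an assumption to verify. The sample rows are invented, and Python's sqlite3 stands in for MySQL.

```python
import sqlite3

# In-memory SQLite stands in for MySQL; the rows below are invented sample data.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE money (
        MemberId INTEGER,
        Id INTEGER,
        TransactionDate TEXT,
        Balance REAL,
        PRIMARY KEY (MemberId, Id)
    )
""")
# Hypothetical case: Id restarts for each member, so Id = 1 appears twice.
conn.executemany(
    "INSERT INTO money VALUES (?, ?, ?, ?)",
    [
        (1, 1, "2020-01-01", 100.0),
        (1, 2, "2020-02-01", 150.0),
        (2, 1, "2020-01-15", 20.0),
    ],
)

# IN on Id alone: member 2's max Id (1) also matches member 1's older row.
in_rows = conn.execute("""
    SELECT MemberId, Id, Balance FROM money WHERE Id IN (
        SELECT MAX(Id) FROM money GROUP BY MemberId
    )
    ORDER BY MemberId, Id
""").fetchall()

# Joining on both MemberId and Id keeps exactly one row per member.
join_rows = conn.execute("""
    SELECT m.MemberId, m.Id, m.Balance
    FROM money m
    INNER JOIN (
        SELECT MemberId, MAX(Id) AS maxid FROM money GROUP BY MemberId
    ) mmax ON mmax.maxid = m.Id AND mmax.MemberId = m.MemberId
    ORDER BY m.MemberId
""").fetchall()

print(in_rows)    # [(1, 1, 100.0), (1, 2, 150.0), (2, 1, 20.0)]
print(join_rows)  # [(1, 2, 150.0), (2, 1, 20.0)]
```

If Id is an auto-increment column that is unique table-wide, both forms return the same rows and the difference does not arise.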
