I have a problem when executing this code:
SELECT * FROM tblpm n
WHERE date_updated=(SELECT MAX(date_updated)
FROM tblpm GROUP BY control_number
HAVING control_number=n.control_number)
Basically, I want to return the row with the most recent date for each control number. The query above returns the correct output, but it takes 37 seconds before the output is shown.
Is there any other SQL clause or command that executes faster than the query above?
Thanks in advance.
Putting the subquery in the WHERE clause and restricting it to n.control_number means it runs the subquery many times. This is called a correlated subquery, and it's often a performance killer.
It's better to run the subquery once, in the FROM clause, to get the max date per control number.
SELECT n.*
FROM tblpm n
INNER JOIN (
SELECT control_number, MAX(date_updated) AS date_updated
FROM tblpm GROUP BY control_number
) AS max USING (control_number, date_updated);
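To see what that join returns, here is a minimal sketch against a throwaway table. The table name tblpm_demo, its column types, the id column, and the alias latest are all assumptions made for illustration; the real tblpm will differ.

```sql
-- Hypothetical demo table; the real tblpm has its own columns and types.
CREATE TABLE tblpm_demo (
    id INT PRIMARY KEY,
    control_number INT NOT NULL,
    date_updated DATE NOT NULL
);

INSERT INTO tblpm_demo (id, control_number, date_updated) VALUES
    (1, 100, '2020-01-01'),
    (2, 100, '2020-02-01'),
    (3, 200, '2020-01-15');

-- Same derived-table join, pointed at the demo table.
-- The subquery runs once and yields one (control_number, date_updated) pair per group.
SELECT n.*
FROM tblpm_demo n
INNER JOIN (
    SELECT control_number, MAX(date_updated) AS date_updated
    FROM tblpm_demo
    GROUP BY control_number
) AS latest USING (control_number, date_updated)
ORDER BY n.control_number;
-- Rows returned: id 2 (100, 2020-02-01) and id 3 (200, 2020-01-15)
```

Note that if two rows of the same control number share the latest date, this join returns both of them.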
There's no need to group in that subquery... a where clause would suffice:
SELECT * FROM tblpm n
WHERE date_updated=(SELECT MAX(date_updated)
FROM tblpm WHERE control_number=n.control_number)
Also, do you have an index on the 'date_updated' column? That would certainly help.
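As a rough sketch of what such an index could look like: since the lookup is per control number, a composite index covering both columns is likely to help more than one on date_updated alone. The index name idx_control_date is made up here, and whether the optimizer actually uses it depends on your data and MySQL version, so check with EXPLAIN.

```sql
-- idx_control_date is a hypothetical name; pick whatever fits your naming scheme.
-- Putting control_number first lets MySQL find MAX(date_updated) per control number from the index.
CREATE INDEX idx_control_date ON tblpm (control_number, date_updated);

-- Inspect the plan to confirm the index is picked up.
EXPLAIN
SELECT * FROM tblpm n
WHERE date_updated = (SELECT MAX(date_updated)
                      FROM tblpm
                      WHERE control_number = n.control_number);
```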
Another way that doesn't use group by:
SELECT * FROM tblpm n
WHERE date_updated=(SELECT date_updated FROM tblpm
WHERE control_number=n.control_number
ORDER BY date_updated DESC LIMIT 1)
Fast and easy with HAVING:
SELECT * FROM tblpm
GROUP BY control_number
HAVING date_updated=MAX(date_updated);
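In the context of HAVING, MAX finds the max of each group. Be aware, though, that HAVING filters whole groups after GROUP BY has already collapsed each one to a single row, and the bare date_updated in the comparison comes from whichever row MySQL happened to pick for that group. A group passes date_updated=MAX(date_updated) only if that row is the latest one, so control numbers can be missing from the result. The query also depends on MySQL's non-standard handling of non-aggregated columns: with ONLY_FULL_GROUP_BY enabled it is rejected with an error. If there's a tie for latest within a group, GROUP BY means that only one row will appear in the returned table.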