This question already has an answer here:
-
Select the 3 most recent records where the values of one column are distinct
8 answers
I have a normal comments table:
| id | comment | date | user | post | status |
I want to fetch my 10 last comments with just 1 comment per user, I mean something like this:
I have this data:
| id | comment | date | user | post |
| 1 | text1 | 2016-04-01| 1 | 1 |
| 2 | text2 | 2016-04-02| 2 | 1 |
| 3 | text3 | 2016-04-03| 1 | 2 |
| 4 | text4 | 2016-04-04| 4 | 3 |
| 5 | text5 | 2016-04-04| 3 | 5 |
| 6 | text6 | 2016-04-05| 2 | 4 |
| 7 | text7 | 2016-04-07| 5 | 3 |
| 8 | text8 | 2016-04-10| 4 | 9 |
| 9 | text9 | 2016-04-11| 3 | 7 |
| 10 | text10 | 2016-04-12| 5 | 8 |
and I want get this :
| 10 | text10 | 2016-04-12| 5 | 8 |
| 9 | text9 | 2016-04-11| 3 | 7 |
| 8 | text8 | 2016-04-10| 4 | 9 |
| 6 | text6 | 2016-04-05| 2 | 4 |
| 3 | text3 | 2016-04-03| 1 | 2 |
I use this sql statement:
SELECT *
FROM `comments`
GROUP BY `user`
ORDER BY MAX(`id`) DESC
This fetches comments with correct order for user but wrong order for comment's date:
| 7 | text7 | 2016-04-07| 5 | 3 |
| 9 | text9 | 2016-04-11| 3 | 7 |
| 4 | text4 | 2016-04-04| 4 | 3 |
| 6 | text6 | 2016-04-05| 2 | 4 |
| 1 | text1 | 2016-04-01| 1 | 1 |
Why it doesn't work with GROUP BY
SELECT *
cannot be used with GROUP BY
; it is invalid SQL. GROUP BY
doesn't select table rows. It creates groups of rows using the provided expressions then, from each group, it generates a new record and computes each column of this new record using the values involved in the expression.
The columns that appear in the SELECT
clause must satisfy one of the following rules:
- also appear in the
GROUP BY
clause;
- are used with
GROUP BY
aggregate functions;
- are functionally-dependent on the columns that appear in the
GROUP BY
clause.
While *
is a shortcut for all the column names of the table(s) used by the query, for your query only the user
column satisfy one of the requirements above.
Before version 5.7.5 MySQL didn't implement the third rule above. It used to accept queries that contain in the SELECT
clause columns that do not follow any of the GROUP BY
requirements. The value returned by the query for such columns was indeterminate.
Since version 5.7.5, MySQL rejects the GROUP BY
queries that do satisfy the requirements.
The solution
Either way, the solution for your problem doesn't involve GROUP BY
. It can be easily accomplished using a LEFT JOIN
with the correct conditions:
SELECT lc.*
FROM comments lc # 'lc' from 'last comment'
LEFT JOIN comments nc # 'nc' from 'newer comment'
ON lc.user = nc.user # both comments belong to the same user
AND lc.id < nc.id # 'nc' is newer than 'lc'
WHERE nc.id IS NULL # there is no 'newer comment'
ORDER BY lc.id DESC
LIMIT 10
How it works
It joins the table comments
, aliased as lc
("lc" from the "last comment" of a user) against itself, aliased as nc
("nc" from "newer comment"). The join clause matches each entry of lc
with all entries of nc
that belong to the same user (lc.user = nc.user
) and are newer (lc.id < nc.id
; I assumed the IDs are assigned sequentially and newer comments have larger values for id
).
The use of LEFT JOIN
ensures that every row of lc
appears in the result of the join, even when no matching row is found in nc
(because there is no newer comment of the same user). In this case, NULL
is used instead of the fields of nc
. The WHERE
clause keeps in the final result set only the rows that have NULL
in nc.id
; this means in the lc
part they contain the most recent comment of each user.
The SELECT
clause contains all the fields of lc
(those of nc
are all NULL
, anyway). The ORDER BY
clause can be used to sort the result set. ORDER BY lc.id DESC
puts the most recent comments first and the LIMIT
clause keeps the result set to a decent size.
You can try this:
SELECT u.*
FROM (
SELECT id,comment,MAX(date) as date,`user`,post,status
FROM comments
GROUP BY `user`) AS t
JOIN comments AS u USING (id)