GROUP_CONCAT change GROUP BY order

2019-02-22 23:56发布

问题:

I have a VIEW (lots of joins) that outputs data ordered by a date ASC. Works as expected.

OUTPUT similar to:

ID date         tag1   other_data
1  25-03-2011   blue   fff   <=
1  26-03-2011   red    ggg
1  27-03-2011   pink   yyy
2  25-03-2011   red    yyy   <=
2  26-03-2011   orange rrr

If I apply a GROUP BY ID. For the other columns MySQL outputs the first found row of each ID. I read this somewhere in te docs.

SELECT * FROM `myVIEW`  
GROUP BY `ID`  
  ID date         tag1  other_data  
  1  25-03-2011   blue   fff   <=
  2  25-03-2011   red    yyy   <=

Now lets add a GROUP_CONCAT(tags1)

SELECT *,CONCAT_GROUP(`tag1`) AS `tags`  
FROM `myVIEW`  
GROUP BY `ID`

Since I apply the CONCAT_GROUP the results get odd. I was expecting:

ID date         tag1   other_data   tags
1  25-03-2011   blue   fff          blue,red,pink
2  25-03-2011   red    yyy          red,orange

The query is returning, for example:

ID date         tag1   other_data   tags
1  26-03-2011   red    ggg          blue,red,pink
2  25-03-2011   red    yyy          red,orange

Looks like GROUP_CONCAT no longer preserves the VIEW order. Is this normal?

回答1:

Looks like GROUP_CONCAT no longer preserves the VIEW order. Is this normal?

Yes, it is normal.

You should not rely, ever, on the order in which ungrouped and unaggregated fields are returned.

GROUP_CONCAT has its own ORDER BY clause which the optimizer takes into account and can change the order in which is parses the records.

To return the first record along with GROUP_CONCAT, use this:

SELECT  m.*, gc
FROM    (
        SELECT  id, MIN(date) AS mindate, GROUP_CONCAT(tags) AS gc
        FROM    myview
        GROUP BY
                id
        ) md
JOIN    m.*
ON      m.id = md.id
        AND m.date = md.mindate


回答2:

How about ordering your GROUP_CONCAT?

SELECT value1, GROUP_CONCAT(value1 ORDER BY date DESC)   
FROM table1  
GROUP BY value1;

That's the syntax you need a presume.



回答3:

That is because mysql does not guarantee what exact rows will be returned for the fields that are not used in aggregation functions or wasn't used to group by.

And to be clear "mature" rdbms (such as postgre, sql server, oracle) do not allow to specify * in GROUP BY (or any fields without aggregation or that was not specified in GROUP BY) - and it is great "limitation".