Return records distinct on one column but order by

2019-04-15 17:03发布

I am building a Rails 3 app with a pretty standard message model. I would like to return the most recently created message records for each unique conversation_id. It seems like a fairly simple task, but I have not been able to code or find a working solution.
Admittedly, I am not super SQL savvy either (as I have gotten by with mainly Active Record queries thus far). Here is what I'm trying to accomplish.

Sample messages table:

| id | sender_id | receiver_id | conversation_id | subject | body | created_at |
| 1  |     *     |      *      |        1        |    *    |   *  |    16:01   |
| 2  |     *     |      *      |        2        |    *    |   *  |    17:03   |
| 3  |     *     |      *      |        1        |    *    |   *  |    18:04   |
| 4  |     *     |      *      |        3        |    *    |   *  |    19:06   |
| 5  |     *     |      *      |        2        |    *    |   *  |    20:07   |
| 6  |     *     |      *      |        1        |    *    |   *  |    21:08   |
| 7  |     *     |      *      |        4        |    *    |   *  |    22:09   |

The return I would like to get only the most "recent" message record for each conversation_id and ordered by created_at DESC:

| id | sender_id | receiver_id | conversation_id | subject | body | created_at |
| 7  |     *     |      *      |        4        |    *    |   *  |    22:09   |
| 6  |     *     |      *      |        1        |    *    |   *  |    21:08   |
| 5  |     *     |      *      |        2        |    *    |   *  |    20:07   |
| 4  |     *     |      *      |        3        |    *    |   *  |    19:06   |

My original solution in SQLite worked just fine: GROUP BY (conversation_id). However, apparently that solution is unique to SQLite and does not work with Postgres.

Next, I tried: SELECT DISTINCT ON (conversation_id) *. However, this also requires ORDER BY (conversation_id) which I do not want. I want to order by created_at.

1条回答
Evening l夕情丶
2楼-- · 2019-04-15 17:19

DISTINCT ON

If you use DISTINCT ON, you need a subquery for that:

SELECT *
FROM  (
   SELECT DISTINCT ON (conversation_id) *
   FROM   message t
   ORDER  BY conversation_id, created_at DESC
   ) sub
ORDER BY created_at DESC;

The order in the subquery must agree with the columns in the DISTINCT ON clause, so you must wrap it in an outer query to arrive at your desired sort order.

Alternative with row_number()

Similar story, you need a subquery as well:

SELECT id, sender_id, receiver_id, conversation_id, subject, body, created_at
FROM  (
   SELECT *, row_number() OVER (PARTITION BY conversation_id
                                ORDER BY created_at DESC) AS rn
   FROM   message t
   ) sub
WHERE  rn = 1
ORDER  BY created_at DESC;

Also probably slower.

查看更多
登录 后发表回答