-->

sqlite3 query by max and filter by second factor

2019-02-28 07:18发布

问题:

I have:

TABLE MESSAGES
 message_id | conversation_id | from_user | timestamp  |  message

I want:

1. SELECT * WHERE from_user <> id 
2. GROUP BY conversation_id
3. SELECT in every group row with MAX(timestamp) **(if there are two same timestamps in a group use second factor as highest message_id)** !!!
4. then results SORT BY timestamp 

to have result:

2|145|xxx|10000|message

6|1743|yyy|999|message

7|14|bbb|899|message

with eliminated

1|145|xxx|10000|message    <- has same timestamp(10000) as message(2) belongs to the same conversation(145) but message id is lowest  

5|1743|me|1200|message <- has message_from == me 

example group with same timestamp

i want from this group row 3 but i get row 2 from query

SELECT max(message_timestamp), message_id, message_text, message_conversationId
FROM MESSAGES
WHERE message_from <> 'me'
GROUP BY message_conversationId
ORDER by message_Timestamp DESC

what is on my mind to do union from message_id & timestamp and then get max???

回答1:

Try below sql to achieve your purpose by group by twice.

select m.*
from
Messages m
-- 3. and then joining to get wanted output columns
inner join
(
    --2. then selecting from this max timestamp - and removing duplicates
    select conversation_id, max(timestamp), message_id
    from
    (
        -- 1. first select max message_id in remainings after the removal of duplicates from mix of cv_id & timestamp
        select conversation_id, timestamp, max(message_id) message_id
        from Messages
        where message <> 'me'
        group by conversation_id, timestamp
    ) max_mid
    group by conversation_id
) max_mid_ts on max_mid_ts.message_id = m.message_id
order by m.message_id;

http://goo.gl/MyZjyU



回答2:

Your query is based on non-standard use of GROUP BY (I think SQLite allows that only for compatibility with MySQL) and I'm not at all sure that it will produce determinate results all the time.

Plus it uses MAX() on concatenated columns. Unless you somehow ensure that the two (concatenated) columns have fixed widths, the results will not be accurate for that reason as well.

I would write the query like this:

SELECT 
    m.message_timestamp, 
    m.message_id, 
    m.message_text,
    m.message_conversationId
FROM 
    ( SELECT message_conversationId         -- for every conversation
      FROM messages as m
      WHERE message_from <> 'me'            
      GROUP BY message_conversationId
    ) AS mc
  JOIN 
    messages AS m                           -- join to the messages
      ON  m.message_id =        
          ( SELECT mi.message_id            -- and find one message id
            FROM messages AS mi
            WHERE mi.message_conversationId      -- for that conversation
                  = mc.message_conversationId
              AND mi.message_from <> 'me'
            ORDER BY mi.message_timestamp DESC,  -- according to the
                     mi.message_id DESC          -- specified order
            LIMIT 1                              -- (this is the one part)
          ) ;


回答3:

ok it was more simple than I thought:

basically to change select from:

max(message_timestamp)

to:

max(message_timestamp || message_id)   
or  max(message_timestamp + message_id) 

so it will search for max on concatenation of timestamp and message_id

ps. after a digging - it's working only if message id is growing with timestamp ( order of insertion is preserved )

edit:

edit2 :

so why it works ?

SELECT max(message_timestamp+message_id), message_timestamp, message_id, message_conversationId, message_from,message_text
FROM MESSAGES
WHERE message_conversationId = 1521521
AND message_from <> 'me'
ORDER by message_Timestamp DESC