Select most occurring value in MySQL

2019-04-09 22:15发布

I'm looking for a way to select the most occurring value, e.g. the person who posted most for each thread;

SELECT MOST_OCCURRING(user_id) FROM thread_posts GROUP BY thread_id

Is there a good way to do this?

3条回答
Animai°情兽
2楼-- · 2019-04-09 22:51

There's numerous examples if you check the questions under the "greatest n per group" tag. But in this case, you don't define how you want to handle ties - what if two or more users have the same count value?

SELECT DISTINCT
       tp.thread_id,
       tp.user_id
  FROM THREAD_POSTS tp
  JOIN (SELECT t.thread_id,
               t.user_id,
               COUNT(t.user_id) AS occurrence,
               CASE
                 WHEN @thread != t.thread_id THEN @rownum := 1
                 ELSE @rownum := @rownum + 1
               END AS rank,
               @thread := t.thread_id
          FROM THREAD_POSTS t
          JOIN (SELECT @rownum := 0, @thread := -1) r
      GROUP BY t.thread_id, t.user_id
      ORDER BY t.thread_id, occurrence DESC) x ON x.thread_id = tp.thread_id
                                              AND x.user_id = tp.user_id
                                              AND x.rank = 1
查看更多
Summer. ? 凉城
3楼-- · 2019-04-09 22:55

If you want a count on a per thread basis, I think you can use a nested query; grouping by thread first and then by user:

SELECT thread_id AS tid,
    (SELECT user_id FROM thread_posts 
        WHERE thread_id = tid 
        GROUP BY user_id
        ORDER BY COUNT(*) DESC
        LIMIT 0,1) AS topUser
FROM thread_posts
GROUP BY thread_id
查看更多
地球回转人心会变
4楼-- · 2019-04-09 23:17

This will tabulate the occurrences of user_id per thread

SELECT thread_id, user_id, COUNT(*) as postings
FROM thread_posts
GROUP BY thread_id, user_id

But you only wish to select the top user for each thread

SELECT thread_id, user_id, postings
FROM (
  SELECT thread_id, user_id, COUNT(*) as postings
  FROM thread_posts
  GROUP BY thread_id, user_id
)
HAVING postings = max(postings)

which is equivalent to

SELECT thread_id, user_id, COUNT(*) as postings
FROM thread_posts
GROUP BY thread_id, user_id
HAVING postings = max(postings)

The HAVING keyword is usually used with an aggregation operation to cherry-pick the aggregated output lines that satisfy the conditions in the HAVING clause.

The HAVING clause is different from the the WHERE clause, wherein the HAVING clause filters resultant output of a query. Whereas, the WHERE clause filters on the input data of a query. Since theHAVING clause filters the resultant output of a query, it must appear after the ORDER BY and GROUP BY clauses.

查看更多
登录 后发表回答