SQL count if columns

What is the best way to create columns which count the number of occurrences of data in a table? The table needs to be grouped by one column.

I have seen

SELECT
    sum(CASE WHEN question1 = 0 THEN 1 ELSE 0 END) AS ZERO,
    sum(CASE WHEN question1 = 1 THEN 1 ELSE 0 END) AS ONE,
    sum(CASE WHEN question1 = 2 THEN 1 ELSE 0 END) AS TWO,
    category
FROM reviews
    GROUP BY category

where question1 can have a value of either 0, 1 or 2.

I have also seen a version of that using count(CASE WHEN question1 = 0 THEN 1 END)

However, this becomes more cumbersome to write as the number of possible values for question1 increases. Is there a convenient way to write this query, possibly optimizing performance?

PS. My database is PostgreSQL

Tags: sql postgresql count group-by aggregate-filter

2 Answers

太酷不给撩

#2 · 2019-01-19 00:16

The "best" way (for me) is to write a query like:

SELECT
    category,
    question1,
    count(*)
FROM reviews
GROUP BY category, question1

Then I use this data to draw a table in application logic.

Another option is to use one JSON column for all grouping results. This will result in something like:

category1 | {"zero": 1, "one": 3, "two": 5}
category2 | {"one": 7, "two": 4}

and so on.

You can build the query for this option from the previous one with json_build_object and json_agg. The best part of this option is that you do not need to know the number of possible question1 values ahead of time.
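As a rough sketch of what such a query might look like, assuming the reviews table from the question: this version swaps in json_object_agg (available since Postgres 9.4) instead of json_build_object with json_agg, because it folds key/value pairs into a single object per category. The aliases sub, ct and counts are only illustrative, and the keys come out as "0", "1", "2" rather than "zero", "one", "two".

```sql
-- Sketch only: one JSON object of counts per category.
-- Uses json_object_agg, not the json_build_object/json_agg pair named above.
SELECT category,
       json_object_agg(question1, ct) AS counts   -- e.g. {"0": 1, "1": 3, "2": 5}
FROM (
    -- the plain grouped query from above, one row per (category, question1)
    SELECT category, question1, count(*) AS ct
    FROM   reviews
    GROUP  BY category, question1
) AS sub
GROUP  BY category;
```

Values of question1 that never occur in a category are simply absent from that category's object, as in the category2 row above.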


爷的心禁止访问

#3 · 2019-01-19 00:34

Postgres 9.4 and later offer a new, cleaner option, the aggregate FILTER clause:

SELECT category
     , count(*) FILTER (WHERE question1 = 0) AS zero
     , count(*) FILTER (WHERE question1 = 1) AS one
     , count(*) FILTER (WHERE question1 = 2) AS two
FROM   reviews
GROUP  BY 1;

Details for the new FILTER clause:

How can I simplify this game statistics query?

If you want it short:

SELECT category
     , count(question1 = 0 OR NULL) AS zero
     , count(question1 = 1 OR NULL) AS one
     , count(question1 = 2 OR NULL) AS two
FROM   reviews
GROUP  BY 1;

Overview of possible variants:

For absolute performance, is SUM faster or COUNT?

Proper crosstab query

crosstab() yields the best performance and is shorter for longer lists of options:

SELECT * FROM crosstab(
     'SELECT category, question1, count(*)::int AS ct
      FROM   reviews
      GROUP  BY 1, 2
      ORDER  BY 1, 2'
   , 'VALUES (0), (1), (2)'
   ) AS ct (category text, zero int, one int, two int);

Detailed explanation:

PostgreSQL Crosstab Query
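One prerequisite worth spelling out: crosstab() is not built in, it comes from the additional module tablefunc, which has to be installed once per database before the query above will run. A minimal sketch, assuming you have the privileges to create extensions in your database:

```sql
-- Run once per database; afterwards crosstab() is available.
CREATE EXTENSION IF NOT EXISTS tablefunc;
```

Without the extension, the crosstab query fails with a "function crosstab(...) does not exist" error.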
