What is the best way to create columns which count the number of occurrences of data in a table? The table needs to be grouped by one column.
I have seen
SELECT
sum(CASE WHEN question1 = 0 THEN 1 ELSE 0 END) AS ZERO,
sum(CASE WHEN question1 = 1 THEN 1 ELSE 0 END) AS ONE,
sum(CASE WHEN question1 = 2 THEN 1 ELSE 0 END) AS TWO,
category
FROM reviews
GROUP BY category
where question1 can have a value of either 0, 1 or 2.
I have also seen a version of that using count(CASE WHEN question1 = 0 THEN 1)
However, this becomes more cumbersome to write as the number of possible values for question1 increases. Is there a convenient way to write this query, possibly optimizing performance?
PS. My database is PostgreSQL
The "best" way (for me) is to write a query like:
Then I use this data to draw a table in application logic.
Other option is to use one JSON column for all grouping results. This will result in something like:
and so on.
The query for this option you can build from the previous one with
json_build_object
andjson_agg
. The best thing for this option - you do not need to know number of possiblequestion1
values ahead of time.In Postgres 9.4 there is new, cleaner aggregate
FILTER
option:Details for the new
FILTER
clause:If you want it short:
Overview over possible variants:
Proper crosstab query
crosstab()
yields the best performance and is shorter for longer lists of options:Detailed explanation: