If I have a jobs table like:
|id|created_at |status |
----------------------------
|1 |01-01-2015 |error |
|2 |01-01-2015 |complete |
|3 |01-01-2015 |error |
|4 |01-02-2015 |complete |
|5 |01-02-2015 |complete |
|6 |01-03-2015 |error |
|7 |01-03-2015 |on hold |
|8 |01-03-2015 |complete |
I want a query that will group them by date and count the occurrence of each status and the total status for that date.
SELECT created_at status, count(status), created_at
FROM jobs
GROUP BY created_at, status;
Which gives me
|created_at |status |count|
-------------------------------
|01-01-2015 |error |2
|01-01-2015 |complete |1
|01-02-2015 |complete |2
|01-03-2015 |error |1
|01-03-2015 |on hold |1
|01-03-2015 |complete |1
I would like to now condense this down to a single row per created_at
unique date with some sort of multi column layout for each status
. One constraint is that status
is any one of 5 possible words but each date might not have one of every status. Also I would like a total of all statuses for each day. So desired results would look like:
|date |total |errors|completed|on_hold|
----------------------------------------------
|01-01-2015 |3 |2 |1 |null
|01-02-2015 |2 |null |2 |null
|01-03-2015 |3 |1 |1 |1
the columns could be built dynamically from something like
SELECT DISTINCT status FROM jobs;
with a null result for any day that doesn't contain any of that type of status. I am no SQL expert but am trying to do this in a DB view so that I don't have to bog down doing multiple queries in Rails.
I am using Postresql but would like to try to keep it straight SQL. I have tried to understand aggregate function enough to use some other tools but not succeeding.
The following should work in any RDBMS:
The query uses conditional aggregation so as to pivot grouped data. It assumes that
status
values are known before-hand. If you have additional cases ofstatus
values, just add the correspondingsum(case ...
expression.Demo here
An actual crosstab query would look like this:
Should perform very well.
Basics:
The above does not yet include the total per date.
Postgres 9.5 introduces the ROLLUP clause, which is perfect for the case:
Up to Postgres 9.4, use this query instead:
Related:
If you want to stick to a simple query, this is a bit shorter:
count(status)
for the total per date is error-prone, because it would not count rows with NULL values instatus
. Usecount(*)
instead, which is also shorter and a bit faster.Here is a list of techniques:
In Postgres 9.4+ use the new aggregate
FILTER
clause, like @a_horse mentioned:Details: