I am creating a report in Postgres 9.3. This is my SQL Fiddle.
Basically I have two tables, responses
and questions
, the structure is:
responses
->id
->question_id
->response
questions
->id
->question
->costperlead
for the column response
there can only be 3 values, Yes/No/Possbily
,
and my report should have the columns:
question_id
, # of Yes Responses
, # of No Responses
, # of Possbily Responses
, Revenue
Then:
# of Yes Responses - count of all Yes values in the response column
# of No Responses - count of all No values in the response column
# of Possbily Responses - count of all 'Possbily' values in the response column
Revenue is the costperlead
* (Number of Yes Responses + Number of Possibly Responses).
I don't know how to construct the query, I'm new plus I came from MySQL so some things are different for postgres. In my SQL Fiddle sample most responses are Yes and Null, it's ok eventually, there will be Possibly and No.
So far I have only:
SELECT a.question_id
FROM responses a
INNER JOIN questions b ON a.question_id = b.id
WHERE a.created_at = '2015-07-17'
GROUP BY a.question_id;
Since the only predicate filters responses, it would be most efficient to aggregate responses first, then join to questions:
SELECT *, q.costperlead * (r.ct_yes + r.ct_maybe) AS revenue
FROM (
SELECT question_id
, count(*) FILTER (WHERE response = 'Yes') AS ct_yes
, count(*) FILTER (WHERE response = 'No') AS ct_no
, count(*) FILTER (WHERE response = 'Possibly') AS ct_maybe
FROM responses
WHERE created_at = '2015-07-17'
GROUP BY 1
) r
JOIN questions q ON q.id = r.question_id;
This uses the new aggregate Filter clause of Postgres 9.4:
- How can I simplify this game statistics query?
BTW, I would consider implementing response
as boolean
type with true
/false
/null
.
For Postgres 9.3:
SELECT *, q.costperlead * (r.ct_yes + r.ct_maybe) AS revenue
FROM (
SELECT question_id
, count(response = 'Yes' OR NULL) AS ct_yes
, count(response = 'No' OR NULL) AS ct_no
, count(response = 'Possibly' OR NULL) AS ct_maybe
FROM responses
WHERE created_at = '2015-07-17'
GROUP BY 1
) r
JOIN questions q ON q.id = r.question_id;
SQL Fiddle (building on yours).
Here is a comparison of techniques, before the aggregate FILTER
clause existed:
- For absolute performance, is SUM faster or COUNT?
You should try:
SELECT a.question_id,
SUM(CASE WHEN a.response = 'Yes' THEN 1 ELSE 0 END) AS NumsOfYes,
SUM(CASE WHEN a.response = 'No' THEN 1 ELSE 0 END) AS NumsOfNo,
SUM(CASE WHEN a.response = 'Possibly' THEN 1 ELSE 0 END) AS NumOfPossibly,
costperlead * SUM(CASE WHEN a.response = 'Yes' THEN 1 ELSE 0 END) + SUM(CASE WHEN a.response = 'Possibly' THEN 1 ELSE 0 END) AS revenue
FROM responses a
INNER JOIN questions b ON a.question_id = b.id
GROUP BY a.question_id, b.costperlead