Cumulative sum of values by month, filling in for

2019-05-05 15:52发布

问题:

I have this data table and I'm wondering if is possible create a query that get a cumulative sum by month considering all months until the current month.

date_added                    | qty
------------------------------------
2015-08-04 22:28:24.633784-03 | 1
2015-05-20 20:22:29.458541-03 | 1
2015-04-08 14:16:09.844229-03 | 1
2015-04-07 23:10:42.325081-03 | 1
2015-07-06 18:50:30.164932-03 | 1
2015-08-22 15:01:54.03697-03  | 1
2015-08-06 18:25:07.57763-03  | 1
2015-04-07 23:12:20.850783-03 | 1
2015-07-23 17:45:29.456034-03 | 1
2015-04-28 20:12:48.110922-03 | 1
2015-04-28 13:26:04.770365-03 | 1
2015-05-19 13:30:08.186289-03 | 1
2015-08-06 18:26:46.448608-03 | 1
2015-08-27 16:43:06.561005-03 | 1
2015-08-07 12:15:29.242067-03 | 1

I need a result like that:

Jan|0
Feb|0
Mar|0
Apr|5
May|7
Jun|7
Jul|9
Aug|15

回答1:

This is very similar to other questions, but the best query is still tricky.

Basic query to get the running sum quickly:

SELECT to_char(date_trunc('month', date_added), 'Mon YYYY') AS mon_text
     , sum(sum(qty)) OVER (ORDER BY date_trunc('month', date_added)) AS running_sum
FROM   tbl
GROUP  BY date_trunc('month', date_added)
ORDER  BY date_trunc('month', date_added);

The tricky part is to fill in for missing months:

WITH cte AS (
   SELECT date_trunc('month', date_added) AS mon, sum(qty) AS mon_sum
   FROM   tbl
   GROUP  BY 1
   )
SELECT to_char(mon, 'Mon YYYY') AS mon_text
     , sum(c.mon_sum) OVER (ORDER BY mon) AS running_sum
FROM  (SELECT min(mon) AS min_mon FROM cte) init
     , generate_series(init.min_mon, now(), interval '1 month') mon
LEFT   JOIN cte c USING (mon)
ORDER  BY mon;

The implicit CROSS JOIN LATERAL requires Postgres 9.3+. This starts with the first month in the table.
To start with a given month:

WITH cte AS (
   SELECT date_trunc('month', date_added) AS mon, sum(qty) AS mon_sum
   FROM   tbl
   GROUP  BY 1
   )
SELECT to_char(mon, 'Mon YYYY') AS mon_text
     , COALESCE(sum(c.mon_sum) OVER (ORDER BY mon), 0) AS running_sum
FROM   generate_series('2015-01-01'::date, now(), interval '1 month') mon
LEFT   JOIN cte c USING (mon)
ORDER  BY mon;

SQL Fiddle.

Keeping months from different years apart. You did not ask for that, but you'll most likely want it.

Note that the "month" to some degree depends on the time zone setting of the current session! Details:

  • Ignoring timezones altogether in Rails and PostgreSQL

Related:

  • Calculating Cumulative Sum in PostgreSQL
  • PostgreSQL: running count of rows for a query 'by minute'
  • Postgres window function and group by exception