I'm looking to aggregate data by the end date of a dataset with some leading period rather than the start. For example, I want to query a table and return the count of matching results 30 days PRIOR to the end date of the date shown in the results. The original table would contain ONLY the date a sale was made (timestamp). Example:
sales_timestamp
------------------
2015-08-05 12:00:00
2015-08-06 13:00:00
2015-08-25 12:31:00
2015-08-26 01:02:00
2015-08-27 02:03:00
2015-08-29 04:23:00
2015-09-01 12:00:00
2015-09-02 12:00:00
2015-09-08 00:00:00
An example of the resulting query output would be:
date_period | count_of_sales
--------------------------------
2015-08-24 | 2
2015-08-31 | 6
2015-09-07 | 6
in which the date_period of 2015-09-07 would imply the company sold 6 items in the 30 days ENDING on 9/7/2015 (and starting ~8/7/2015 if a true 30 day period).
I've been toying with variations of the date_trunc()
function but can't seem to get the truncation to apply on the end date rather than grouping by the start.
This data would be housed on PostgreSQL 9.1.
This query does all you ask for:
SQL Fiddle.
Explanation
generate_series()
)LEFT
JOIN
to the aggregated counts per day. The LEFT guarantees one row per day, which allows us to use window functions based on the row count.Use
sum()
as window aggregate function with a custom frame of 30 days preceding. (You may want to use 29 instead, it's unclear how you count.)Join the result to actual days you want in the result. (2nd
generate_series()
with one day per week).Be aware that the definition of "day" is derived from the current time zone setting of your session if you work with
timestamptz
. Results can be different in different time zones. Does not apply for justtimestamp
, which does not depend on the current time zone. Basics:Related answer with explanation for the window function with custom frame definition: