I have a table like this in PostgreSQL. I want to perform aggregation functions like mean and max for every 16 records based on ID (which is primary key). For example I have to calculate mean value for first 16 records and second 16 records and so on.
+-----+-------------
| ID | rainfall |
+-----+----------- |
| 1 | 110.2 |
| 2 | 56.6 |
| 3 | 65.6 |
| 4 | 75.9 |
+-----+------------
The 1st approach that comes to mind is to use row_number()
to annotate the table, then group by blocks of 16 rows.
SELECT min(id) as first_id, max(id) AS last_id, avg(rainfall) AS avg_this_16
FROM (
SELECT id, rainfall, row_number() OVER (order by id) AS n
FROM the_table
) x(id,rainfall,n)
GROUP BY n/16
ORDER BY n/16;
Note that this won't necessarily include 16 samples for the last group.
Alternately you can calculate a running average by using avg()
as a window function:
SELECT id, avg(rainfall) OVER (ORDER BY id ROWS 15 PRECEDING)
FROM the_table;
... possibly annotating that with the row number and selecting the ones you want:
SELECT id AS greatest_id_in_group, avg_last_16_inclusive FROM (
SELECT
id,
avg(rainfall) OVER (ORDER BY id ROWS 15 PRECEDING) AS avg_last_16_inclusive,
row_number() OVER (ORDER BY id) AS n
FROM the_table
) x WHERE n % 16 = 0;
This will disregard the last n<16 samples, not returning a row for them.
Note that I'm assuming the IDs aren't guaranteed to be contiguous. If they are gap-less, you can just group by id/16
and avoid the window function.
late answer, but anyway for reference
since ID
was said to be continuos and gap-less, then this would result pretty straightforward
SELECT avg(rainfall),string_agg(id::text, ',')
FROM the_table
GROUP BY (id - 1) / 16;
notice the (id - 1)
to get the grouping from zero to 15, otherwise first group may dephase
PS: @Craig Ringer gave a hint by the end of his answer, but didn't post is as code
Note that I'm assuming the IDs aren't guaranteed to be contiguous. If they are gap-less, you can just group by id/16 and avoid the window function.