Suppose raw data is:
Timestamp High Low Volume
10:24.22345 100 99 10
10:24.23345 110 97 20
10:24.33455 97 89 40
10:25.33455 60 40 50
10:25.93455 40 20 60
With a sample time of 1 second, the output data should be as following (they are grouped by second):
Timestamp Open Close High Low Volume
10:24 82 83 110 89 70
10:25 50 40 60 20 110
Open
means the price of the earliest data in the group
Close
means the price of the lastest data in the group
Volume
means the sum(Volume) in the group
The sampling unit from varying from 1 second, 5 sec, 1 minute, 1 hour, 1 day, ...
Now I can get the High, Low, Volume by the following SQL:
SELECT date_trunc(\'#{interval}\', ticktime) AS ticktime_stamp,
max(bid_price) as high,
min(bid_price) as low,
sum(bid_volume) as volume,
max(product_type) as product_type
FROM czces
WHERE ticktime >= \'#{begin_time}\'::timestamp
AND ticktime < \'#{end_time}\'::timestamp
AND product_type =\'#{product_type}\'
GROUP BY 1
ORDER BY ticktime_stamp ASC
But how to get the open
, close
value in each group based on the above query?
You could use window functions combined with DISTINCT ON
for that:
SELECT DISTINCT ON (1)
date_trunc('#{interval}', ticktime) AS ticktime_stamp
, max(bid_price) OVER w AS high
, min(bid_price) OVER w AS low
, sum(bid_volume) OVER w AS volume
, max(product_type) OVER w AS product_type
, min(product_type) OVER w AS product_type
, first_value(bid_price) OVER w AS open
, last_value(bid_price) OVER w AS close
FROM czces
WHERE ticktime >= '#{begin_time}'::timestamp
AND ticktime < '#{end_time}'::timestamp
AND product_type ='#{product_type}'
WINDOW w AS (PARTITION BY date_trunc('#{interval}', ticktime) ORDER BY ticktime
ROWS BETWEEN UNBOUNDED PRECEDING
AND UNBOUNDED FOLLOWING)
ORDER BY 1;
Explanation for the custom window frame:
- How to use a ring data structure in window functions
- PostgreSQL window function: partition by comparison
Explanation for DISTINCT ON
:
- Select first row in each GROUP BY group?
The query you are looking for is most likely this, but without actual testing data it's hard to be certain.
SELECT
min(date_trunc('second', "Timestamp")) OVER minutes "Timestamp",
first_value("Price") OVER minutes "Open",
last_value("Price") OVER minutes "Close",
max("High") OVER minutes "High",
min("Low") OVER minutes "Low",
sum("Volume") OVER minutes "Volume"
FROM czces
WINDOW minutes AS (
PARTITION BY date_trunc('second', "Timestamp")
ORDER BY "Timestamp"
);