This is part of a larger problem.
I have a table with the following data:
+------------+------+----------+-----------+
| date       | data | data_max | data_diff |
+------------+------+----------+-----------+
| 2017-01-02 |    2 |        2 |      NULL |
| 2017-01-03 |    4 |        4 |      NULL |
| 2017-01-04 |    1 |        4 |        -3 |
| 2017-01-05 |    3 |        4 |        -1 |
| 2017-01-06 |    1 |        4 |        -3 |
| 2017-01-07 |    4 |        4 |      NULL |
| 2017-01-08 |    5 |        5 |      NULL |
| 2017-01-09 |   -2 |        5 |        -7 |
| 2017-01-10 |    0 |        5 |        -5 |
| 2017-01-11 |   -5 |        5 |       -10 |
| 2017-01-12 |    6 |        6 |      NULL |
| 2017-01-13 |    4 |        6 |        -2 |
+------------+------+----------+-----------+
I want to calculate the Min and Max values of data_diff, but separately for each data subset. A subset is a run of consecutive rows with a non-NULL data_diff; subsets are separated by NULL rows (the last subset may run to the end of the table instead of being closed by a NULL). I also need the start and end date of each subset, which I can later use for calculating the Min and Max values. I would like to get these date ranges:
+----------------+--------------+
| diff_date_from | diff_date_to |
+----------------+--------------+
| 2017-01-04     | 2017-01-06   |
| 2017-01-09     | 2017-01-11   |
| 2017-01-13     | 2017-01-13   |
+----------------+--------------+
To reproduce the example data, here are the setup statements and the query that produces the first table:
CREATE TABLE IF NOT EXISTS `test`
(
    `date_time` DATETIME UNIQUE NOT NULL,
    `data` INT NOT NULL
)
ENGINE InnoDB;
INSERT INTO `test` VALUES
('2017-01-02', 2),
('2017-01-03', 4),
('2017-01-04', 1),
('2017-01-05', 3),
('2017-01-06', 1),
('2017-01-07', 4),
('2017-01-08', 5),
('2017-01-09', -2),
('2017-01-10', 0),
('2017-01-11', -5),
('2017-01-12', 6),
('2017-01-13', 4)
;
SELECT
    DATE(`date_time`) AS `date`,
    `data`,
    `data_max`,
    IF(`data` < `data_max`, - (`data_max` - `data`), NULL) AS `data_diff`
FROM
(
    SELECT
        `date_time`,
        `data`,
        MAX(`data`) OVER (ORDER BY `date_time` ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS `data_max`
    FROM
        `test`
) t
;
Is it possible to write a single query that returns the date ranges shown above? Or does this require a stored procedure or some other trick?
Maybe a window function with OVER could help, but I don't know how to specify a window boundary that runs from the current non-NULL row back to the first row after the preceding NULL. Is this feasible at all?
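One possible way to approach this is the "gaps and islands" pattern: rather than trying to bound a frame by NULLs, keep a running count of the NULL rows so that every row between two NULLs gets the same label, then group by that label. The following is only a sketch, assuming MariaDB 10.2+ or MySQL 8.0+ (for CTEs and window functions); the CTE names `running`, `diffs`, `labelled` and the columns `grp`, `diff_min`, `diff_max` are made up for illustration.

```sql
-- Sketch (assumes MariaDB 10.2+ / MySQL 8.0+); CTE and alias names are illustrative.
WITH running AS (
    SELECT
        `date_time`,
        `data`,
        MAX(`data`) OVER (ORDER BY `date_time`
                          ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS `data_max`
    FROM `test`
),
diffs AS (
    -- Same data_diff definition as in the question
    SELECT
        `date_time`,
        IF(`data` < `data_max`, `data` - `data_max`, NULL) AS `data_diff`
    FROM running
),
labelled AS (
    -- Running count of NULL rows: it stays constant until the next NULL,
    -- so every row of one subset shares the same grp value
    SELECT
        `date_time`,
        `data_diff`,
        SUM(`data_diff` IS NULL) OVER (ORDER BY `date_time`) AS `grp`
    FROM diffs
)
SELECT
    DATE(MIN(`date_time`)) AS `diff_date_from`,
    DATE(MAX(`date_time`)) AS `diff_date_to`,
    MIN(`data_diff`)       AS `diff_min`,
    MAX(`data_diff`)       AS `diff_max`
FROM labelled
WHERE `data_diff` IS NOT NULL   -- drop the NULL separator rows
GROUP BY `grp`
ORDER BY `diff_date_from`;
```

On the sample data this should produce the three ranges listed above (2017-01-04 to 2017-01-06, 2017-01-09 to 2017-01-11, and 2017-01-13 to 2017-01-13), together with the per-subset Min and Max. Consecutive NULL rows each bump the counter, but they are filtered out before grouping, so they do not create empty subsets.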
There is a RANGE frame type for setting the window boundary (Documentation) that looks promising:
PRECEDING: For ROWS, the bound is expr rows before the current row. For RANGE, the bound is the rows with values equal to the current row value minus expr; if the current row value is NULL, the bound is the peers of the row.
and another part:
ORDER BY X ASC RANGE BETWEEN 10 PRECEDING AND 10 FOLLOWING
The frame starts at NULL and stops at NULL, thus includes only rows with value NULL.
But I don't get the point of including only rows with NULL. Perhaps that could work for the date range 2017-01-02 to 2017-01-03, but how would it work for 2017-01-03 to 2017-01-07?
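As far as I understand the quoted passage, a RANGE frame is defined by the values of the ORDER BY expression, not by row positions in date order. The sketch below is meant only to illustrate that; it assumes MariaDB 10.2+ or MySQL 8.0+, and the alias `rows_in_frame` is made up for the example.

```sql
-- Illustration only: count how many rows fall into a value-based RANGE frame.
WITH running AS (
    SELECT
        `date_time`,
        `data`,
        MAX(`data`) OVER (ORDER BY `date_time`
                          ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS `data_max`
    FROM `test`
),
diffs AS (
    SELECT
        `date_time`,
        IF(`data` < `data_max`, `data` - `data_max`, NULL) AS `data_diff`
    FROM running
)
SELECT
    DATE(`date_time`) AS `date`,
    `data_diff`,
    -- The frame holds rows whose data_diff lies within +/- 1 of the current value;
    -- for a NULL data_diff the frame is the row's peers, i.e. the other NULL rows
    COUNT(*) OVER (
        ORDER BY `data_diff`
        RANGE BETWEEN 1 PRECEDING AND 1 FOLLOWING
    ) AS `rows_in_frame`
FROM diffs
ORDER BY `date_time`;
```

If that reading is right, every NULL row should report the same count no matter where it sits on the date axis, because its frame is the set of all NULL rows. That would suggest RANGE cannot delimit date-ordered subsets by itself.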