I want, in a request, to fill all Null values by the last known value. When it's in a table and not in a request, it's easy:
If I define and fill my table as follows:
CREATE TABLE test_fill_null (
date INTEGER,
value INTEGER
);
INSERT INTO test_fill_null VALUES
(1,2),
(2, NULL),
(3, 45),
(4,NULL),
(5, null);
SELECT * FROM test_fill_null ;
date | value
------+-------
1 | 2
2 |
3 | 45
4 |
5 |
Then I just have to fill like that:
UPDATE test_fill_null t1
SET value = (
SELECT t2.value
FROM test_fill_null t2
WHERE t2.date <= t1.date AND value IS NOT NULL
ORDER BY t2.date DESC
LIMIT 1
);
SELECT * FROM test_fill_null;
date | value
------+-------
1 | 2
2 | 2
3 | 45
4 | 45
5 | 45
But now, I'm in a request, like this one:
WITH
pre_table AS(
SELECT
id1,
id2,
tms,
CASE
WHEN tms - lag(tms) over w < interval '5 minutes' THEN NULL
ELSE id2
END as group_id
FROM
table0
window w as (partition by id1 order by tms)
)
Where the group_id is set to id2 when the previous point is distant from more than 5 minutes, null otherwise. By doing so, I want to end up with group of points that follow each other by less than 5 minutes, and gaps of more than 5 minutes between each groups.
Then I don't know how to proceed. I tried:
SELECT distinct on (id1, id2)
t0.id1,
t0.id2,
t0.tms,
t1.group_id
FROM
pre_table t0
LEFT JOIN (
select
id1,
tms,
group_id
from pre_table t2
where t2.group_id is not null
order by tms desc
) t1
ON
t1.tms <= t0.tms AND
t1.id1 = t0.id1
WHERE
t0.id1 IS NOT NULL
ORDER BY
id1,
id2,
t1.tms DESC
But in the final result I have some group with two consecutive points which are distant from more than 5 minutes. Their should be two different groups in this case.
While editing my question I found a solution. It's pretty low though, much lower than my example within a table. Any suggestion to improve it ?
So as I said, a select within a select
A "select within a select" is more commonly called "subselect" or "subquery" In your particular case it's a correlated subquery.
LATERAL
joins (new in postgres 9.3) can largely replace correlated subqueries with more flexible solutions:I don't think you need either here.
For your first case this query is probably faster and simpler, though:
count()
only counts non-null values, sogrp
is incremented with every non-nullvalue
, thereby forming groups as desired. It's trivial to pick the one non-nullvalue
pergrp
in the outerSELECT
.For your second case, I'll assume the initial order of rows is determined by
(id1, id2, tms)
as indicated by one of your queries.Adapt to your actual order. One of these might cover it:
SQL Fiddle with an extended example.
Related: