PostgreSQL use value from previous row if missing

2019-02-18 19:45发布

I have a following query:

WITH t as (
  SELECT date_trunc('hour', time_series) as trunc 
  FROM generate_series('2013-02-27 22:00'::timestamp, '2013-02-28 2:00', 
                       '1 hour') as time_series
  GROUP BY trunc
  ORDER BY trunc
)
SELECT DISTINCT ON(trunc) trunc, id
FROM t
LEFT JOIN (
   SELECT id, created, date_trunc('hour', created) as trunc_u
   FROM event
   ORDER BY created DESC
) u
ON trunc = trunc_u

which yields the following result:

"2013-02-27 22:00:00";
"2013-02-27 23:00:00";2
"2013-02-28 00:00:00";5
"2013-02-28 01:00:00";
"2013-02-28 02:00:00";

Table event has id, created and some other columns, but only those are relevant here. The query above gives me id of last event generated per given trunc time period (thanks to DISTINCT ON I get a nice aggregation per period).

Now, this query yields NULL if no events happened in given time period. I would like it to return the previous available id, even if it is from different time period. I.e.:

"2013-02-27 22:00:00";0
"2013-02-27 23:00:00";2
"2013-02-28 00:00:00";5
"2013-02-28 01:00:00";5
"2013-02-28 02:00:00";5

I am sure I am missing some easy way to accomplish this. Any advice?

2条回答
等我变得足够好
2楼-- · 2019-02-18 19:54

You ca mix a self join and windows functions

Simplifying I take this table with this sample values:

create table t ( a int, b int);    
insert into t values 
( 1, 1),
( 2, Null),
( 3, Null),
( 4, 2 ),
( 5, Null),
( 6, Null);

In your query a is trunc_u and b is your id. The query is:

with cte as (    
    select 
      t1.a, 
      coalesce( t1.b, t2.b, 0) as b,
      rank() OVER 
       (PARTITION BY t1.a ORDER BY t2.a DESC) as pos
    from t t1 
    left outer join t t2
      on t2.b is not null and
         t2.a < t1.a    
)
select a, b
from cte
where pos = 1;

And results:

| A | B |
---------
| 1 | 1 |
| 2 | 1 |
| 3 | 1 |
| 4 | 2 |
| 5 | 2 |
| 6 | 2 |
查看更多
小情绪 Triste *
3楼-- · 2019-02-18 20:09

Try:

WITH t as (
  SELECT time_series as trunc 
    FROM generate_series('2013-02-27 22:00'::timestamp, '2013-02-28 2:00', 
                         '1 hour') as time_series
)
SELECT DISTINCT ON(t.trunc) t.trunc, e.id
  FROM t
  JOIN event e
    ON e.created < t.trunc 
 ORDER BY t.trunc, e.created DESC

If it is too slow - tell me. I will give you a faster query.

查看更多
登录 后发表回答