DB2 - Is there a simple way to get “previous valid

2019-07-27 11:10发布

I'm not sure if it is appropriate to call the problem as "previous valid value". The thing is like below:

I have a table "A":

create table A (
  name varchar(16),
  performanceDate date,
  value int
);

Both name and PerformanceDate are primary key.

There is a process that will insert data for each user every day. So, the data will look something like below:

select * from A;
|------+-----------------+-------|
| name | performanceDate | value |
|------+-----------------+-------|
| Joe  |      2012-05-18 | null  |
| Joe  |      2012-05-17 | 2     |
| Joe  |      2012-05-16 | null  |
| Joe  |      2012-05-15 | null  |
| Joe  |      2012-05-14 | 3     |
|------+-----------------+-------|

Presently, I want to get a result set with preformanceDate between 2012-05-16 and 2012-05-18 If the value for current date is null, it should be replaced to the previous vaild value. For example, the value of 2012-05-16 is null, the first valid value before 2012-05-16 is 3 on 2012-05-14. So that, the result would like below:

|------+-----------------+-------|
| name | performanceDate | value |
|------+-----------------+-------|
| Joe  |      2012-05-18 |     2 |
| Joe  |      2012-05-17 |     2 |
| Joe  |      2012-05-16 |     3 |
|------+-----------------+-------|

So far, I plan to insert the data into a temp table first (because the table "A" is read only for me), then update the value one by one. But this way is very slow. Do you have any idea about this?

2条回答
何必那么认真
2楼-- · 2019-07-27 11:29

You can use LATERAL JOIN, DB2 is not yet on SQLFiddle, following is its equivalent in SQL Server. LATERAL is equivalent to SQL Server's APPLY:

select x.name, x.performanceDate, coalesce(x.value, y.value) as value
from tbl x

outer apply 
( 
  -- find nearest

  select top 1 value
  from tbl
  where 

    x.value is null

    and 

    (
       name = x.name
       and value is not null 
       and performanceDate < x.performanceDate 
    )

  order by performanceDate desc

) as y
order by x.name, x.performanceDate desc

Data:

| NAME |            PERFORMANCEDATE |  VALUE |
|------|----------------------------|--------|
|  Joe | May, 18 2012 08:00:00-0700 | (null) |
|  Joe | May, 17 2012 08:00:00-0700 |      2 |
|  Joe | May, 16 2012 08:00:00-0700 | (null) |
|  Joe | May, 15 2012 08:00:00-0700 | (null) |
|  Joe | May, 14 2012 08:00:00-0700 |      3 |

Output:

| NAME |            PERFORMANCEDATE | VALUE |
|------|----------------------------|-------|
|  Joe | May, 18 2012 08:00:00-0700 |     2 |
|  Joe | May, 17 2012 08:00:00-0700 |     2 |
|  Joe | May, 16 2012 08:00:00-0700 |     3 |
|  Joe | May, 15 2012 08:00:00-0700 |     3 |
|  Joe | May, 14 2012 08:00:00-0700 |     3 |

Live test: http://www.sqlfiddle.com/#!6/e0158/8


Base on IBM documentation.. http://publib.boulder.ibm.com/infocenter/iseries/v7r1m0/index.jsp?topic=%2Fdb2%2Frbafztabref.htm

..,I wish this is that:

select x.name, x.performanceDate, coalesce(x.value, y.value) as value
from tbl x,

lateral 
( 
  -- find nearest

  select top 1 value
  from tbl
  where 

    x.value is null

    and 

    (
       name = x.name
       and value is not null 
       and performanceDate < x.performanceDate 
    )

  order by performanceDate desc
  fetch first 1 rows only

) as y
order by x.name, x.performanceDate desc

Interesting metrics on CROSS APPY/OUTER APPLY: http://explainextended.com/2009/07/16/inner-join-vs-cross-apply/

Another example of OUTER APPLY's use: http://www.ienablemuch.com/2012/04/outer-apply-walkthrough.html

查看更多
一夜七次
3楼-- · 2019-07-27 11:43
with tmp1(name, performanceDate, value, cid) as (
select 
   name,
   performanceDate,
   value,
   count(value) over (partition by name 
                          order by performanceDate,
                                   value nulls last) as cid from A),
tmp2(name, performanceDate, value) as (
select 
   name,
   performanceDate,
   first_value(value) over (partition by name, cid 
                                order by performanceDate, 
                                         value nulls last from tmp1)
select * from tmp2;

with tmp1 (name, performanceDate_Begin, performanceDate_End, value) as (
select 
   name,
   performanceDate as performanceDate_Begin,
   lead(performanceDate) over (partition by name 
                                   order by performanceDate,
                                            value nulls last) as performanceDate_End,value from A where value is not null),
tmp2 (name, performanceDate, value) as (
select 
   A.name,
   A.performanceDate,
   tmp1.value 
from A 
left join tmp1 on A.name = B.name 
              and A.performanceDate >= tmp1.performanceDate_Begin 
              and A.performanceDate < coalesce(tmp1.performanceDate_End,date '9999-12-31'))
select * from tmp2
查看更多
登录 后发表回答