I'm not sure if it is appropriate to call the problem as "previous valid value". The thing is like below:
I have a table "A":
create table A (
name varchar(16),
performanceDate date,
value int
);
Both name and PerformanceDate are primary key.
There is a process that will insert data for each user every day. So, the data will look something like below:
select * from A;
|------+-----------------+-------|
| name | performanceDate | value |
|------+-----------------+-------|
| Joe | 2012-05-18 | null |
| Joe | 2012-05-17 | 2 |
| Joe | 2012-05-16 | null |
| Joe | 2012-05-15 | null |
| Joe | 2012-05-14 | 3 |
|------+-----------------+-------|
Presently, I want to get a result set with preformanceDate between 2012-05-16 and 2012-05-18 If the value for current date is null, it should be replaced to the previous vaild value. For example, the value of 2012-05-16 is null, the first valid value before 2012-05-16 is 3
on 2012-05-14. So that, the result would like below:
|------+-----------------+-------|
| name | performanceDate | value |
|------+-----------------+-------|
| Joe | 2012-05-18 | 2 |
| Joe | 2012-05-17 | 2 |
| Joe | 2012-05-16 | 3 |
|------+-----------------+-------|
So far, I plan to insert the data into a temp table first (because the table "A" is read only for me), then update the value one by one. But this way is very slow. Do you have any idea about this?
You can use LATERAL JOIN, DB2 is not yet on SQLFiddle, following is its equivalent in SQL Server. LATERAL is equivalent to SQL Server's APPLY:
select x.name, x.performanceDate, coalesce(x.value, y.value) as value
from tbl x
outer apply
(
-- find nearest
select top 1 value
from tbl
where
x.value is null
and
(
name = x.name
and value is not null
and performanceDate < x.performanceDate
)
order by performanceDate desc
) as y
order by x.name, x.performanceDate desc
Data:
| NAME | PERFORMANCEDATE | VALUE |
|------|----------------------------|--------|
| Joe | May, 18 2012 08:00:00-0700 | (null) |
| Joe | May, 17 2012 08:00:00-0700 | 2 |
| Joe | May, 16 2012 08:00:00-0700 | (null) |
| Joe | May, 15 2012 08:00:00-0700 | (null) |
| Joe | May, 14 2012 08:00:00-0700 | 3 |
Output:
| NAME | PERFORMANCEDATE | VALUE |
|------|----------------------------|-------|
| Joe | May, 18 2012 08:00:00-0700 | 2 |
| Joe | May, 17 2012 08:00:00-0700 | 2 |
| Joe | May, 16 2012 08:00:00-0700 | 3 |
| Joe | May, 15 2012 08:00:00-0700 | 3 |
| Joe | May, 14 2012 08:00:00-0700 | 3 |
Live test: http://www.sqlfiddle.com/#!6/e0158/8
Base on IBM documentation.. http://publib.boulder.ibm.com/infocenter/iseries/v7r1m0/index.jsp?topic=%2Fdb2%2Frbafztabref.htm
..,I wish this is that:
select x.name, x.performanceDate, coalesce(x.value, y.value) as value
from tbl x,
lateral
(
-- find nearest
select top 1 value
from tbl
where
x.value is null
and
(
name = x.name
and value is not null
and performanceDate < x.performanceDate
)
order by performanceDate desc
fetch first 1 rows only
) as y
order by x.name, x.performanceDate desc
Interesting metrics on CROSS APPY/OUTER APPLY: http://explainextended.com/2009/07/16/inner-join-vs-cross-apply/
Another example of OUTER APPLY's use: http://www.ienablemuch.com/2012/04/outer-apply-walkthrough.html
with tmp1(name, performanceDate, value, cid) as (
select
name,
performanceDate,
value,
count(value) over (partition by name
order by performanceDate,
value nulls last) as cid from A),
tmp2(name, performanceDate, value) as (
select
name,
performanceDate,
first_value(value) over (partition by name, cid
order by performanceDate,
value nulls last from tmp1)
select * from tmp2;
with tmp1 (name, performanceDate_Begin, performanceDate_End, value) as (
select
name,
performanceDate as performanceDate_Begin,
lead(performanceDate) over (partition by name
order by performanceDate,
value nulls last) as performanceDate_End,value from A where value is not null),
tmp2 (name, performanceDate, value) as (
select
A.name,
A.performanceDate,
tmp1.value
from A
left join tmp1 on A.name = B.name
and A.performanceDate >= tmp1.performanceDate_Begin
and A.performanceDate < coalesce(tmp1.performanceDate_End,date '9999-12-31'))
select * from tmp2