I need to produce the column "required" in the following table using SQL without using loops and correlated sub queries. Is this possible in SQL 2008?
Date Customer Value Required Rule
20100101 1 12 12
20100101 2 0 If no value assign 0
20100101 3 32 32
20100101 4 42 42
20100101 5 15 15
20100102 1 12 Take last known value
20100102 2 0 Take last known value
20100102 3 39 39
20100102 4 42 Take last known value
20100102 5 16 16
20100103 1 13 13
20100103 2 24 24
20100103 3 39 Take last known value
20100103 4 42 Take last known value
20100103 5 21 21
20100104 1 14 14
20100104 2 24 Take last known value
20100104 3 39 Take last known value
20100104 4 65 65
20100104 5 23 23
Basically I am filling empty "Value" cells with last know value for that customer. Remember the last row may not have a valid value, so you will have to pick it from the row before that with a valid value.
Faiz,
how about the following query, it does what you want as far as I understand it. The comments explain each step. Take a look at CTEs on Books Online. This example could even be changed to use the new MERGE command for SQL 2008.
/* Test Data & Table */
DECLARE @Customers TABLE
(Dates datetime,
Customer integer,
Value integer)
INSERT INTO @Customers
VALUES ('20100101', 1, 12),
('20100101', 2, NULL),
('20100101', 3, 32),
('20100101', 4, 42),
('20100101', 5, 15),
('20100102', 1, NULL),
('20100102', 2, NULL),
('20100102', 3, 39),
('20100102', 4, NULL),
('20100102', 5, 16),
('20100103', 1, 13),
('20100103', 2, 24),
('20100103', 3, NULL),
('20100103', 4, NULL),
('20100103', 5, 21),
('20100104', 1, 14),
('20100104', 2, NULL),
('20100104', 3, NULL),
('20100104', 4, 65),
('20100104', 5, 23) ;
/* CustCTE - This gives us a RowNum to allow us to build the recursive CTE CleanCust */
WITH CustCTE
AS (SELECT Customer,
Value,
Dates,
ROW_NUMBER() OVER (PARTITION BY Customer ORDER BY Dates) RowNum
FROM @Customers),
/* CleanCust - A recursive CTE. This runs down the list of values for each customer, checking the Value column, if it is null it gets the previous non NULL value.*/
CleanCust
AS (SELECT Customer,
ISNULL(Value, 0) Value, /* Ensure we start with no NULL values for each customer */
Dates,
RowNum
FROM CustCte cur
WHERE RowNum = 1
UNION ALL
SELECT Curr.Customer,
ISNULL(Curr.Value, prev.Value) Value,
Curr.Dates,
Curr.RowNum
FROM CustCte curr
INNER JOIN CleanCust prev ON curr.Customer = prev.Customer
AND curr.RowNum = prev.RowNum + 1)
/* Update the base table using the result set from the recursive CTE */
UPDATE trg
SET Value = src.Value
FROM @Customers trg
INNER JOIN CleanCust src ON trg.Customer = src.Customer
AND trg.Dates = src.Dates
/* Display the results */
SELECT * FROM @Customers
This is the "Last non-null puzzle," and here's one of several elegant solutions:
If your "sparse" table is SparseTable
with columns Date, Customer, Value then:
with C as
(select *,
max(case when Value is not null then [Date] end)
over (partition by Customer order by [Date] rows unbounded preceding) as grp
from SparseTable
)
insert into FullTable
select *,
max(Value) over (partition by Customer, grp order by [Date] rows unbounded preceding) as Required
from C
Where Value
couldn't be filled forward it will still be NULL
, so you can then
update FullTable set Required = 0 where Required is null
I need to produce the column
"required" in the following table
using SQL without using loops and
correlated sub queries. Is this
possible in SQL 2008?
Impossible. Point. Not possibly on ANY SQL based server, including oracle.
The main problem here is that you rule out loops AND correlated subqueries and any way to retrieve the value at query time will ultimately use another query to find the valid value (actually one per field). This is how SQL works. Yes, you could hide them in a custom scalar function, but still they would contain a logical sub query.
I'm not sure if following counts considering your constraints but it gets the job done.
Test data
DECLARE @Customers TABLE (Date DATETIME, Customer INTEGER, Value INTEGER)
INSERT INTO @Customers VALUES ('20100101', 1, 12 )
INSERT INTO @Customers VALUES ('20100101', 2, NULL)
INSERT INTO @Customers VALUES ('20100101', 3, 32 )
INSERT INTO @Customers VALUES ('20100101', 4, 42 )
INSERT INTO @Customers VALUES ('20100101', 5, 15 )
INSERT INTO @Customers VALUES ('20100102', 1, NULL)
INSERT INTO @Customers VALUES ('20100102', 2, NULL)
INSERT INTO @Customers VALUES ('20100102', 3, 39 )
INSERT INTO @Customers VALUES ('20100102', 4, NULL)
INSERT INTO @Customers VALUES ('20100102', 5, 16 )
INSERT INTO @Customers VALUES ('20100103', 1, 13 )
INSERT INTO @Customers VALUES ('20100103', 2, 24 )
INSERT INTO @Customers VALUES ('20100103', 3, NULL)
INSERT INTO @Customers VALUES ('20100103', 4, NULL)
INSERT INTO @Customers VALUES ('20100103', 5, 21 )
INSERT INTO @Customers VALUES ('20100104', 1, 14 )
INSERT INTO @Customers VALUES ('20100104', 2, NULL)
INSERT INTO @Customers VALUES ('20100104', 3, NULL)
INSERT INTO @Customers VALUES ('20100104', 4, 65 )
INSERT INTO @Customers VALUES ('20100104', 5, 23 )
Query
SELECT c.Date
, c.Customer
, Value = COALESCE(c.Value, cprevious.Value, 0)
FROM @Customers c
INNER JOIN (
SELECT c.Date
, c.Customer
, MaxDate = MAX(cdates.Date)
FROM @Customers c
LEFT OUTER JOIN (
SELECT Date
, Customer
FROM @Customers
) cdates ON cdates.Date < c.Date AND cdates.Customer = c.Customer
GROUP BY
c.Date, c.Customer
) cmax ON cmax.Date = c.Date AND cmax.Customer = c.Customer
LEFT OUTER JOIN @Customers cprevious ON cprevious.Date = cmax.MaxDate AND cprevious.Customer = cmax.Customer
ORDER BY
1, 2, 3
Update statement
UPDATE @Customers
SET Value = c2.Value
OUTPUT Inserted.*
FROM @Customers c
INNER JOIN (
SELECT c.Date
, c.Customer
, Value = COALESCE(c.Value, cprevious.Value, 0)
FROM @Customers c
INNER JOIN (
SELECT c.Date
, c.Customer
, MaxDate = MAX(cdates.Date)
FROM @Customers c
LEFT OUTER JOIN (
SELECT Date
, Customer
FROM @Customers
) cdates ON cdates.Date < c.Date AND cdates.Customer = c.Customer
GROUP BY
c.Date, c.Customer
) cmax ON cmax.Date = c.Date AND cmax.Customer = c.Customer
LEFT OUTER JOIN @Customers cprevious ON cprevious.Date = cmax.MaxDate AND cprevious.Customer = cmax.Customer
) c2 ON c2.Date = c.Date
AND c2.Customer = c.Customer
How about a left outer join on the same table where the date is less than the current one and the value is non-empty, ordered by date desc (limit 1), returning zero when null ? (No server available to test at the moment). Unless this counts as a sub-query...