Here is a table with sample data:
DECLARE @TestTable TABLE (
ItemID INT,
A INT,
B INT,
Month INT)
INSERT INTO @TestTable VALUES (1234, 5, 9, 1)
INSERT INTO @TestTable VALUES (1234, 6, 9, 2)
INSERT INTO @TestTable VALUES (4321, 5, 11, 1)
INSERT INTO @TestTable VALUES (4321, 12, 11, 2)
INSERT INTO @TestTable VALUES (1324, 14, 6, 1)
INSERT INTO @TestTable VALUES (1324, 5, 6, 2)
INSERT INTO @TestTable VALUES (1234, 1, 9, 3)
INSERT INTO @TestTable VALUES (1324, 9, 6, 3)
Something to note is that the B column is always the same as it's only used once in this calculation, but is needed for the initial calculation.
I am attempting to subtract B from A on the first row, then on subsequent rows subtract the previous rows difference from A. Effectively, B - A = C
on the first then C - A
on all subsequent rows FOR THE RELATED ItemID.
Here are the results I'm expecting:
ItemID A B C Month RowNumber
1234 5 9 4 1 1
1234 6 9 -2 2 2
1234 1 9 -3 3 3
1324 14 6 -8 1 1
1324 5 6 -13 2 2
1324 9 6 -22 3 3
4321 5 11 6 1 1
4321 12 11 -6 2 2
Here is how I am accomplishing this.
;WITH CTE_TestValue AS (
SELECT
Main.ItemID,
Main.A,
Main.B,
Main.Month,
ROW_NUMBER() OVER (Partition BY Main.ItemID ORDER BY Main.Month) AS RowNumber
FROM @TestTable AS Main
),
CTE_TestColumnC AS (
SELECT
MainA.ItemID,
MainA.A,
MainA.B,
(MainA.B - MainA.A) AS C,
MainA.Month,
MainA.RowNumber
FROM CTE_TestValue AS MainA
WHERE MainA.Rownumber = 1
UNION ALL
SELECT
MainB.ItemID,
MainB.A,
MainB.B,
(Sub.C - MainB.A) AS C,
MainB.Month,
MainB.RowNumber
FROM CTE_TestValue AS MainB
INNER JOIN CTE_TestColumnC AS Sub
ON MainB.RowNumber - 1 = Sub.RowNumber
AND MainB.ItemID = Sub.ItemID
-- CROSS JOIN CTE_TestColumnC AS Sub
-- WHERE Sub.RowNumber + 1 = MainB.RowNumber
-- AND MainB.ItemID = Sub.ItemID
)
SELECT
Main.ItemID,
Main.A,
Main.B,
Main.C,
Main.Month,
Main.RowNumber
FROM CTE_TestColumnC AS Main
ORDER BY ItemID, Month, RowNumber
This works fine on a small data-sample, but I'm dealing with about 20,000 ItemId's each repeating 10 times. It finishes all the first row calculations instantly, as expected, and then the calculation times go up DRASTICALLY.
As you can see I've tried both an INNER JOIN
and a CROSS JOIN
. I believe they have the same execution plan with the parameters that I've given the CROSS JOIN
.
Is there a more effective/efficient way to accomplish this?
I allowed this to run for 5 hours yesterday to see if it ever ended.. it did not.
Another note: When I'm using this on the test data I SELECT
WITHOUT using ORDER
to hopefully help speed things along. The ORDER
is just for my convenience when I'm fact checking.