Consider a column that represents "time between events":
(5, 40, 3, 6, 0, 9, 0, 4, 5, 18, 2, 4, 3, 2)
I would like to group these into buckets of 30, but buckets that reset. Desired outcome:
(0, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2)
This is because, when we get to a cumulative 30, we "reset" and begin counting again. So, 5 + 40 > 30, we drop down to zero and begin cumulative adding until we reach 30...(3 + 6 + 0 ...), which happens at when we reach 10th element == 18.
This can be implemented via a Reduce
function (see this answer) but I cannot figure out how to implement this in Teradata? It's like I need to be able to refer to the same OVER(PARTITION BY ...
within the same call.
Breaking down the logic, here's an example in Excel:
Where, B2
has the formula: =IF(B1<30, B1+A2, A2)
and is dragged down accordingly. Column C
then checks if column B
is >=30, and column D
is a simple cumsum of column B
.
Only way I know of to do this in Teradata is to use a recursive CTE. Because I'm lazy, let's simplify this down to say you want to reset when your running sum is greater than 2.
Creating and populating a really simple volatile table for this:
CREATE VOLATILE TABLE vt1
(
foo VARCHAR(10)
, counter INTEGER
, bar INTEGER
)
ON COMMIT PRESERVE ROWS;
INSERT INTO vt1 VALUES ('a', 1, '1');
INSERT INTO vt1 VALUES ('a', 2, '2');
INSERT INTO vt1 VALUES ('a', 3, '2');
INSERT INTO vt1 VALUES ('a', 4, '4');
INSERT INTO vt1 VALUES ('a', 5, '1');
INSERT INTO vt1 VALUES ('b', 1, '3');
INSERT INTO vt1 VALUES ('b', 2, '1');
INSERT INTO vt1 VALUES ('b', 3, '1');
INSERT INTO vt1 VALUES ('b', 4, '2');
Here's the actual select:
WITH RECURSIVE cte (foo, counter, bar, rsum) AS
(
SELECT
foo
, counter
, bar
, bar AS rsum
FROM
vt1
QUALIFY ROW_NUMBER() OVER (PARTITION BY foo ORDER BY counter) = 1
UNION ALL
SELECT
t.foo
, t.counter
, t.bar
, CASE WHEN cte.rsum < 3 THEN t.bar + cte.rsum ELSE t.bar END
FROM
vt1 t JOIN cte ON t.foo = cte.foo AND t.counter = cte.counter + 1
)
SELECT
cte.*
, CASE WHEN rsum < 5 THEN 0 ELSE 1 END AS tester
FROM
cte
ORDER BY
foo
, counter
;
Which will finally give us:
╔═════╦═════════╦═════╦══════╦════════╗
║ foo ║ counter ║ bar ║ rsum ║ tester ║
╠═════╬═════════╬═════╬══════╬════════╣
║ a ║ 1 ║ 1 ║ 1 ║ 0 ║
║ a ║ 2 ║ 2 ║ 3 ║ 0 ║
║ a ║ 3 ║ 2 ║ 5 ║ 1 ║
║ a ║ 4 ║ 4 ║ 4 ║ 0 ║
║ a ║ 5 ║ 1 ║ 5 ║ 1 ║
║ b ║ 1 ║ 3 ║ 3 ║ 0 ║
║ b ║ 2 ║ 1 ║ 4 ║ 0 ║
║ b ║ 3 ║ 1 ║ 5 ║ 1 ║
║ b ║ 4 ║ 2 ║ 2 ║ 0 ║
╚═════╩═════════╩═════╩══════╩════════╝
The case statements handles the reset for us.
It's kind of ugly, but I've never been able to get this to work any other way.