What is ROWS UNBOUNDED PRECEDING used for in Terad

2019-01-22 07:16发布

问题:

I am just starting on Teradata and I have come across an Ordered Analytical Function called "Rows unbounded preceding" in Teradata. I tried several sites to learn about the function but all of them uses a complicated example explaining the same. Could you please provide me with a naive example so that I can get the basics clear.

回答1:

It's the "frame" or "range" clause of window functions, which are part of the SQL standard and implemented in many databases, including Teradata.

A simple example would be to calculate the average amount in a frame of three days. I'm using PostgreSQL syntax for the example, but it will be the same for Teradata:

WITH data (t, a) AS (
  VALUES(1, 1),
        (2, 5),
        (3, 3),
        (4, 5),
        (5, 4),
        (6, 11)
)
SELECT t, a, avg(a) OVER (ORDER BY t ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING)
FROM data
ORDER BY t

... which yields:

t  a  avg
----------
1  1  3.00
2  5  3.00
3  3  4.33
4  5  4.00
5  4  6.67
6 11  7.50

As you can see, each average is calculated "over" an ordered frame consisting of the range between the previous row (1 preceding) and the subsequent row (1 following).

When you write ROWS UNBOUNDED PRECEDING, then the frame's lower bound is simply infinite. This is useful when calculating sums (i.e. "running totals"), for instance:

WITH data (t, a) AS (
  VALUES(1, 1),
        (2, 5),
        (3, 3),
        (4, 5),
        (5, 4),
        (6, 11)
)
SELECT t, a, sum(a) OVER (ORDER BY t ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)
FROM data
ORDER BY t

yielding...

t  a  sum
---------
1  1    1
2  5    6
3  3    9
4  5   14
5  4   18
6 11   29

Here's another very good explanations of SQL window functions.



回答2:

ROWS UNBOUNDED PRECEDING is no Teradata-specific syntax, it's Standard SQL. Together with the ORDER BY it defines the window on which the result is calculated.

Logically a Windowed Aggregate Function is newly calculated for each row within the PARTITION based on all ROWS between a starting row and an ending row.

Starting and ending rows might be fixed or relative to the current row based on the following keywords:

  • UNBOUNDED PRECEDING, all rows before the current row -> fixed
  • UNBOUNDED FOLLOWING, all rows after the current row -> fixed
  • x PRECEDING, x rows before the current row -> relative
  • y FOLLOWING, y rows after the current row -> relative

Possible kinds of calculation include:

  • Both starting and ending row are fixed, the window consists of all rows of a partition, e.g. a Group Sum, i.e. aggregate plus detail rows
  • One end is fixed, the other relative to current row, the number of rows increases or decreases, e.g. a Running Total, Remaining Sum
  • Starting and ending row are relative to current row, the number of rows within a window is fixed, e.g. a Moving Average over n rows

So SUM(x) OVER (ORDER BY col ROWS UNBOUNDED PRECEDING) results in a Cumulative Sum or Running Total

11 -> 11
 2 -> 11 +  2                = 13
 3 -> 13 +  3 (or 11+2+3)    = 16
44 -> 16 + 44 (or 11+2+3+44) = 60