How to compute the moving average over the last n

2020-02-15 04:07发布

I am trying to compute efficiently (using SQL Server 2008) the moving average of the ProductCount over a period of 24 hours. For every single row in the Product table, I'd like to know what was the average of ProductCount (for that given products) over the last 24 hours. One problem with our data is that not all the dates/hours are present (see example below). If a TimeStamp is missing, it means that the ProductCount was 0.

I have a table with millions or rows with a Date, Product and Count. Below is a simplified example of the data I have to deal with.

Any idea on how to acheive that?

EDIT: One other piece of data that I need is the MIN and MAX ProductCount for the period (i.e. 24h). Computing the MIN/MAX is a bit trickier because of the missing values...

+---------------------+-------------+--------------+
|         Date        | ProductName | ProductCount |
+---------------------+-------------+--------------+
| 2012-01-01 00:00:00 |    Banana   |    15000     |
| 2012-01-01 01:00:00 |    Banana   |    16000     |
| 2012-01-01 02:00:00 |    Banana   |    17000     |
| 2012-01-01 05:00:00 |    Banana   |    12000     |
| 2012-01-01 00:00:00 |    Apple    |     5000     |
| 2012-01-01 05:00:00 |    Apple    |     6000     |
+---------------------+-------------+--------------+

SQL

CREATE TABLE ProductInventory (
    [Date]  DATETIME,
    [ProductName] NVARCHAR(50),
    [ProductCount] INT
)

INSERT INTO ProductInventory VALUES ('2012-01-01 00:00:00', 'Banana', 15000)
INSERT INTO ProductInventory VALUES ('2012-01-01 01:00:00', 'Banana', 16000)
INSERT INTO ProductInventory VALUES ('2012-01-01 02:00:00', 'Banana', 17000)
INSERT INTO ProductInventory VALUES ('2012-01-01 05:00:00', 'Banana', 12000)
INSERT INTO ProductInventory VALUES ('2012-01-01 00:00:00', 'Apple', 5000)
INSERT INTO ProductInventory VALUES ('2012-01-01 05:00:00', 'Apple', 6000)

2条回答
Melony?
2楼-- · 2020-02-15 04:20

I added to Lamak's answer to include min/max:

SELECT *
FROM ProductInventory A
OUTER APPLY (   
    SELECT 
        SUM(ProductCount) / 24 AS DailyMovingAverage, 
        MAX(ProductCount) AS MaxProductCount,
        CASE COUNT(*) WHEN 24 THEN MIN(ProductCount) ELSE 0 END AS MinProductCount
    FROM ProductInventory
    WHERE ProductName = A.ProductName 
    AND [Date] BETWEEN DATEADD(HOUR, -23, A.[Date]) AND A.[Date]) B

To account for missing records, check that there were indeed 24 records in the last 24 hours before using MIN(ProductCount), and return 0 otherwise.

Working SQL Fiddle, with a bunch (bushel?) of Oranges added to show the MinProductCount working

查看更多
虎瘦雄心在
3楼-- · 2020-02-15 04:35

Well, the fact that you need to calculate the average for every hour, actually makes this simpler, since you just need to SUM the product count and divide it by a fixed number (24). So I think that this will get the results you want (though in this particular case, a cursor by be actually faster):

SELECT A.*, B.ProductCount/24 DailyMovingAverage
FROM ProductInventory A
OUTER APPLY (   SELECT SUM(ProductCount) ProductCount
                FROM ProductInventory
                WHERE ProductName = A.ProductName 
                AND [Date] BETWEEN DATEADD(HOUR,-23,A.[Date]) AND A.[Date]) B
查看更多
登录 后发表回答