How do I create a moving average in SQL?
Current table:
Date Clicks
2012-05-01 2,230
2012-05-02 3,150
2012-05-03 5,520
2012-05-04 1,330
2012-05-05 2,260
2012-05-06 3,540
2012-05-07 2,330
Desired table or output:
Date Clicks 3 day Moving Average
2012-05-01 2,230
2012-05-02 3,150
2012-05-03 5,520 4,360
2012-05-04 1,330 3,330
2012-05-05 2,260 3,120
2012-05-06 3,540 3,320
2012-05-07 2,330 3,010
Answer 1:
One way to do this is to join the same table to itself several times.
select
Cur.Date,
-- Days with no matching row count as 0 clicks, so the first two days are understated
(Cur.Clicks
+ isnull(P1.Clicks, 0)
+ isnull(P2.Clicks, 0)) / 3.0 as MovingAvg3
from
MyTable as Cur -- CURRENT is a reserved word in T-SQL, hence the alias Cur
left join MyTable as P1 on P1.Date = DateAdd(day, -1, Cur.Date)
left join MyTable as P2 on P2.Date = DateAdd(day, -2, Cur.Date)
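Adjust the DateAdd component of the ON clauses to match whether you want your moving average to run from strictly the past through now, or from days behind through days ahead.
- This works nicely when you need a moving average over only a few data points.
- This is not an optimal solution for moving averages over more than a few data points.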
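To try the self-join idea without a SQL Server instance, here is a minimal sketch that runs a 3-day variant against an in-memory SQLite database from Python. The SQLite functions IFNULL and date(...) are stand-ins I am assuming for ISNULL and DateAdd, and the table is loaded with the click counts from the question.

```python
import sqlite3

rows = [
    ("2012-05-01", 2230), ("2012-05-02", 3150), ("2012-05-03", 5520),
    ("2012-05-04", 1330), ("2012-05-05", 2260), ("2012-05-06", 3540),
    ("2012-05-07", 2330),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE MyTable (Date TEXT PRIMARY KEY, Clicks INTEGER)")
conn.executemany("INSERT INTO MyTable VALUES (?, ?)", rows)

# One LEFT JOIN per prior day; a missing day contributes 0 to the sum.
query = """
SELECT Cur.Date,
       (Cur.Clicks + IFNULL(P1.Clicks, 0) + IFNULL(P2.Clicks, 0)) / 3.0
FROM MyTable AS Cur
LEFT JOIN MyTable AS P1 ON P1.Date = date(Cur.Date, '-1 days')
LEFT JOIN MyTable AS P2 ON P2.Date = date(Cur.Date, '-2 days')
ORDER BY Cur.Date
"""
self_join_avg = {day: avg for day, avg in conn.execute(query)}

for day, avg in self_join_avg.items():
    print(day, round(avg, 2))
```

Because missing days are treated as 0, the values for the first two dates are not real 3-day means; they only become meaningful from the third row on.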
Answer 2:
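This is an evergreen Joe Celko question. I am ignoring which DBMS platform is used. In any case, Joe was able to answer it more than 10 years ago with standard SQL.
Quoting Joe Celko's SQL Puzzles and Answers: "That last update attempt suggests that we could use the predicate to construct a query that would give us a moving average:"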
SELECT S1.sample_time, AVG(S2.load) AS avg_prev_hour_load
FROM Samples AS S1, Samples AS S2
WHERE S2.sample_time
BETWEEN (S1.sample_time - INTERVAL 1 HOUR)
AND S1.sample_time
GROUP BY S1.sample_time;
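Is the extra column or the query approach better? The query is technically better because the UPDATE approach will denormalize the database. However, if the historical data being recorded is not going to change and computing the moving average is expensive, you might consider using the column approach.
MS SQL example: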
CREATE TABLE #TestDW
( Date1 datetime,
LoadValue Numeric(13,6)
);
INSERT INTO #TestDW VALUES('2012-06-09' , '3.540' );
INSERT INTO #TestDW VALUES('2012-06-08' , '2.260' );
INSERT INTO #TestDW VALUES('2012-06-07' , '1.330' );
INSERT INTO #TestDW VALUES('2012-06-06' , '5.520' );
INSERT INTO #TestDW VALUES('2012-06-05' , '3.150' );
INSERT INTO #TestDW VALUES('2012-06-04' , '2.230' );
SQL puzzle query:
SELECT S1.date1, AVG(S2.LoadValue) AS avg_prev_3_days
FROM #TestDW AS S1, #TestDW AS S2
WHERE S2.date1
BETWEEN DATEADD(d, -2, S1.date1 )
AND S1.date1
GROUP BY S1.date1
order by 1;
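For anyone who wants to see what the range self-join returns, the following is a rough sketch that runs the same shape of query on SQLite from Python. The table name TestDW and the use of SQLite's date(...) in place of DATEADD are my assumptions; the sample values are the ones inserted above. A COUNT(*) column is added to show how many rows fall into each window.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE TestDW (Date1 TEXT, LoadValue REAL)")
conn.executemany("INSERT INTO TestDW VALUES (?, ?)", [
    ("2012-06-09", 3.540), ("2012-06-08", 2.260), ("2012-06-07", 1.330),
    ("2012-06-06", 5.520), ("2012-06-05", 3.150), ("2012-06-04", 2.230),
])

# Every S1 row is paired with all S2 rows in its trailing 3-day range.
query = """
SELECT S1.Date1, AVG(S2.LoadValue), COUNT(*)
FROM TestDW AS S1
JOIN TestDW AS S2
  ON S2.Date1 BETWEEN date(S1.Date1, '-2 days') AND S1.Date1
GROUP BY S1.Date1
ORDER BY S1.Date1
"""
range_join = conn.execute(query).fetchall()

for day, avg, n in range_join:
    print(day, round(avg, 4), n)
```

Unlike the fixed self-join with ISNULL, this form averages over however many rows exist in the range, so the first two dates are averaged over one and two samples respectively.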
Answer 3:
select t2.date, round(sum(ct.clicks)/3) as avg_clicks
from
(select date from clickstable) as t2,
(select date, clicks from clickstable) as ct
where datediff(t2.date, ct.date) between 0 and 2
group by t2.date
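Example here.
Obviously you can change the interval to whatever you need. You could also use count() instead of a magic number to make it easier to change, but that will also slow it down.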
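The difference between dividing by the magic number and dividing by count() only shows up at the start of the series, where the window is not yet full. A small Python sketch of both variants, using the click counts from the question (the function name window_stats is just for illustration):

```python
clicks = [2230, 3150, 5520, 1330, 2260, 3540, 2330]

def window_stats(values, size=3):
    """Return (sum / size, sum / count) for each trailing window."""
    out = []
    for i in range(len(values)):
        window = values[max(0, i - size + 1): i + 1]
        total = sum(window)
        out.append((total / size, total / len(window)))
    return out

stats = window_stats(clicks)
for fixed, counted in stats:
    print(round(fixed, 2), round(counted, 2))
```

On the first row the fixed divisor gives about 743.33 while the count-based divisor gives 2230.0; from the third row on the two agree.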
Answer 4:
select *
, (select avg(c2.clicks) from #clicks_table c2
where c2.date between dateadd(dd, -2, c1.date) and c1.date) mov_avg
from #clicks_table c1
Answer 5:
Use a different join predicate:
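-- CURRENT is a reserved word in T-SQL, so the table name is bracketed
SELECT [current].date
,avg(periods.clicks)
FROM [current] left outer join [current] as periods
-- trailing window: periods covers the current day and the two days before it
ON periods.date BETWEEN dateadd(d,-2, [current].date) AND [current].date
GROUP BY [current].date HAVING COUNT(*) >= 3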
The HAVING clause prevents any dates without at least N values from being returned.
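A quick way to check the effect of the HAVING filter is to run the query on SQLite from Python. This is only a sketch: the table name clicks, the alias cur, and SQLite's date(...) in place of dateadd are my assumptions, and the window is taken as trailing (the current day plus the two days before it).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE clicks (date TEXT, clicks INTEGER)")
conn.executemany("INSERT INTO clicks VALUES (?, ?)", [
    ("2012-05-01", 2230), ("2012-05-02", 3150), ("2012-05-03", 5520),
    ("2012-05-04", 1330), ("2012-05-05", 2260), ("2012-05-06", 3540),
    ("2012-05-07", 2330),
])

# HAVING drops every date whose trailing window holds fewer than 3 rows.
query = """
SELECT cur.date, AVG(periods.clicks)
FROM clicks AS cur
LEFT JOIN clicks AS periods
  ON periods.date BETWEEN date(cur.date, '-2 days') AND cur.date
GROUP BY cur.date
HAVING COUNT(*) >= 3
ORDER BY cur.date
"""
full_windows = conn.execute(query).fetchall()

for day, avg in full_windows:
    print(day, round(avg, 2))
```

Only the five dates from 2012-05-03 onward come back; the first two dates are dropped entirely rather than returned with a NULL average.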
Answer 6:
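Assuming x is the value to be averaged and xDate is the date value:
-- Note: as a single-table predicate this compares each row's xDate with itself,
-- so the right-hand xDate needs to come from an outer row (as in Answer 4)
SELECT avg(x) FROM myTable WHERE xDate BETWEEN dateadd(d, -2, xDate) AND xDate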
Answer 7:
A general template for rolling averages that scales well to large data sets:
WITH moving_avg AS (
SELECT 0 AS [lag] UNION ALL
SELECT 1 AS [lag] UNION ALL
SELECT 2 AS [lag] UNION ALL
SELECT 3 AS [lag] --ETC
)
SELECT
DATEADD(day,[lag],[date]) AS [reference_date],
[otherkey1],[otherkey2],[otherkey3],
AVG([value1]) AS [avg_value1],
AVG([value2]) AS [avg_value2]
FROM [data_table]
CROSS JOIN moving_avg
GROUP BY [otherkey1],[otherkey2],[otherkey3],DATEADD(day,[lag],[date])
ORDER BY [otherkey1],[otherkey2],[otherkey3],[reference_date];
And for weighted rolling averages:
WITH weighted_avg AS (
SELECT 0 AS [lag], 1.0 AS [weight] UNION ALL
SELECT 1 AS [lag], 0.6 AS [weight] UNION ALL
SELECT 2 AS [lag], 0.3 AS [weight] UNION ALL
SELECT 3 AS [lag], 0.1 AS [weight] --ETC
)
SELECT
DATEADD(day,[lag],[date]) AS [reference_date],
[otherkey1],[otherkey2],[otherkey3],
AVG([value1] * [weight]) / AVG([weight]) AS [wavg_value1],
AVG([value2] * [weight]) / AVG([weight]) AS [wavg_value2]
FROM [data_table]
CROSS JOIN weighted_avg
GROUP BY [otherkey1],[otherkey2],[otherkey3],DATEADD(day,[lag],[date])
ORDER BY [otherkey1],[otherkey2],[otherkey3],[reference_date];
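The lag table trick can be easier to follow outside SQL: every data row is copied once per lag and credited to the reference date that lies lag days later, then the copies are grouped by reference date. Below is a minimal Python sketch of the weighted variant, assuming a single value column and no extra keys; the sample data is the click series from the question.

```python
from collections import defaultdict
from datetime import date, timedelta

weights = {0: 1.0, 1: 0.6, 2: 0.3, 3: 0.1}  # lag in days -> weight
data = [(date(2012, 5, 1) + timedelta(days=i), v)
        for i, v in enumerate([2230, 3150, 5520, 1330, 2260, 3540, 2330])]

def weighted_moving_avg(rows, lag_weights):
    num = defaultdict(float)  # sum of value * weight per reference date
    den = defaultdict(float)  # sum of weight per reference date
    for day, value in rows:
        for lag, weight in lag_weights.items():  # the CROSS JOIN
            ref = day + timedelta(days=lag)       # DATEADD(day, lag, date)
            num[ref] += value * weight
            den[ref] += weight
    # AVG(v * w) / AVG(w) equals SUM(v * w) / SUM(w) within a group
    return {ref: num[ref] / den[ref] for ref in sorted(num)}

wma = weighted_moving_avg(data, weights)
for ref, value in wma.items():
    print(ref, round(value, 2))
```

Note that the template also produces reference dates after the last data row (here up to 2012-05-10), because older rows still contribute at lags 1 to 3; filter those out if they are not wanted.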
Answer 8:
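For this purpose, I would create an auxiliary/dimensional date table like
create table date_dim(date date, date_1 date, date_2 date, date_3 date ...)
where date is the key, date_1 is this day, date_2 contains this day and the day before; date_3 ...
Then you can do the equi-join in Hive.
Using a view like: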
select date, date from date_dim
union all
select date, date_add(date, -1) from date_dim
union all
select date, date_add(date, -2) from date_dim
union all
select date, date_add(date, -3) from date_dim
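To illustrate why the helper view turns the range condition into a plain equality match, here is a small Python sketch. The function names build_date_pairs and moving_avg_via_pairs are hypothetical, and I am assuming a lookback of 2 days to match the 3-day window in the question (the view above goes back 3 days).

```python
from datetime import date, timedelta

clicks = {date(2012, 5, 1) + timedelta(days=i): c
          for i, c in enumerate([2230, 3150, 5520, 1330, 2260, 3540, 2330])}

def build_date_pairs(dates, lookback=2):
    """Stand-in for the view: one (date, window_date) row per offset."""
    return [(d, d - timedelta(days=k)) for d in dates for k in range(lookback + 1)]

def moving_avg_via_pairs(table, lookback=2):
    sums, counts = {}, {}
    for d, window_date in build_date_pairs(sorted(table), lookback):
        if window_date in table:  # equality lookup, like an equi-join
            sums[d] = sums.get(d, 0) + table[window_date]
            counts[d] = counts.get(d, 0) + 1
    return {d: sums[d] / counts[d] for d in sums}

pair_avg = moving_avg_via_pairs(clicks)
for d, avg in pair_avg.items():
    print(d, round(avg, 2))
```

The pairs table grows by a factor of lookback + 1, which is the price paid for avoiding a range join.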
Answer 9:
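NOTE: THIS IS NOT AN ANSWER, but an enhanced code sample of Diego Scaravaggi's answer. I am posting it as an answer because the comment section is insufficient. Note that I have parameterized the period of the moving average.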
declare @p int = 3
declare @t table(d int, bal float)
insert into @t values
(1,94),
(2,99),
(3,76),
(4,74),
(5,48),
(6,55),
(7,90),
(8,77),
(9,16),
(10,19),
(11,66),
(12,47)
select a.d, avg(b.bal)
from
@t a
left join @t b on b.d between a.d-(@p-1) and a.d
group by a.d
Answer 10:
-- @p1 is the period of the moving average, @o1 is the offset
declare @p1 as int
declare @o1 as int
set @p1 = 5;
set @o1 = 3;
with np as(
select *, rank() over(partition by cmdty, tenor order by markdt) as r
from p_prices p1
where
1=1
)
, x1 as (
select s1.*, avg(s2.val) as avgval from np s1
inner join np s2
on s1.cmdty = s2.cmdty and s1.tenor = s2.tenor
and s2.r between s1.r - (@p1 - 1) - (@o1) and s1.r - (@o1)
group by s1.cmdty, s1.tenor, s1.markdt, s1.val, s1.r
)
-- a CTE must be followed by a statement; return the averaged rows
select * from x1;
Answer 11:
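I'm not sure that your expected result (output) shows a classic "simple moving (rolling) average" for 3 days. For example, by definition the first three numbers give: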
ThreeDaysMovingAverage = (2.230 + 3.150 + 5.520) / 3 = 3.6333333
but you expect 4.360, which is confusing.
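The same check can be done for every row. A short Python sketch comparing the plain 3-day mean with the figures from the question, reading the values as decimals the way the calculation above does:

```python
clicks = [2.230, 3.150, 5.520, 1.330, 2.260, 3.540, 2.330]
expected = [None, None, 4.360, 3.330, 3.120, 3.320, 3.010]  # from the question

# Simple 3-day moving average: mean of the current value and the two before it.
actual = [None if i < 2 else sum(clicks[i - 2: i + 1]) / 3
          for i in range(len(clicks))]

for want, got in zip(expected, actual):
    print(want, None if got is None else round(got, 3))
```

Only the 2012-05-04 row comes close to the expected figure (3.333 against 3.330); the other rows differ noticeably, so the expected column does not appear to be a simple 3-day mean.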
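Nevertheless, I suggest the following solution, which uses the window function AVG. This approach is much more efficient (clearer and less resource-intensive) than the SELF-JOIN introduced in other answers (and I'm surprised that no one has given a better solution).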
-- Oracle-SQL dialect
with
data_table as (
select date '2012-05-01' AS dt, 2.230 AS clicks from dual union all
select date '2012-05-02' AS dt, 3.150 AS clicks from dual union all
select date '2012-05-03' AS dt, 5.520 AS clicks from dual union all
select date '2012-05-04' AS dt, 1.330 AS clicks from dual union all
select date '2012-05-05' AS dt, 2.260 AS clicks from dual union all
select date '2012-05-06' AS dt, 3.540 AS clicks from dual union all
select date '2012-05-07' AS dt, 2.330 AS clicks from dual
),
param as (select 3 days from dual)
select
dt AS "Date",
clicks AS "Clicks",
case when rownum >= p.days then
avg(clicks) over (order by dt
rows between p.days - 1 preceding and current row)
end
AS "3 day Moving Average"
from data_table t, param p;
You can see that AVG is wrapped in case when rownum >= p.days then to force NULLs in the first rows, where the "3 day Moving Average" is meaningless.
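The frame rows between 2 preceding and current row can be pictured as a window that slides over the ordered rows, which is why a single pass is enough. The following Python sketch mimics that idea with a running sum; it is an illustration of the concept, not a description of how Oracle evaluates the window internally, and the function name rolling_mean is made up for the example.

```python
from collections import deque

def rolling_mean(values, days=3):
    """One pass over ordered values; None until the window is full."""
    window = deque()
    running = 0
    out = []
    for v in values:
        window.append(v)
        running += v
        if len(window) > days:
            running -= window.popleft()  # drop the row that left the frame
        out.append(running / days if len(window) == days else None)
    return out

clicks = [2230, 3150, 5520, 1330, 2260, 3540, 2330]
means = rolling_mean(clicks)
print(means)
```

Each row is added once and removed once, so the work grows linearly with the number of rows, whereas a range self-join has to match every row against its neighbours.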
Answer 12:
In Hive, maybe you could try
select date, clicks, avg(clicks) over (order by date rows between 2 preceding and current row) as moving_avg from clicktable;
Answer 13:
We can apply Joe Celko's "dirty" left outer join method (as cited above by Diego Scaravaggi) to answer the question as it was asked.
declare @ClicksTable table ([Date] date, Clicks int)
insert into @ClicksTable
select '2012-05-01', 2230 union all
select '2012-05-02', 3150 union all
select '2012-05-03', 5520 union all
select '2012-05-04', 1330 union all
select '2012-05-05', 2260 union all
select '2012-05-06', 3540 union all
select '2012-05-07', 2330
This query:
SELECT
T1.[Date],
T1.Clicks,
-- AVG ignores NULL values so we have to explicitly NULLify
-- the days when we don't have a full 3-day sample
CASE WHEN count(T2.[Date]) < 3 THEN NULL
ELSE AVG(T2.Clicks)
END AS [3-Day Moving Average]
FROM @ClicksTable T1
LEFT OUTER JOIN @ClicksTable T2
ON T2.[Date] BETWEEN DATEADD(d, -2, T1.[Date]) AND T1.[Date]
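-- T1.Clicks is selected without an aggregate, so it must be grouped too
GROUP BY T1.[Date], T1.Clicks
ORDER BY T1.[Date]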
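returns output in the requested layout. The averages shown are the values the query computes (AVG over an int column truncates in T-SQL); they differ from the figures in the question, which are not plain 3-day means:
Date Clicks 3-Day Moving Average
2012-05-01 2,230
2012-05-02 3,150
2012-05-03 5,520 3,633
2012-05-04 1,330 3,333
2012-05-05 2,260 3,036
2012-05-06 3,540 2,376
2012-05-07 2,330 2,710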
Source: SQL moving average