sql self join CTE or other way

2019-09-11 17:20发布

I need to create a column which shows the difference between the average of two consecutive days of a table, I tried to do this with a CTE query:

with aggregate
as
(
select date  
        ,y
        ,z
       ,avg(x) as vwap  
from table a  
where z = 1
group by date,y,z
)
select aggregate.date
       ,aggregate.vwap - aggregate2.vwap
from aggregate
inner join aggregate aggregate2
on date = dateadd(day,-1,aggregate2.date) and aggregate.y = aggregate2.y

this Query takes 29 second to run, while the first select takes only 2 second and returns only 2000 rows.

what is a more efficient way to do this? does it maybe make sense to create a view of the first query?

Thanks!

So I found out the lag/lead is not an option as my server is actually from 2008 and only the management studio is 2012, anyone have an idea how to do this efficient?

3条回答
2楼-- · 2019-09-11 17:43

The problem was I think that for every join or aggregate in the final select the whole CTE was rerun. When I created a temporary table of the CTE results and than run my Inner join final query on that the speed increased by factor 10.

查看更多
啃猪蹄的小仙女
3楼-- · 2019-09-11 18:02

You can use new SQL analytic functions SQL Lead() and Lag() function introduced with SQL Server 2012

Please check following SELECT statement

;with [aggregate] as (
    select 
        [date], avg(x) as vwap
    from table a  
    group by date,y,z
)
select
    [date],
    vwap,
    previous = lag(vwap,1,null) over (order by date),
    [next] = lead(vwap,1,null) over (order by date)
from [aggregate]

You can improve the above SELECT with adding difference calculation too

;with [aggregate] as (
    select 
        [date], avg(x) as vwap
    from table a  
    group by date,y,z
)
select
    [date],
    vwap,
    previous = lag(vwap,1,null) over (order by date),
    prev_diff = vwap - (lag(vwap,1,null) over (order by date)),
    [next] = lead(vwap,1,null) over (order by date),
    next_diff = vwap - (lead(vwap,1,null) over (order by date))
from [aggregate]
查看更多
不美不萌又怎样
4楼-- · 2019-09-11 18:05

If you are using a SQL Server version which is before SQL Server 2012, you can use CTE's joining itself instead of LEAD() and LAG() analytic functions

Here is a sample

;with [aggregate] as (
    select 
        ROW_NUMBER() OVER (Order By [date]) rn,
        [date], 
        avg(x) as vwap
    from a  
    group by date--,y,z
)
select
    curr.[date],
    curr.vwap,
    previous = prev.vwap,
    prev_diff = curr.vwap - prev.vwap,
    [next] = nxt.vwap,
    next_diff = curr.vwap - nxt.vwap
from [aggregate] curr
left join [aggregate] prev on curr.rn = (prev.rn + 1)
left join [aggregate] nxt on (curr.rn + 1) = nxt.rn

You can find a similar case explained in following SQL tutorial: Select Next and Previous Row with Current Row using SQL CTE and Row_Number() function

One important note, in your query I see group by clause with additional fields.

You should be adding those in Row_Number() code instead of

ROW_NUMBER() OVER (Order By [date]) rn,

as follows

ROW_NUMBER() OVER (Partition By y,z Order By [date]) rn,
查看更多
登录 后发表回答