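Let's imagine you have the following table, called Table1, of orders in chronological order, returned from an inline UDF. Note that the OrderID values may be out of sequence, so I have intentionally included an anomaly (I have not shown the date field, but I do have access to that column if it makes things easier for you).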
OrderID  BuySell  FilledSize  ExecutionPrice  RunningTotal  AverageBookCost  RealisedPnL
339      Buy      2           24.5            NULL          NULL             NULL
375      Sell     3           23.5            NULL          NULL             NULL
396      Sell     3           20.5            NULL          NULL             NULL
416      Sell     1           16.4            NULL          NULL             NULL
405      Buy      4           18.2            NULL          NULL             NULL
421      Sell     1           16.7            NULL          NULL             NULL
432      Buy      3           18.6            NULL          NULL             NULL
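I have a function that I would like to apply recursively, from the top row to the bottom, to calculate the three NULL columns; the inputs to each call are the outputs of the previous call. The function is called mfCalc_RunningTotalBookCostPnL and is shown below: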
CREATE FUNCTION [fMath].[mfCalc_RunningTotalBookCostPnL](
    @BuySell         VARCHAR(4),
    @FilledSize      DECIMAL(31,15),
    @ExecutionPrice  DECIMAL(31,15),
    @OldRunningTotal DECIMAL(31,15),
    @OldBookCost     DECIMAL(31,15)
)
RETURNS @ReturnTable TABLE(
    NewRunningTotal    DECIMAL(31,15),
    NewBookCost        DECIMAL(31,15),
    PreMultRealisedPnL DECIMAL(31,15)
)
AS
BEGIN
    DECLARE @SignedFilledSize   DECIMAL(31,15),
            @NewRunningTotal    DECIMAL(31,15),
            @NewBookCost        DECIMAL(31,15),
            @PreMultRealisedPnL DECIMAL(31,15)
    SET @SignedFilledSize = fMath.sfSignedSize(@BuySell, @FilledSize)
    SET @NewRunningTotal = @OldRunningTotal + @SignedFilledSize
    SET @PreMultRealisedPnL = 0
    IF SIGN(@SignedFilledSize) = SIGN(@OldRunningTotal)
        -- This trade is adding to the existing position.
        SET @NewBookCost = (@SignedFilledSize * @ExecutionPrice +
                            @OldRunningTotal * @OldBookCost) / (@NewRunningTotal)
    ELSE
    BEGIN
        -- This trade is reversing the existing position.
        -- This could be buying when short or selling when long.
        DECLARE @AbsClosedSize DECIMAL(31,15)
        SET @AbsClosedSize = fMath.sfMin(ABS(@SignedFilledSize), ABS(@OldRunningTotal));
        -- PnL must be crystallised.
        SET @PreMultRealisedPnL = (@ExecutionPrice - @OldBookCost) * @AbsClosedSize * SIGN(-@SignedFilledSize)
        -- Work out the NewBookCost
        SET @NewBookCost = CASE
            WHEN ABS(@SignedFilledSize) < ABS(@OldRunningTotal) THEN @OldBookCost
            WHEN ABS(@SignedFilledSize) = ABS(@OldRunningTotal) THEN 0
            WHEN ABS(@SignedFilledSize) > ABS(@OldRunningTotal) THEN @ExecutionPrice
        END
    END
    -- Insert values into the return table
    INSERT INTO @ReturnTable
    VALUES (@NewRunningTotal, @NewBookCost, @PreMultRealisedPnL)
    -- Return
    RETURN
END
So the T-SQL command I'm looking for (I don't mind if someone comes up with an OUTER APPLY as well) should generate the following result set:
OrderID  BuySell  FilledSize  ExecutionPrice  RunningTotal  AverageBookCost  RealisedPnL
339      Buy      2           24.5            2             24.5             0
375      Sell     3           23.5            -1            23.5             -2
396      Sell     3           20.5            -4            21.25            0
416      Sell     1           16.4            -5            20.28            0
405      Buy      4           18.2            -1            20.28            8.32
421      Sell     1           16.7            -2            18.49            0
432      Buy      3           18.6            1             18.6             -0.22
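A few notes: the function above calls a trivial function, fMath.sfSignedSize, which simply makes ('Sell', 3) = -3. Also, for the avoidance of doubt, I would expect the solution to make the following calls in this order, assuming my calculations are correct! (Note that I start off assuming OldRunningTotal and OldBookCost are both zero):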
SELECT * FROM fMath.mfCalc_RunningTotalBookCostPnL('Buy',2,24.5,0,0)
SELECT * FROM fMath.mfCalc_RunningTotalBookCostPnL('Sell',3,23.5,2,24.5)
SELECT * FROM fMath.mfCalc_RunningTotalBookCostPnL('Sell',3,20.5,-1,23.5)
SELECT * FROM fMath.mfCalc_RunningTotalBookCostPnL('Sell',1,16.4,-4,21.25)
SELECT * FROM fMath.mfCalc_RunningTotalBookCostPnL('Buy',4,18.2,-5,20.28)
SELECT * FROM fMath.mfCalc_RunningTotalBookCostPnL('Sell',1,16.7,-1,20.28)
SELECT * FROM fMath.mfCalc_RunningTotalBookCostPnL('Buy',3,18.6,-2,18.49)
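To run these calls, the two scalar helpers that the function depends on, fMath.sfSignedSize and fMath.sfMin, also need to exist; neither is shown here. A minimal sketch of what they might look like, assuming the fMath schema already exists and that anything other than 'Sell' counts as a buy (both bodies are guesses for illustration, not the original code):

```sql
-- Hypothetical definition: returns the size negated for a sell, unchanged otherwise.
CREATE FUNCTION fMath.sfSignedSize(@BuySell VARCHAR(4), @FilledSize DECIMAL(31,15))
RETURNS DECIMAL(31,15)
AS
BEGIN
    RETURN CASE WHEN @BuySell = 'Sell' THEN -@FilledSize ELSE @FilledSize END;
END
GO

-- Hypothetical definition: returns the smaller of two values.
CREATE FUNCTION fMath.sfMin(@a DECIMAL(31,15), @b DECIMAL(31,15))
RETURNS DECIMAL(31,15)
AS
BEGIN
    RETURN CASE WHEN @a < @b THEN @a ELSE @b END;
END
GO
```

Each CREATE FUNCTION has to be the first statement in its batch, hence the GO separators.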
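Obviously, [fMath].[mfCalc_RunningTotalBookCostPnL] may need to be tweaked so that it can start off with NULL entries for OldRunningTotal and OldBookCost, but that is trivially done. Expressing the recursive nature of the problem in set-based SQL is a little harder.
Many thanks, Bertie.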
Answer 1:
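This is a bit of a stab in the dark, without a fully functioning [fMath].[mfCalc_RunningTotalBookCostPnL] to test against. My track record of getting recursive CTEs right the first time, before testing, is only about 50%, but even if it is not perfect it should be enough to get you started, if I understand your requirements correctly: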
-- First, cache Table1 into #temp to improve recursive CTE performance.
-- NOTE: rows are numbered by OrderID here; the question says OrderID may be out
-- of chronological order, so order by the date column instead if that applies.
SELECT
    RowNum = ROW_NUMBER() OVER (ORDER BY OrderID)
  , *
INTO #temp
FROM Table1;
GO
; WITH CTE (RowNum, OrderID, BuySell, FilledSize, ExecutionPrice, RunningTotal, AverageBookCost, RealisedPnL) AS (
    -- Anchor: run the first row through the function, starting from a flat position (0, 0).
    -- This also keeps the column types identical to those of the recursive member.
    SELECT t.RowNum, t.OrderID, t.BuySell, t.FilledSize, t.ExecutionPrice
         , RunningTotal=c.NewRunningTotal, AverageBookCost=c.NewBookCost, RealisedPnL=c.PreMultRealisedPnL
    FROM #temp t
    CROSS APPLY [fMath].[mfCalc_RunningTotalBookCostPnL](t.BuySell, t.FilledSize, t.ExecutionPrice, 0, 0) AS c
    WHERE t.RowNum = 1
    UNION ALL
    -- Recursive member: feed the previous row's outputs into the next call.
    SELECT t.RowNum, t.OrderID, t.BuySell, t.FilledSize, t.ExecutionPrice
         , RunningTotal=c.NewRunningTotal, AverageBookCost=c.NewBookCost, RealisedPnL=c.PreMultRealisedPnL
    FROM #temp t
    INNER JOIN CTE ON CTE.RowNum + 1 = t.RowNum
    CROSS APPLY [fMath].[mfCalc_RunningTotalBookCostPnL](t.BuySell, t.FilledSize, t.ExecutionPrice, CTE.RunningTotal, CTE.AverageBookCost) AS c
)
SELECT OrderID, BuySell, FilledSize, ExecutionPrice, RunningTotal, AverageBookCost, RealisedPnL
FROM CTE
/* Replace the above SELECT with the following after testing ok
UPDATE tab
SET RunningTotal=CTE.RunningTotal
  , AverageBookCost=CTE.AverageBookCost
  , RealisedPnL=CTE.RealisedPnL
FROM Table1 tab
INNER JOIN CTE ON CTE.OrderID=tab.OrderID
*/
OPTION (MAXRECURSION 32767);
GO
-- clean up
DROP TABLE #temp;
GO
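One more disclaimer: the MAXRECURSION hint accepts values up to 32767, which is the depth used above. If that is too restrictive, you can specify MAXRECURSION 0 to remove the limit, or explore a different method or some sort of windowing over the data set.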
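As a quick way to see the recursion limit in action, here is a minimal, self-contained sketch (the CTE name N and the row count are arbitrary choices for illustration). It recurses past 32767 levels, which only works because MAXRECURSION 0 lifts the cap:

```sql
-- Generate the numbers 1..40000 with a recursive CTE.
WITH N(i) AS (
    SELECT 1
    UNION ALL
    SELECT i + 1 FROM N WHERE i < 40000
)
SELECT COUNT(*) AS RowsGenerated
FROM N
OPTION (MAXRECURSION 0);  -- 0 = no limit; the default limit is 100
```

Without the OPTION clause the statement stops with an error once the default limit of 100 recursions is exceeded.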
Answer 2:
Running total: UPDATE on a table variable vs. a recursive CTE
create table Test(
OrderID int primary key,
Qty int not null
);
declare @i int = 1;
while @i <= 5000 begin
insert into Test(OrderID, Qty) values (@i * 2,rand() * 10);
set @i = @i + 1;
end;
The recursive solution takes 9 seconds:
with T AS
(
select ROW_NUMBER() over(order by OrderID) as rn, * from test
)
,R(Rn, OrderId, Qty, RunningTotal) as
(
select Rn, OrderID, Qty, Qty
from t
where rn = 1
union all
select t.Rn, t.OrderId, t.Qty, p.RunningTotal + t.Qty
from t t
join r p on t.rn = p.rn + 1
)
select R.OrderId, R.Qty, R.RunningTotal from r
option(maxrecursion 0);
The UPDATE approach takes 0 seconds:
create function TestRunningTotal()
returns @ReturnTable table(
OrderId int, Qty int, RunningTotal int
)
as begin
insert into @ReturnTable(OrderID, Qty, RunningTotal)
select OrderID, Qty, 0 from Test
order by OrderID;
declare @RunningTotal int = 0;
update @ReturnTable set
RunningTotal = @RunningTotal,
@RunningTotal = @RunningTotal + Qty;
return;
end;
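These two approaches should at least give you a framework on which to build your query.
BTW, in SQL Server, unlike in MySQL, the order of the assignments within the statement doesn't matter. This: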
update @ReturnTable set
RunningTotal = @RunningTotal,
@RunningTotal = @RunningTotal + Qty;
and the following:
update @ReturnTable set
@RunningTotal = @RunningTotal + Qty,
RunningTotal = @RunningTotal;
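both execute the same way, i.e. the variable assignment happens first, regardless of its position in the statement. Both queries produce the same output: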
OrderId Qty RunningTotal
----------- ----------- ------------
2 4 4
4 8 12
6 4 16
8 5 21
10 3 24
12 8 32
14 2 34
16 9 43
18 1 44
20 2 46
22 0 46
24 2 48
26 6 54
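SQL Server's UPDATE also accepts the compound form SET @variable = column = expression, which assigns the same value to the variable and the column in one step and so sidesteps the question of assignment order. A minimal sketch on a throwaway table variable (the name @T and the three sample rows are made up for illustration); like the statements above, it still depends on rows being visited in key order, which is not documented behaviour:

```sql
DECLARE @T TABLE (OrderID INT PRIMARY KEY, Qty INT NOT NULL, RunningTotal INT NOT NULL);
INSERT INTO @T (OrderID, Qty, RunningTotal) VALUES (2, 4, 0), (4, 8, 0), (6, 4, 0);

DECLARE @RunningTotal INT = 0;

-- Variable and column both receive the new running total.
UPDATE @T
SET @RunningTotal = RunningTotal = @RunningTotal + Qty;

SELECT OrderID, Qty, RunningTotal FROM @T ORDER BY OrderID;
```

If the rows are visited in OrderID order, the running totals come out as 4, 12 and 16.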
On your exact table, just detect Buy/Sell: you can either multiply by 1 and -1 respectively, or simply sign the field, e.g.:
update @ReturnTable set
@RunningTotal = @RunningTotal +
CASE WHEN BuySell = 'Buy' THEN Qty ELSE -Qty END,
RunningTotal = @RunningTotal;
If you happen to upgrade to SQL Server 2012, here is the straightforward implementation of a running total:
select OrderID, Qty, sum(Qty) over(order by OrderID) as RunningTotal
from Test
For your exact problem:
select OrderID, Qty,
sum(CASE WHEN BuySell = 'Buy' THEN Qty ELSE -Qty END)
over(order by OrderID) as RunningTotal
from Test;
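To make that concrete with the orders from the question, here is a self-contained sketch for SQL Server 2012 or later. The Seq column is an assumption standing in for the date field, since OrderID is not in chronological order there:

```sql
DECLARE @Orders TABLE (
    Seq INT PRIMARY KEY,          -- hypothetical chronological sequence
    OrderID INT, BuySell VARCHAR(4), FilledSize INT, ExecutionPrice DECIMAL(9,2)
);
INSERT INTO @Orders (Seq, OrderID, BuySell, FilledSize, ExecutionPrice) VALUES
    (1, 339, 'Buy',  2, 24.5), (2, 375, 'Sell', 3, 23.5), (3, 396, 'Sell', 3, 20.5),
    (4, 416, 'Sell', 1, 16.4), (5, 405, 'Buy',  4, 18.2), (6, 421, 'Sell', 1, 16.7),
    (7, 432, 'Buy',  3, 18.6);

-- Signed running position, accumulated in chronological order.
SELECT OrderID, BuySell, FilledSize,
       SUM(CASE WHEN BuySell = 'Buy' THEN FilledSize ELSE -FilledSize END)
           OVER (ORDER BY Seq ROWS UNBOUNDED PRECEDING) AS RunningTotal
FROM @Orders
ORDER BY Seq;
```

This yields running totals of 2, -1, -4, -5, -1, -2, 1. Note that only the running total falls out of a plain windowed SUM; the book cost and realised PnL depend on the previous row's computed state, so they still need one of the sequential approaches.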
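UPDATE
If you feel uneasy about the quirky update, you can add a guard clause that checks whether the order in which rows are updated matches the original order (aided by identity(1,1)):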
create function TestRunningTotalGuarded()
returns @ReturnTable table(
OrderId int, Qty int,
RunningTotal int not null,
RN int identity(1,1) not null
)
as begin
insert into @ReturnTable(OrderID, Qty, RunningTotal)
select OrderID, Qty, 0 from Test
order by OrderID;
declare @RunningTotal int = 0;
declare @RN_check INT = 0;
update @ReturnTable set
@RN_check = @RN_check + 1,
@RunningTotal =
(case when RN = @RN_check then @RunningTotal + Qty else 1/0 end),
RunningTotal = @RunningTotal;
return;
end;
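If UPDATE really does update rows in an unpredictable order (or if there is any chance it will), @RN_check will no longer equal RN (the identity order), and the code will raise a divide-by-zero error. With the guard clause, an unpredictable update order fails fast; if that ever happens, it will be time to file a bug petition with Microsoft to make the quirky update not so quirky :-)
The guard clause is a hedge on the assumption that the inherently imperative operation (variable assignment) really is sequential.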
Answer 3:
I reworked the running total queries to include a partition (by customer).
CTE approach:
with T AS
(
select
ROW_NUMBER() over(partition by CustomerCode order by OrderID) as rn, *
from test
)
,R(CustomerCode, Rn, OrderId, Qty, RunningTotal) as
(
select CustomerCode, Rn, OrderID, Qty, Qty
from t
where rn = 1
union all
select t.CustomerCode, t.Rn, t.OrderId, t.Qty, p.RunningTotal + t.Qty
from t t
join r p on p.CustomerCode = t.CustomerCode and t.rn = p.rn + 1
)
select R.CustomerCode, R.OrderId, R.Qty, R.RunningTotal from r
order by R.CustomerCode, R.OrderId
option(maxrecursion 0);
Quirky update approach:
create function TestRunningTotalGuarded()
returns @ReturnTable table(
CustomerCode varchar(50), OrderId int, Qty int,
RunningTotal int not null, RN int identity(1,1) not null
)
as begin
insert into @ReturnTable(CustomerCode, OrderID, Qty, RunningTotal)
select CustomerCode, OrderID, Qty, 0 from Test
order by CustomerCode, OrderID;
declare @RunningTotal int;
declare @RN_check INT = 0;
declare @PrevCustomerCode varchar(50) = NULL;
update @ReturnTable set
@RN_check = @RN_check + 1,
@RunningTotal =
(case when RN = @RN_check then
case when @PrevCustomerCode = CustomerCode then
@RunningTotal + Qty
else
Qty
end
else
1/0
end),
@PrevCustomerCode = CustomerCode,
RunningTotal = @RunningTotal;
return;
end;
Cursor approach (code compacted to avoid a scrollbar):
create function TestRunningTotalCursor()
returns @ReturnTable table(CustomerCode varchar(50), OrderId int,
Qty int, RunningTotal int not null) as
begin
declare @c_CustomerCode varchar(50);
declare @c_OrderID int;
declare @c_qty int;
declare @PrevCustomerCode varchar(50) = null;
declare @RunningTotal int = 0;
declare o_cur cursor for
select CustomerCode, OrderID, Qty from Test order by CustomerCode, OrderID;
open o_cur;
fetch next from o_cur into @c_CustomerCode, @c_OrderID, @c_Qty;
while @@FETCH_STATUS = 0 begin
if @c_CustomerCode = @PrevCustomerCode begin
set @RunningTotal = @RunningTotal + @c_qty;
end else begin
set @RunningTotal = @c_Qty;
end;
set @PrevCustomerCode = @c_CustomerCode;
insert into @ReturnTable(CustomerCode, OrderId, Qty, RunningTotal)
values(@c_CustomerCode, @c_OrderID, @c_Qty, @RunningTotal);
fetch next from o_cur into @c_CustomerCode, @c_OrderID, @c_Qty;
end;
close o_cur; deallocate o_cur; return;
end;
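Metrics on 5,000 rows:
* Recursive CTE : 49 seconds
* Quirky Update : 0 seconds
* Cursor : 0 seconds
Those 0 seconds are not meaningful. After I bumped the row count to 50,000, here are the metrics:
* Quirky Update : 1 second
* Cursor : 3 seconds
* Recursive CTE : An hour
Caveat: I found that the quirky update really is quirky; sometimes it works and sometimes it doesn't (indicated by a divide-by-zero error on one out of five runs of the query).
Here is the DDL for the data: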
create table Test(
OrderID int primary key,
CustomerCode varchar(50),
Qty int not null
);
declare @i int = 1;
while @i <= 20 begin
insert into Test(OrderID, CustomerCode, Qty) values (
@i * 2
,case @i % 4
when 0 then 'JOHN'
when 1 then 'PAUL'
when 2 then 'GEORGE'
when 3 then 'RINGO'
end
,rand() * 10);
set @i = @i + 1;
end;
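UPDATE
Apparently, the pure CTE approach is no good; a hybrid approach must be used. When the row numbering is materialized into an actual table, the speed goes up: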
select ROW_NUMBER() over(partition by CustomerCode order by OrderID) as rn, * into #xxx
from test;
with T AS
(
select * from #xxx
)
,R(CustomerCode, Rn, OrderId, Qty, RunningTotal) as
(
select CustomerCode, Rn, OrderID, Qty, Qty
from t
where rn = 1
union all
select t.CustomerCode, t.Rn, t.OrderId, t.Qty, p.RunningTotal + t.Qty
from t t
join r p on p.CustomerCode = t.CustomerCode and t.rn = p.rn + 1
)
select R.CustomerCode, R.OrderId, R.Qty, R.RunningTotal from r
order by R.CustomerCode, R.OrderId
option(maxrecursion 0);
drop table #xxx;
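To summarize, here are the metrics before converting the pure CTE to use materialized row numbering (i.e. with the row-numbered results stored in an actual table, here a temporary table):
* Quirky Update : 1 second
* Cursor : 3 seconds
* Recursive CTE (Pure) : An hour
After materializing the row numbering into a temporary table:
* Quirky Update : 1 second
* Cursor : 3 seconds
* Recursive CTE (Hybrid) : 2 seconds (inclusive of materializing the row-numbering table)
The hybrid recursive CTE approach is actually faster than the cursor approach.
Another UPDATE
Just by putting a clustered primary key on the sequential column, the UPDATE visits rows in their physical order. No more divide-by-zero errors (the guard clause's signal of a non-sequential update) occur. For example: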
alter function TestRunningTotalGuarded()
returns @ReturnTable table(
CustomerCode varchar(50), OrderId int, Qty int,
RunningTotal int not null,
RN int identity(1,1) not null primary key clustered
)
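I tried running the quirky update (with the clustered primary key in place) 100 times to look for corner cases and have found none so far; I have not encountered a single divide-by-zero error. Read the conclusion at the bottom of this blog post: http://www.ienablemuch.com/2012/05/recursive-cte-is-evil-and-cursor-is.html
And it is still fast even with the clustered primary key in place.
Here are the metrics for 100,000 rows:
* Quirky Update : 3 seconds
* Hybrid Recursive CTE : 5 seconds
* Cursor : 6 seconds
The quirky update (which is not so quirky after all) is still fast. It is faster than the hybrid recursive CTE.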
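The benchmarks above only cover a plain running total. As a rough sketch of how a row-by-row approach could be applied to the original book-cost problem, the following walks the question's seven orders with a WHILE loop and inlines the function's logic rather than calling it. The Seq column is an assumption standing in for the date field, and the variable names are made up for illustration:

```sql
DECLARE @Orders TABLE (
    Seq INT PRIMARY KEY,   -- hypothetical chronological sequence
    OrderID INT, BuySell VARCHAR(4),
    FilledSize DECIMAL(31,15), ExecutionPrice DECIMAL(31,15),
    RunningTotal DECIMAL(31,15), AverageBookCost DECIMAL(31,15),
    RealisedPnL DECIMAL(31,15)
);
INSERT INTO @Orders (Seq, OrderID, BuySell, FilledSize, ExecutionPrice) VALUES
    (1, 339, 'Buy',  2, 24.5), (2, 375, 'Sell', 3, 23.5), (3, 396, 'Sell', 3, 20.5),
    (4, 416, 'Sell', 1, 16.4), (5, 405, 'Buy',  4, 18.2), (6, 421, 'Sell', 1, 16.7),
    (7, 432, 'Buy',  3, 18.6);

DECLARE @Seq INT = 1, @MaxSeq INT,
        @Total DECIMAL(31,15) = 0, @Cost DECIMAL(31,15) = 0,  -- start flat
        @Signed DECIMAL(31,15), @Price DECIMAL(31,15),
        @Closed DECIMAL(31,15), @PnL DECIMAL(31,15);
SELECT @MaxSeq = MAX(Seq) FROM @Orders;

WHILE @Seq <= @MaxSeq
BEGIN
    SELECT @Signed = CASE WHEN BuySell = 'Buy' THEN FilledSize ELSE -FilledSize END,
           @Price  = ExecutionPrice
    FROM @Orders WHERE Seq = @Seq;

    SET @PnL = 0;
    IF SIGN(@Signed) = SIGN(@Total)
    BEGIN
        -- Adding to the position: size-weighted average of old and new cost.
        SET @Cost = (@Signed * @Price + @Total * @Cost) / (@Total + @Signed);
    END
    ELSE
    BEGIN
        -- Reducing or reversing the position: realise PnL on the closed size.
        SET @Closed = CASE WHEN ABS(@Signed) < ABS(@Total) THEN ABS(@Signed) ELSE ABS(@Total) END;
        SET @PnL  = (@Price - @Cost) * @Closed * SIGN(-@Signed);
        SET @Cost = CASE WHEN ABS(@Signed) < ABS(@Total) THEN @Cost
                         WHEN ABS(@Signed) = ABS(@Total) THEN 0
                         ELSE @Price END;
    END
    SET @Total = @Total + @Signed;

    UPDATE @Orders
    SET RunningTotal = @Total, AverageBookCost = @Cost, RealisedPnL = @PnL
    WHERE Seq = @Seq;

    SET @Seq += 1;
END

SELECT OrderID, BuySell, FilledSize, ExecutionPrice,
       RunningTotal, AverageBookCost, RealisedPnL
FROM @Orders ORDER BY Seq;
```

Because each row is processed explicitly in Seq order, this does not depend on undocumented update ordering. For the last order the formula gives (18.6 - 18.49) × 2 × -1 = -0.22 as the realised PnL.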
Source: Can this Recursive Solution be written up into a T-SQL Query using CTE or OVER?