Question:
I am trying to perform the following calculation.
Sample data:
CREATE TABLE #Table1
(
rno int identity(1,1),
ccp varchar(50),
[col1] INT,
[col2] INT,
[col3] INT,
col4 as [col2]/100.0
);
INSERT INTO #Table1
(ccp,[col1],[col2],[col3])
VALUES ('ccp1',15,10,1100),
('ccp1',20,10,1210),
('ccp1',30,10,1331),
('ccp2',10,15,900),
('ccp2',15,15,1000),
('ccp2',20,15,1010)
+-----+------+------+------+------+----------+
| rno | ccp | col1 | col2 | col3 | col4 |
+-----+------+------+------+------+----------+
| 1 | ccp1 | 15 | 10 | 1100 | 0.100000 |
| 2 | ccp1 | 20 | 10 | 1210 | 0.100000 |
| 3 | ccp1 | 30 | 10 | 1331 | 0.100000 |
| 4 | ccp2 | 10 | 15 | 900 | 0.150000 |
| 5 | ccp2 | 15 | 15 | 1000 | 0.150000 |
| 6 | ccp2 | 20 | 15 | 1010 | 0.150000 |
+-----+------+------+------+------+----------+
Note: this is not limited to 3 records; each ccp can have any number (N) of records.
Expected Result :
1083.500000 --1100 - (15 * (1+0.100000))
1169.850000 --1210 - ((20 * (1+0.100000)) + (15 * (1+0.100000)* (1+0.100000)) )
1253.835000 --1331 - ((30 * (1+0.100000)) + (20 * (1+0.100000)* (1+0.100000)) + (15 * (1+0.100000)* (1+0.100000) *(1+0.100000)) )
888.500000 --900 - (10 * (1+0.150000))
969.525000 --1000 - ((15 * (1+0.150000)) + (10 * (1+0.150000)* (1+0.150000)) )
951.953750 --1010 - ((20 * (1+0.150000)) + (15 * (1+0.150000)* (1+0.150000)) + (10 * (1+0.150000)* (1+0.150000) *(1+0.150000)) )
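The pattern in the comments above is result_i = col3_i - Σ_{j≤i} col1_j × (1 + col4)^(i-j+1), summed over the rows of the same ccp in rno order. A minimal Python sketch of that definition, using the sample rows; the function name `expected_results` is only illustrative, and it assumes (as in the sample data) that col4 is the same for every row of a ccp:

```python
from decimal import Decimal

# Sample rows from the question: (rno, ccp, col1, col2, col3)
ROWS = [
    (1, "ccp1", 15, 10, 1100),
    (2, "ccp1", 20, 10, 1210),
    (3, "ccp1", 30, 10, 1331),
    (4, "ccp2", 10, 15, 900),
    (5, "ccp2", 15, 15, 1000),
    (6, "ccp2", 20, 15, 1010),
]

def expected_results(rows):
    """Direct (quadratic) evaluation of the formula, one row at a time."""
    out = {}
    for rno, ccp, col1, col2, col3 in rows:
        rate = 1 + Decimal(col2) / 100  # 1 + col4
        # Rows of the same ccp up to and including the current one, in rno order
        prior = sorted(r for r in rows if r[1] == ccp and r[0] <= rno)
        n = len(prior)
        # The k-th of these rows (0-based) is compounded n - k times
        total = sum(Decimal(p[2]) * rate ** (n - k) for k, p in enumerate(prior))
        out[rno] = col3 - total
    return out

for rno, value in expected_results(ROWS).items():
    print(rno, value)
```

This is deliberately the slow, literal form; it is only meant as a reference to check faster approaches against.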
I know we can do this using a recursive CTE, but that is not efficient since I have to do this for more than 5 million records.
I am looking to implement something like the following set-based approach.
For example, for ccp1:
SELECT col3 - ( col1 * ( 1 + col4 ) )
FROM #Table1
WHERE rno = 1
SELECT rno,
col3 - ( ( col1 * Power(( 1 + col4 ), 1) ) + ( Lag(col1, 1)
OVER(
ORDER BY rno ) * Power(( 1 + col4 ), 2) ) )
FROM #Table1
WHERE rno IN ( 1, 2 )
SELECT rno,
col3 - ( ( col1 * Power(( 1 + col4 ), 1) ) + ( Lag(col1, 1)
OVER(
ORDER BY rno ) * Power(( 1 + col4 ), 2) ) + ( Lag(col1, 2)
OVER(
ORDER BY rno ) * Power(( 1 + col4 ), 3) ) )
FROM #Table1
WHERE rno IN ( 1, 2, 3 )
Is there a way to calculate this in a single query?
Update:
I am still open to suggestions. I strongly believe there should be some way to do this using a SUM() OVER (ORDER BY) window aggregate function.
Answer 1:
An approach with a self join. Not sure if this would be any more efficient than your version with cross apply.
WITH T AS
(SELECT *,
ROW_NUMBER() OVER(PARTITION BY CCP
ORDER BY RNO) AS RN
FROM #TABLE1)
SELECT T1.RNO,
T1.CCP,
T1.COL1,
T1.COL2,
T1.COL3,
T1.COL3-SUM(T2.COL1*POWER(1+T1.COL2/100.0,T1.RN-T2.RN+1)) AS RES
FROM T T1
JOIN T T2 ON T1.CCP=T2.CCP
AND T1.RN>=T2.RN
GROUP BY T1.RNO,
T1.CCP,
T1.COL1,
T1.COL2,
T1.COL3
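Because the join condition T1.RN >= T2.RN pairs each row with itself and every earlier row of its ccp, a group of n rows yields n(n+1)/2 joined rows before the GROUP BY. A rough Python sketch of that count; the larger group sizes below are hypothetical, chosen only to show how the work grows, and are not taken from the question:

```python
def triangular_join_rows(group_sizes):
    """Rows produced by joining each row to itself and all earlier rows in its group."""
    return sum(n * (n + 1) // 2 for n in group_sizes)

# Sample data from the question: two ccp groups of 3 rows each
print(triangular_join_rows([3, 3]))

# Hypothetical: 5,000,000 rows split into groups of 10 vs. groups of 1,000
print(triangular_join_rows([10] * 500_000))
print(triangular_join_rows([1000] * 5_000))
```

So how this approach compares is likely to depend heavily on how many rows each ccp has.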
Answer 2:
I finally achieved the result using the approach below:
SELECT a.*,
col3 - res AS Result
FROM #TABLE1 a
CROSS apply (SELECT Sum(b.col1 * Power(( 1 + b.COL2 / 100.00 ), new_rn)) AS res
FROM (SELECT Row_number()
OVER(
partition BY ccp
ORDER BY rno DESC) new_rn,*
FROM #TABLE1 b
WHERE a.ccp = b.ccp
AND a.rno >= b.rno)b) cs
Result :
+-----+------+------+------+------+----------+-------------+
| rno | ccp | col1 | col2 | col3 | col4 | Result |
+-----+------+------+------+------+----------+-------------+
| 1 | ccp1 | 15 | 10 | 1100 | 0.100000 | 1083.500000 |
| 2 | ccp1 | 20 | 10 | 1210 | 0.100000 | 1169.850000 |
| 3 | ccp1 | 30 | 10 | 1331 | 0.100000 | 1253.835000 |
| 4 | ccp2 | 10 | 15 | 900 | 0.150000 | 888.500000 |
| 5 | ccp2 | 15 | 15 | 1000 | 0.150000 | 969.525000 |
| 6 | ccp2 | 20 | 15 | 1010 | 0.150000 | 951.953750 |
+-----+------+------+------+------+----------+-------------+
Answer 3:
This answer may be disappointing, but you'll likely find that an iterative CLR approach performs competitively with any T-SQL approach.
Try the following (based on Running sums yet again: SQLCLR saves the day!)
using System;
using System.Data;
using System.Data.SqlClient;
using System.Data.SqlTypes;
using Microsoft.SqlServer.Server;
public partial class StoredProcedures
{
[Microsoft.SqlServer.Server.SqlProcedure]
public static void StackoverflowQuestion41803909()
{
using (SqlConnection conn = new SqlConnection("context connection=true;"))
{
SqlCommand comm = new SqlCommand();
comm.Connection = conn;
comm.CommandText = @"
SELECT [rno],
[ccp],
[col1],
[col2],
[col3],
[col4]
FROM Table1
ORDER BY ccp,
rno
";
SqlMetaData[] columns = new SqlMetaData[7];
columns[0] = new SqlMetaData("rno", SqlDbType.Int);
columns[1] = new SqlMetaData("ccp", SqlDbType.VarChar, 50);
columns[2] = new SqlMetaData("col1", SqlDbType.Int);
columns[3] = new SqlMetaData("col2", SqlDbType.Int);
columns[4] = new SqlMetaData("col3", SqlDbType.Int);
columns[5] = new SqlMetaData("col4", SqlDbType.Decimal, 17, 6);
columns[6] = new SqlMetaData("result", SqlDbType.Decimal, 17, 6);
SqlDataRecord record = new SqlDataRecord(columns);
SqlContext.Pipe.SendResultsStart(record);
conn.Open();
SqlDataReader reader = comm.ExecuteReader();
string prevCcp = null;
decimal offset = 0;
while (reader.Read())
{
string ccp = (string)reader[1];
int col1 = (int)reader[2];
int col3 = (int)reader[4];
decimal col4 = (decimal)reader[5];
if (prevCcp != ccp)
{
offset = 0;
}
offset = ((col1 + offset) * (1 + col4));
record.SetInt32(0, (int)reader[0]);
record.SetString(1, ccp);
record.SetInt32(2, col1);
record.SetInt32(3, (int)reader[3]);
record.SetInt32(4, col3);
record.SetDecimal(5, col4);
record.SetDecimal(6, col3 - offset);
SqlContext.Pipe.SendResultsRow(record);
prevCcp = ccp;
}
SqlContext.Pipe.SendResultsEnd();
}
}
};
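The loop above relies on the recurrence offset_i = (col1_i + offset_{i-1}) × (1 + col4), reset to 0 whenever ccp changes, which expands to the same sum as in the question. A small Python sketch of just that recurrence, run against the question's sample rows; the name `running_results` is illustrative, and the rows are assumed to arrive sorted by ccp, rno, as in the procedure's query:

```python
from decimal import Decimal

# (rno, ccp, col1, col2, col3), already ordered by ccp, rno
ROWS = [
    (1, "ccp1", 15, 10, 1100),
    (2, "ccp1", 20, 10, 1210),
    (3, "ccp1", 30, 10, 1331),
    (4, "ccp2", 10, 15, 900),
    (5, "ccp2", 15, 15, 1000),
    (6, "ccp2", 20, 15, 1010),
]

def running_results(rows):
    """Single pass: carry the compounded offset forward within each ccp."""
    results = []
    prev_ccp = None
    offset = Decimal(0)
    for rno, ccp, col1, col2, col3 in rows:
        if ccp != prev_ccp:
            offset = Decimal(0)          # new group: start compounding afresh
        rate = 1 + Decimal(col2) / 100   # 1 + col4
        offset = (col1 + offset) * rate  # same update as in the C# loop
        results.append((rno, col3 - offset))
        prev_ccp = ccp
    return results

for rno, value in running_results(ROWS):
    print(rno, value)
```

Each row is touched once, so the work grows linearly with the number of rows rather than with the square of the group size.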
Answer 4:
Another option:
CREATE TABLE #Table1
(
rno int identity(1,1),
ccp varchar(50),
[col1] INT,
[col2] INT,
[col3] INT,
col4 as [col2]/100.0
);
INSERT INTO #Table1
(ccp,[col1],[col2],[col3])
VALUES ('ccp1',15,10,1100),
('ccp1',20,10,1210),
('ccp1',30,10,1331),
('ccp1',40,10,1331),
('ccp2',10,15,900),
('ccp2',15,15,1000),
('ccp2',20,15,1010);
select t.*, col3-s
from(
select *, rn = row_number() over(partition by ccp order by rno)
from #Table1
) t
cross apply (
select s=sum(pwr*col1)
from(
select top(rn)
col1, pwr = power(1+col4, rn + 1 - row_number() over(order by rno))
from #Table1 t2
where t2.ccp=t.ccp
order by row_number() over(order by rno)
)t3
)t4
order by rno;
Answer 5:
Try this:
;with
val as (
select
*,
(1 + col2 / 100.00) val,
row_number() over(partition by ccp order by rno desc) rn
from #Table1),
res as (
select
v1.rno,
--min(v1.ccp) ccp,
--min(v1.col1) col1,
--min(v1.col2) col2,
min(v1.col3) col3,
sum(v2.col1 * power(v2.val, 1 + v2.rn - v1.rn)) sum_val
from val v1
left join val v2 on v2.ccp = v1.ccp and v2.rno <= v1.rno
group by v1.rno)
select *, col3 - isnull(sum_val, 0)
from res
Performance depends on the indexes, though; post your index structure for details. The best performance can be achieved by splitting the work into several temporary tables.
Answer 6:
After playing with it for some time, I believe the answer to the bounty question of whether or not this can be done with a sum() over (order by) is NO. This code is as close as I could get:
select *, col3 - sum(col1 * power(1 + col4, row_num)) over (partition by ccp order by col1)
from (
select *, row_number() over (partition by ccp order by rno asc) row_num
from @Table1
) a
order by 1,2;
This will return correct results for the first row in each ccp group. If row_num is instead calculated using rno desc, then the final row in each ccp will be correct.
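The underlying issue is that the exponent inside the windowed SUM is fixed per row (row_num), whereas the exponent the question needs depends on the distance from the current row. A Python sketch for ccp1 from the question's sample data, comparing what the query accumulates with what the formula requires (all variable names are illustrative):

```python
from itertools import accumulate

col1 = [15, 20, 30]   # ccp1, in rno order
rate = 1.10           # 1 + col4 for ccp1
n = len(col1)

# What SUM(col1 * POWER(rate, row_num)) OVER (...) accumulates with ascending
# row_num: row j always contributes col1[j] * rate**row_num, whichever row is current.
fixed_asc = list(accumulate(c * rate ** (j + 1) for j, c in enumerate(col1)))

# Same running sum with row_num assigned in descending order (n, n-1, ..., 1).
fixed_desc = list(accumulate(c * rate ** (n - j) for j, c in enumerate(col1)))

# What the question needs: for current row i, row j contributes
# col1[j] * rate**(i - j + 1).
needed = [
    sum(col1[j] * rate ** (i - j + 1) for j in range(i + 1))
    for i in range(n)
]

for i in range(n):
    print(i + 1, round(fixed_asc[i], 6), round(fixed_desc[i], 6), round(needed[i], 6))
```

With ascending numbering only the first row agrees with the required sum; with descending numbering only the last row does.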
It appears that the only ways to get this to work in the simple way that the syntax suggests would be:
- Syntax support to reference the current row within the aggregate over function. This does not exist in T-SQL as far as I can find.
- Syntax support for a window function within a window function. This is also not permitted in T-SQL, per the following error:
Windowed functions cannot be used in the context of another windowed function or aggregate.
This was an interesting problem. I'd be curious how this solution performs against your large dataset even though the actual result is incorrect.
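One possible way around the relative exponent, offered only as a sketch and not something any of the answers here propose: if the rate r = 1 + col4 is the same for every row of a ccp (true of the sample data, but an assumption in general), then r^(i-j+1) = r^(i+1) × r^(-j), so the required sum is r^(i+1) times a plain running sum of col1_j × r^(-j). The Python below checks that identity on the sample rows; the function name is hypothetical, and whether an equivalent T-SQL expression keeps enough precision for long groups has not been verified here.

```python
from itertools import accumulate

def results_via_running_sum(col1, col3, rate):
    """col3[i] - rate**(i+1) * running_sum(col1[j] * rate**-j), rows numbered from 1.

    Assumes `rate` (1 + col4) is constant for all rows passed in (one ccp).
    """
    # Running sum of col1[j] scaled by rate**-j (j is the 1-based row number)
    scaled = accumulate(c * rate ** -(j + 1) for j, c in enumerate(col1))
    return [
        c3 - rate ** (i + 2) * s   # i is 0-based, so rate**(i + 2) is r**(row number + 1)
        for i, (c3, s) in enumerate(zip(col3, scaled))
    ]

print(results_via_running_sum([15, 20, 30], [1100, 1210, 1331], 1.10))
print(results_via_running_sum([10, 15, 20], [900, 1000, 1010], 1.15))
```

The negative powers shrink quickly as the row number grows, so floating-point or fixed-scale decimal precision is the obvious thing to test before relying on this for large groups.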