Calculation in SQL Server

Posted 2019-03-08 13:54

Question:

I am trying to perform the following calculation.

Sample data:

CREATE TABLE #Table1
  (
     rno   int identity(1,1),
     ccp   varchar(50),
     [col1] INT,
     [col2] INT,
     [col3] INT,
     col4 as [col2]/100.0
  );

INSERT INTO #Table1
            (ccp,[col1],[col2],[col3])
VALUES      ('ccp1',15,10,1100),
            ('ccp1',20,10,1210),
            ('ccp1',30,10,1331),
            ('ccp2',10,15,900),
            ('ccp2',15,15,1000),
            ('ccp2',20,15,1010)

+-----+------+------+------+------+----------+
| rno | ccp  | col1 | col2 | col3 |   col4   |
+-----+------+------+------+------+----------+
|   1 | ccp1 |   15 |   10 | 1100 | 0.100000 |
|   2 | ccp1 |   20 |   10 | 1210 | 0.100000 |
|   3 | ccp1 |   30 |   10 | 1331 | 0.100000 |
|   4 | ccp2 |   10 |   15 |  900 | 0.150000 |
|   5 | ccp2 |   15 |   15 | 1000 | 0.150000 |
|   6 | ccp2 |   20 |   15 | 1010 | 0.150000 |
+-----+------+------+------+------+----------+

Note: each ccp is not limited to 3 records; it can have any number (N) of records.

Expected Result :

1083.500000 --1100 - (15 * (1+0.100000))
1169.850000 --1210 - ((20 * (1+0.100000)) + (15 * (1+0.100000)* (1+0.100000)) )
1253.835000 --1331 - ((30 * (1+0.100000)) + (20 * (1+0.100000)* (1+0.100000)) + (15 * (1+0.100000)* (1+0.100000) *(1+0.100000)) )
888.500000  --900 - (10 * (1+0.150000))
969.525000  --1000 - ((15 * (1+0.150000)) + (10 * (1+0.150000)* (1+0.150000)) )
951.953750  --1010 - ((20 * (1+0.150000)) + (15 * (1+0.150000)* (1+0.150000)) + (10 * (1+0.150000)* (1+0.150000) *(1+0.150000)) )
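
Read as a formula, each expected value is col3 minus the sum of every col1 seen so far in the same ccp, each multiplied by (1 + col4) raised to the number of rows back it sits, plus one. The Python sketch below is not part of the original question; it only checks that arithmetic against the sample data. The function name and the tuple layout (ccp, col1, col2, col3) are illustrative assumptions.

```python
from decimal import Decimal

# Sample rows from the question: (ccp, col1, col2, col3), already in rno order.
ROWS = [
    ("ccp1", 15, 10, 1100),
    ("ccp1", 20, 10, 1210),
    ("ccp1", 30, 10, 1331),
    ("ccp2", 10, 15, 900),
    ("ccp2", 15, 15, 1000),
    ("ccp2", 20, 15, 1010),
]


def expected_results(rows):
    """Evaluate col3 - sum(col1_k * (1 + col4)^(n - k + 1)) for every row n."""
    results = []
    history = {}  # ccp -> col1 values seen so far, oldest first
    for ccp, col1, col2, col3 in rows:
        rate = 1 + Decimal(col2) / 100  # same as 1 + col4
        seen = history.setdefault(ccp, [])
        seen.append(col1)
        n = len(seen)
        # k is 0-based here, so the exponent n - k equals (n - position + 1).
        total = sum(Decimal(c) * rate ** (n - k) for k, c in enumerate(seen))
        results.append(col3 - total)
    return results


for value in expected_results(ROWS):
    print(value)
```

This is the brute-force reading of the formula: it revisits every earlier row of the group for each row, which is exactly the cost the question wants to avoid at 5 million rows.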

I know this can be done with a recursive CTE, but that is not efficient, since I have to do this for more than 5 million records.

I am looking to implement something like this set-based approach.

For ccp = ccp1:

SELECT col3 - ( col1 * ( 1 + col4 ) )
FROM   #Table1
WHERE  rno = 1

SELECT rno,
       col3 - ( ( col1 * Power(( 1 + col4 ), 1) )
                + ( Lag(col1, 1) OVER(ORDER BY rno) * Power(( 1 + col4 ), 2) ) )
FROM   #Table1
WHERE  rno IN ( 1, 2 )

SELECT rno,
       col3 - ( ( col1 * Power(( 1 + col4 ), 1) )
                + ( Lag(col1, 1) OVER(ORDER BY rno) * Power(( 1 + col4 ), 2) )
                + ( Lag(col1, 2) OVER(ORDER BY rno) * Power(( 1 + col4 ), 3) ) )
FROM   #Table1
WHERE  rno IN ( 1, 2, 3 )

Is there a way to calculate this in a single query?

Update:

I am still open to suggestions. I strongly believe there should be some way to do this using the SUM() OVER (ORDER BY ...) window aggregate function.

Answer 1:

An approach with a self join. Not sure if this would be any more efficient than your version with cross apply.

WITH T AS
  (SELECT *,
          ROW_NUMBER() OVER(PARTITION BY CCP
                            ORDER BY RNO) AS RN
   FROM #TABLE1)
SELECT T1.RNO,
       T1.CCP,
       T1.COL1,
       T1.COL2,
       T1.COL3,
       T1.COL3-SUM(T2.COL1*POWER(1+T1.COL2/100.0,T1.RN-T2.RN+1)) AS RES
FROM T T1
JOIN T T2 ON T1.CCP=T2.CCP
AND T1.RN>=T2.RN
GROUP BY T1.RNO,
         T1.CCP,
         T1.COL1,
         T1.COL2,
         T1.COL3
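
On the efficiency question: the join condition pairs every row with itself and with each earlier row of the same ccp, so a group of n rows produces n(n+1)/2 joined rows before the GROUP BY. The Python sketch below only counts those pairs; the group size of 1,000 is a hypothetical figure, since the question does not say how the 5 million rows are distributed.

```python
def joined_pairs(group_sizes):
    """Rows produced by joining each row to itself and every earlier row in its group."""
    return sum(n * (n + 1) // 2 for n in group_sizes)


# Sample data: two ccp groups of three rows each.
print(joined_pairs([3, 3]))          # 12

# Hypothetical distribution: 5 million rows in 5,000 groups of 1,000 rows.
print(joined_pairs([1000] * 5000))   # 2502500000
```

The count grows quadratically with group size, so the approach is most plausible when each ccp has few rows.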


Answer 2:

Finally, I got the result using the approach below:

SELECT a.*,
       col3 - res AS Result
FROM   #TABLE1 a
       CROSS apply (SELECT Sum(b.col1 * Power(( 1 + b.COL2 / 100.00 ), new_rn)) AS res
                    FROM   (SELECT Row_number()
                                     OVER(
                                       partition BY ccp
                                       ORDER BY rno DESC) new_rn,*
                            FROM   #TABLE1 b
                            WHERE  a.ccp = b.ccp
                                   AND a.rno >= b.rno)b) cs

Result :

+-----+------+------+------+------+----------+-------------+
| rno | ccp  | col1 | col2 | col3 |   col4   |   Result    |
+-----+------+------+------+------+----------+-------------+
|   1 | ccp1 |   15 |   10 | 1100 | 0.100000 | 1083.500000 |
|   2 | ccp1 |   20 |   10 | 1210 | 0.100000 | 1169.850000 |
|   3 | ccp1 |   30 |   10 | 1331 | 0.100000 | 1253.835000 |
|   4 | ccp2 |   10 |   15 |  900 | 0.150000 | 888.500000  |
|   5 | ccp2 |   15 |   15 | 1000 | 0.150000 | 969.525000  |
|   6 | ccp2 |   20 |   15 | 1010 | 0.150000 | 951.953750  |
+-----+------+------+------+------+----------+-------------+


Answer 3:

This answer may be disappointing, but you'll likely find that an iterative CLR approach performs competitively with any T-SQL approach.

Try the following (based on the article "Running sums yet again: SQLCLR saves the day!"):

using System;
using System.Data;
using System.Data.SqlClient;
using System.Data.SqlTypes;
using Microsoft.SqlServer.Server;

public partial class StoredProcedures
{
    [Microsoft.SqlServer.Server.SqlProcedure]
    public static void StackoverflowQuestion41803909()
    {
        using (SqlConnection conn = new SqlConnection("context connection=true;"))
        {
            SqlCommand comm = new SqlCommand();
            comm.Connection = conn;
            comm.CommandText = @"
SELECT [rno],
       [ccp],
       [col1],
       [col2],
       [col3],
       [col4]
FROM   Table1
ORDER  BY ccp,
          rno 
";

            SqlMetaData[] columns = new SqlMetaData[7];
            columns[0] = new SqlMetaData("rno", SqlDbType.Int);
            columns[1] = new SqlMetaData("ccp", SqlDbType.VarChar, 50);
            columns[2] = new SqlMetaData("col1", SqlDbType.Int);
            columns[3] = new SqlMetaData("col2", SqlDbType.Int);
            columns[4] = new SqlMetaData("col3", SqlDbType.Int);
            columns[5] = new SqlMetaData("col4", SqlDbType.Decimal, 17, 6);
            columns[6] = new SqlMetaData("result", SqlDbType.Decimal, 17, 6);

            SqlDataRecord record = new SqlDataRecord(columns);

            SqlContext.Pipe.SendResultsStart(record);

            conn.Open();

            SqlDataReader reader = comm.ExecuteReader();

            string prevCcp = null;
            decimal offset = 0;

            while (reader.Read())
            {
                string ccp = (string)reader[1];
                int col1 = (int)reader[2];
                int col3 = (int)reader[4];
                decimal col4 = (decimal)reader[5];

                if (prevCcp != ccp)
                {
                    offset = 0;
                }

                offset = ((col1 + offset) * (1 + col4));
                record.SetInt32(0, (int)reader[0]);
                record.SetString(1, ccp);
                record.SetInt32(2, col1);
                record.SetInt32(3, (int)reader[3]);
                record.SetInt32(4, col3);
                record.SetDecimal(5, col4);
                record.SetDecimal(6, col3 - offset);

                SqlContext.Pipe.SendResultsRow(record);

                prevCcp = ccp;
            }

            SqlContext.Pipe.SendResultsEnd();
        }
    }
};
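
The loop in the procedure relies on the recurrence offset_n = (col1_n + offset_(n-1)) * (1 + col4), with the offset reset to 0 at the start of each ccp. Expanding it gives the sum from the question, provided col4 is the same for every row of a ccp. That holds for the sample data; for other data it is an assumption. The Python sketch below is not the CLR procedure, only the same single pass written out so the arithmetic can be checked; the function name and tuple layout (ccp, col1, col2, col3) are illustrative.

```python
from decimal import Decimal

# Sample rows from the question: (ccp, col1, col2, col3), sorted by ccp, rno.
ROWS = [
    ("ccp1", 15, 10, 1100),
    ("ccp1", 20, 10, 1210),
    ("ccp1", 30, 10, 1331),
    ("ccp2", 10, 15, 900),
    ("ccp2", 15, 15, 1000),
    ("ccp2", 20, 15, 1010),
]


def running_results(rows):
    """One pass over rows sorted by (ccp, rno), carrying a single running offset."""
    results = []
    prev_ccp = None
    offset = Decimal(0)
    for ccp, col1, col2, col3 in rows:
        if ccp != prev_ccp:
            offset = Decimal(0)  # a new ccp group starts from scratch
        rate = 1 + Decimal(col2) / 100  # same as 1 + col4
        offset = (col1 + offset) * rate
        results.append(col3 - offset)
        prev_ccp = ccp
    return results


for value in running_results(ROWS):
    print(value)
```

Each row is touched once, so the work grows linearly with the number of rows rather than quadratically with group size.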


Answer 4:

Another option

CREATE TABLE #Table1
  (
     rno   int identity(1,1),
     ccp   varchar(50),
     [col1] INT,
     [col2] INT,
     [col3] INT,
     col4 as [col2]/100.0
  );

INSERT INTO #Table1
            (ccp,[col1],[col2],[col3])
VALUES      ('ccp1',15,10,1100),
            ('ccp1',20,10,1210),
            ('ccp1',30,10,1331),
            ('ccp1',40,10,1331),
            ('ccp2',10,15,900),
            ('ccp2',15,15,1000),
            ('ccp2',20,15,1010);

select t.*, col3-s
from(
    select *, rn = row_number() over(partition by ccp order by rno)
    from #Table1
) t
cross apply (
    select s=sum(pwr*col1)
    from(
        select top(rn)
           col1, pwr = power(1+col4, rn + 1 - row_number() over(order by rno))
        from #Table1 t2
        where t2.ccp=t.ccp
        order by row_number() over(order by rno)
        )t3
    )t4
order by rno;


Answer 5:

Try this:

;with 
    val as (
        select 
            *, 
            (1 + col2 / 100.00) val,
            row_number() over(partition by ccp order by rno desc) rn
        from #Table1),
res as (
        select 
            v1.rno, 
            --min(v1.ccp) ccp,
            --min(v1.col1) col1, 
            --min(v1.col2) col2, 
            min(v1.col3) col3, 
            sum(v2.col1 * power(v2.val, 1 + v2.rn - v1.rn)) sum_val
        from val v1
        left join val v2 on v2.ccp = v1.ccp and v2.rno <= v1.rno
        group by v1.rno)
select *, col3 - isnull(sum_val, 0)
from res

Performance will depend on your indexes, though; post your index structure if you want more specific advice. You may get the best performance by splitting the work across several temporary tables.



Answer 6:

After playing with it for some time I believe the answer to the bounty question of whether or not this can be done with a sum() over (order by) is NO. This code is as close as I could get:

select  *, col3 - sum(col1 * power(1 + col4, row_num)) over (partition by ccp order by col1)
from    (
        select  *, row_number() over (partition by ccp order by rno asc) row_num
        from    #Table1
        ) a
order   by 1,2;

This returns the correct result only for the first row in each ccp group. If row_num is calculated using rno desc instead, then only the final row in each ccp is correct.

It appears that the only ways to get this to work in the simple way that the syntax suggests would be:

  1. Syntax support to reference the actual row within the aggregate over function. This does not exist in T-SQL as far as I can find.
  2. Syntax support for a window function within a window function. This also is not permitted in T-SQL per the following error:

Windowed functions cannot be used in the context of another windowed function or aggregate.

This was an interesting problem. I'd be curious how this solution performs against your large dataset even though the actual result is incorrect.