-->

SQL Implementation with Sliding-Window functions o

2019-08-13 15:06发布

问题:

I have a problem that it's very easy to be solved in C# code for example, but I have no idea how to write in a SQL query with Recursive CTE-s or Sliding-Windows functions.

Here is the situation: let's say I have a table with 3 columns (ID, Date, Amount), and here is some data:

ID  Date         Amount
-----------------------
1   01.01.2016    -500
2   01.02.2016    1000
3   01.03.2016    -200
4   01.04.2016     300
5   01.05.2016     500
6   01.06.2016    1000
7   01.07.2016    -100
8   01.08.2016     200

The result I want to get from the table is this (ID, Amount .... Order By Date):

ID  Amount
-----------------------
2    300
4    300
5    500
6    900
8    200

The idea is to distribute the amounts into installments, for each client separately, but the thing is when negative amount comes into play you need to remove amount from the last installment. I don't know how clear I am, so here is an example:

Let's say I have 3 Invoices for one client with amounts 500, 200, -300.

If i start distribute these Invoices, first i distribute the amount 500, then 200. But when i come to the third one -300, then i need to remove from the last Invoice. In other words 200 - 300 = -100, so the amount from second Invoice will disappear, but there are still -100 that needs to be substracted from first Invoice. So 500 - 100 = 400. The result i need is result with one row (first invoice with amount 400)

Another example when the first invoice is with negative amount (-500, 300, 500). In this case, the first (-500) invoice will make the second disappear and another 200 will be substracted from the third. So the result will be: Third Invoice with amount 300.

This is something like Stack implementation in programming language, but i need to make it with sliding-window functions in SQL Server.

I need an implementation with Sliding Function or Recursive CTEs. Not with cycles ...

Thanks.

回答1:

Ok, think this is what you want. there are two recursive queries. One for upward propagation and the second one for the downward propagation.

with your_data_rn as
(
   select *, row_number() over (order by date) rn
   from your_data
), rcte_up(id, date, ammount, rn, running) as
(
   select *, ammount as running 
   from your_data_rn 
   union all
   select d.*, 
          d.ammount + rcte_up.running
   from your_data_rn d
   join rcte_up on rcte_up.running < 0 and  d.rn = rcte_up.rn - 1
), data2 as
(
   select id, date, min(running) ammount, 
          row_number() over (order by date) rn
   from rcte_up
   group by id, date, rn
   having min(running) > 0 or rn = 1
), rcte_down(id, date, ammount, rn, running) as
(
   select *, ammount as running 
   from data2 
   union all
   select  d.*, d.ammount + rcte_down.running
   from data2 d
   join rcte_down on rcte_down.running < 0 and  d.rn = rcte_down.rn + 1
)
select id, date, min(running) ammount
from rcte_down
group by id, date
having min(running) > 0

demo

I can imagine that you use just the upward propagation and the downward propagation of the first row is done in some procedural language. Downward propagation is one scan through few first rows, therefore, the recursive query may be a hammer on a mosquito.



回答2:

I add client ID in table for more general solution. Then I implemented the stack stored as XML in query field. And emulated a program cycle with Recursive-CTE:

with Data as(  -- Numbering rows for iteration on CTE
    select Client, id, Amount,
           cast(row_number() over(partition by Client order by Date) as int) n
      from TabW
),
CTE(Client, n, stack) as( -- Recursive CTE 
    select Client, 1, cast(NULL as xml) from Data where n=1
  UNION ALL
    select D.Client, D.n+1, (
-- Stack operations to process current row (D)
select row_number() over(order by n) n,
       -- Use calculated amount in first positive and oldest stack cell
       -- Else preserve value stored in stack
       case when n=1 or (n=0 and last=1) then new else Amount end Amount,
       -- Set ID in stack cell for positive and new data
       case when n=1 and D.Amount>0 then D.id else id end id
  from (
    select Y.Amount, Y.id, new,
           -- Count positive stack entries
           sum(case when new<=0 or (n=0 and Amount<0) then 0 else 1 end) over (order by n) n,
           row_number() over(order by n desc) last  -- mark oldest stack cell by 1
      from (
        select X.*,sum(Amount) over(order by n) new
          from (
                select case when C.stack.value('(/row/@Amount)[1]','int')<0 then -1 else 0 end n,
                       D.Amount, D.id  -- Data from new record
              union all  -- And expand current stack in XML to table
            select node.value('@n','int') n, node.value('@Amount','int'), node.value('@id','int')
              from C.stack.nodes('//row') N(node)
          ) X
      ) Y where n>=0 -- Suppress new cell if the stack contained a negative amount
 ) Z
where n>0 or (n=0 and last=1)
  for xml raw, type

           )
      from Data D, CTE C
     where D.n=C.n and D.Client=C.Client
) -- Expand stack into result table
select CTE.Client, node.value('@id','int') id, node.value('@Amount','int')
  from CTE join (select Client, max(n) max_n from Data group by Client) X on CTE.Client=X.Client and CTE.n=X.max_n+1
 cross apply stack.nodes('//row') N(node)
 order by CTE.Client, node.value('@n','int') desc

Test on sqlfiddle.com

I think this method is slower than @RadimBača. And it is shown to demonstrate the possibilities of implementing a sequential algorithm on SQL.