I have a list of dated transactions in a table containing a date, customer number, transaction type and value. I need to return all the transactions of a given customer if that customer has 3 or more consecutive 'cash' transactions in a row when sorted by date.
So in the sample data below I want to return all the transactions (including credit ones) for customers 1 and 3 as both customers has 3 or more cash transactions in a row. Customer 2 is ignored because even though they had more than 3 cash transactions, they weren't consecutive.
╔════════════╦════════════╦═══════════╦═══════╗
║ Customer ║ Date ║ TransType ║ Value ║
╠════════════╬════════════╬═══════════╬═══════╣
║ Customer 1 ║ 1/01/2015 ║ cash ║ 23.00 ║
║ Customer 1 ║ 2/01/2015 ║ cash ║ 24.00 ║
║ Customer 2 ║ 2/01/2015 ║ cash ║ 28.00 ║
║ Customer 2 ║ 4/01/2015 ║ credit ║ 29.00 ║
║ Customer 3 ║ 5/01/2015 ║ credit ║ 27.00 ║
║ Customer 2 ║ 6/01/2015 ║ cash ║ 23.00 ║
║ Customer 2 ║ 8/01/2015 ║ credit ║ 24.00 ║
║ Customer 3 ║ 9/01/2015 ║ cash ║ 28.00 ║
║ Customer 3 ║ 13/01/2015 ║ cash ║ 29.00 ║
║ Customer 1 ║ 15/01/2015 ║ cash ║ 25.00 ║
║ Customer 1 ║ 17/01/2015 ║ credit ║ 26.00 ║
║ Customer 3 ║ 18/01/2015 ║ cash ║ 23.00 ║
║ Customer 1 ║ 20/01/2015 ║ cash ║ 27.00 ║
║ Customer 3 ║ 20/01/2015 ║ credit ║ 24.00 ║
║ Customer 2 ║ 21/01/2015 ║ cash ║ 25.00 ║
║ Customer 3 ║ 22/01/2015 ║ credit ║ 25.00 ║
║ Customer 2 ║ 23/01/2015 ║ cash ║ 26.00 ║
╚════════════╩════════════╩═══════════╩═══════╝
You can use a trick to enumerate the "cash" transactions. This trick is a difference of row numbers and it is very useful:
select t.*
from (select t.*, count(*) over (partition by grp, customerid, transtype) as cnt
from (select t.*,
(row_number() over (partition by customerid order by date) -
row_nubmer() over (partition by customerid, transtype order by date)
) as grp
from t
) t
where transtype = 'cash'
) t
where cnt >= 3;
This returns the customers and the start date. If you want to return the actual transactions, you can use an addition level of window functions:
select customerid, min(date) as start_date, sum(value) as sumvalue
from (select t.*,
(row_number() over (partition by customerid order by date) -
row_nubmer() over (partition by customerid, transtype order by date)
) as grp
from t
) t
where transtype = 'cash'
group by grp, transtype, customerid
having count(*) >= 3;
So to get the customers that have at least three consecutive cash transactions you can use a self join and for each row connect the row before and after and test if all three are transtype cash.
The query used as the first common table expression numbers all rows partitioned by customer so we have a suitable column to connect them by. The connection is then made in the second common table expression and the result from that is fed into the final query. The query could be shortened, but I left it a bit longer for clarity.
with cte as (
select *, r = row_number() over (partition by customer order by date)
from table1 -- this is your source table
), cte2 as (
select t1.customer
from cte t1
join cte t2 on t1.customer = t2.customer and (t1.r = t2.r-1 or t1.r = t2.r+1)
where t1.transtype = 'cash' and t2.transtype = 'cash'
group by t1.customer
having count(*) >= 3
)
select * from Table1 -- this is your source table
where Customer in (select Customer from cte2)
order by customer, date;
With your sample data this would return all rows for customers 1 and 3.
Sample SQL Fiddle