我有一个表日的交易包含日期,客户数,交易类型和值的列表。 我需要返回给定客户的所有交易,如果客户有一排3个或更多个连续的“现金”交易时,按日期排序。
因此,在样本数据下面我想回到所有的交易(包括信贷的),为客户1和3既是客户在一排3个或更多的现金交易。 因为即使他们有3个以上的现金交易,他们不是连续的客户2被忽略。
╔════════════╦════════════╦═══════════╦═══════╗
║ Customer ║ Date ║ TransType ║ Value ║
╠════════════╬════════════╬═══════════╬═══════╣
║ Customer 1 ║ 1/01/2015 ║ cash ║ 23.00 ║
║ Customer 1 ║ 2/01/2015 ║ cash ║ 24.00 ║
║ Customer 2 ║ 2/01/2015 ║ cash ║ 28.00 ║
║ Customer 2 ║ 4/01/2015 ║ credit ║ 29.00 ║
║ Customer 3 ║ 5/01/2015 ║ credit ║ 27.00 ║
║ Customer 2 ║ 6/01/2015 ║ cash ║ 23.00 ║
║ Customer 2 ║ 8/01/2015 ║ credit ║ 24.00 ║
║ Customer 3 ║ 9/01/2015 ║ cash ║ 28.00 ║
║ Customer 3 ║ 13/01/2015 ║ cash ║ 29.00 ║
║ Customer 1 ║ 15/01/2015 ║ cash ║ 25.00 ║
║ Customer 1 ║ 17/01/2015 ║ credit ║ 26.00 ║
║ Customer 3 ║ 18/01/2015 ║ cash ║ 23.00 ║
║ Customer 1 ║ 20/01/2015 ║ cash ║ 27.00 ║
║ Customer 3 ║ 20/01/2015 ║ credit ║ 24.00 ║
║ Customer 2 ║ 21/01/2015 ║ cash ║ 25.00 ║
║ Customer 3 ║ 22/01/2015 ║ credit ║ 25.00 ║
║ Customer 2 ║ 23/01/2015 ║ cash ║ 26.00 ║
╚════════════╩════════════╩═══════════╩═══════╝
您可以使用一个技巧来枚举“现金”交易。 这招行号的差异,这是非常有用的:
select t.*
from (select t.*, count(*) over (partition by grp, customerid, transtype) as cnt
from (select t.*,
(row_number() over (partition by customerid order by date) -
row_nubmer() over (partition by customerid, transtype order by date)
) as grp
from t
) t
where transtype = 'cash'
) t
where cnt >= 3;
这将返回客户和起始日期。 如果你想返回实际发生的交易,你可以使用窗口功能的添加量:
select customerid, min(date) as start_date, sum(value) as sumvalue
from (select t.*,
(row_number() over (partition by customerid order by date) -
row_nubmer() over (partition by customerid, transtype order by date)
) as grp
from t
) t
where transtype = 'cash'
group by grp, transtype, customerid
having count(*) >= 3;
因此,要获得具有您可以使用自至少三个连续的现金交易客户的加盟,为每一行前后和测试连接行,如果三者transtype现金。
作为第一个公共表表达式的数字按客户划分的所有行,所以我们有一个合适的列通过将它们连接的查询。 该连接然后在第二公用表表达式制成,并从该结果被馈送到最终的查询。 查询可以缩短,但我把它的时间长一点的清晰度。
with cte as (
select *, r = row_number() over (partition by customer order by date)
from table1 -- this is your source table
), cte2 as (
select t1.customer
from cte t1
join cte t2 on t1.customer = t2.customer and (t1.r = t2.r-1 or t1.r = t2.r+1)
where t1.transtype = 'cash' and t2.transtype = 'cash'
group by t1.customer
having count(*) >= 3
)
select * from Table1 -- this is your source table
where Customer in (select Customer from cte2)
order by customer, date;
随着您的样本数据,这将返回客户1和3的所有行。
示例SQL小提琴