Making subgroup aggregations in SQL based on a column value

Posted 2019-10-22 18:58

I want to aggregate the existing records in our data into sub-records. Take this example data:

Index   Date    Action
1   1/1/2015    Working
2   1/2/2015    Working
3   1/3/2015    Working
4   1/4/2015    Escalated
5   1/5/2015    Done
6   1/6/2015    Working
7   1/7/2015    Done
8   1/8/2015    Working
9   1/9/2015    Working
10  1/10/2015   Working
11  1/11/2015   Escalated
12  1/12/2015   Done
13  1/13/2015   Done
14  1/14/2015   Working  

I would like to be able to produce data like this:

Record  DateBegin   DateEnd #Actions    #Escalations
A   1/1/2015        1/5/2015    5       1
B   1/6/2015        1/7/2015    2       0
C   1/8/2015        1/12/2015   5       1
D   1/13/2015       1/13/2015   1       0
E   1/14/2015       null        1       0

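Basically, the logic is that a sub-record ends when the Action value is "Done", and a new sub-record begins with any subsequent action (and also with the very first action). I am working with SQL Server 2008. Thanks for your help!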
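To try the queries in the answers against the sample rows, a setup script along these lines may help. It is only a sketch: the table name `example` and the column name `id` (standing in for the Index column) are assumptions chosen to match the first answer, and the third answer calls the table `YourTable` instead.

```sql
-- Hypothetical setup: table and column names are assumptions, not from the question.
create table example (
    id     int         not null primary key,  -- the "Index" column of the sample
    [Date] date        not null,
    Action varchar(20) not null
);

-- 'YYYYMMDD' literals are used so the dates do not depend on language settings.
insert into example (id, [Date], Action) values
    (1,  '20150101', 'Working'),
    (2,  '20150102', 'Working'),
    (3,  '20150103', 'Working'),
    (4,  '20150104', 'Escalated'),
    (5,  '20150105', 'Done'),
    (6,  '20150106', 'Working'),
    (7,  '20150107', 'Done'),
    (8,  '20150108', 'Working'),
    (9,  '20150109', 'Working'),
    (10, '20150110', 'Working'),
    (11, '20150111', 'Escalated'),
    (12, '20150112', 'Done'),
    (13, '20150113', 'Done'),
    (14, '20150114', 'Working');
```

The `date` type and multi-row `values` lists are both available from SQL Server 2008 onward.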

Answer 1:

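You can assign a grouping value to each record by counting the number of "Done" records that come before it. The rest is just aggregation, although assigning a letter to each group seems like an unnecessary complication: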

-- Assumes the Index column of the sample is named id.
select grp as record, min(Date) as DateBegin,
       max(case when Action = 'Done' then Date end) as DateEnd,
       count(*) as NumActions,
       sum(case when Action = 'Escalated' then 1 else 0 end) as NumEscalations
from (select e.*, coalesce(e2.grp, 0) as grp
      from example e outer apply
           -- grp = number of 'Done' rows strictly before the current row
           (select count(*) as grp
            from example e2
            where e2.id < e.id and e2.Action = 'Done'
           ) e2
     ) e
group by grp;

This query would be simpler (and more efficient) in SQL Server 2012+, which supports cumulative sums.
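The cumulative-sum variant is not shown in the answer. One way it might look is sketched below, assuming the same `example` table with `id`, `Date` and `Action` columns. It needs SQL Server 2012 or later, so it does not apply to the asker's SQL Server 2008 instance.

```sql
-- Sketch only: requires SQL Server 2012+ for the windowed running sum.
select grp as record,
       min([Date]) as DateBegin,
       max(case when Action = 'Done' then [Date] end) as DateEnd,
       count(*) as NumActions,
       sum(case when Action = 'Escalated' then 1 else 0 end) as NumEscalations
from (select e.*,
             -- running count of 'Done' rows strictly before the current row;
             -- the frame is empty for the first row, hence the coalesce to 0
             coalesce(sum(case when Action = 'Done' then 1 else 0 end)
                          over (order by id
                                rows between unbounded preceding and 1 preceding),
                      0) as grp
      from example e
     ) e
group by grp
order by grp;
```

The frame ends at `1 preceding` so that a "Done" row still belongs to the group it closes, and only the rows after it move to the next group.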

Edit:

I notice that I used a subquery for this, but it is not necessary. This can be written as:

      select coalesce(grp, 0) as record, min(Date) as DateBegin,
             max(case when Action = 'Done' then Date end) as DateEnd,
             count(*) as NumActions,
             sum(case when Action = 'Escalated' then 1 else 0 end) as NumEscalations
      from example e outer apply
           (select count(*) as grp
            from example e2
            where e2.id < e.id and e2.Action = 'Done'
           ) e2
      group by e2.grp


Answer 2:

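This is a badly designed table; there should be a TaskID to identify each particular task (this structure becomes unusable as soon as someone starts working on a task before the previous one has ended).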

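I would:

  1. Create a temp table with Index, Date, Action, and TaskID.

  2. Write a cursor that iterates through the old table, ordered by Index. Keep a local variable CurrentTaskID (initialized to 1). For each record read, write Index, Date, Action, and CurrentTaskID to the new table. After each write, look at the Action: if it is "Done", increment CurrentTaskID.

  3. Write a query on the temp table, something like: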

    SELECT MIN(a.date), MAX(a.date), COUNT(*)
    ,(SELECT COUNT(*) FROM MyTempTable b WHERE b.TaskID = a.TaskID AND b.Action = 'Escalated')
    FROM MyTempTable a GROUP BY a.TaskID
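Steps 1 and 2 above could be sketched roughly as follows. The source table name `OldTable`, the column types, and the cursor name are assumptions for illustration. `MyTempTable` is created as a regular table so that its name matches the query in step 3; a true temp table would be named `#MyTempTable`.

```sql
-- Step 1: working table (hypothetical column types).
create table MyTempTable (
    [Index] int,
    [Date]  date,
    Action  varchar(20),
    TaskID  int
);

-- Step 2: walk the old table in Index order and assign task ids.
declare @Index int, @Date date, @Action varchar(20);
declare @CurrentTaskID int;
set @CurrentTaskID = 1;

declare task_cursor cursor local fast_forward for
    select [Index], [Date], Action
    from OldTable            -- hypothetical name for the existing table
    order by [Index];

open task_cursor;
fetch next from task_cursor into @Index, @Date, @Action;

while @@fetch_status = 0
begin
    insert into MyTempTable ([Index], [Date], Action, TaskID)
    values (@Index, @Date, @Action, @CurrentTaskID);

    -- a 'Done' row closes the current task; the next row starts a new one
    if @Action = 'Done'
        set @CurrentTaskID = @CurrentTaskID + 1;

    fetch next from task_cursor into @Index, @Date, @Action;
end

close task_cursor;
deallocate task_cursor;
```

Incrementing after the insert keeps the "Done" row inside the task it finishes, which is what the expected output in the question requires.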



Answer 3:

Interesting question. Try this; it should work.

;with dones as (
    select
        nn = ROW_NUMBER() over(order by [Date])
        ,*
    from YourTable
    where action = 'Done'

), ranges as (
    -- d1 = the 'Done' row that closes the previous record, d2 = the one that closes this record
    select
        [Record] = CHAR(ASCII('A') - 1 + ISNULL(d2.nn, d1.nn + 1))  --  A,B,C,... - CHAR returns NULL once the code exceeds 255
        ,dtFrom = d1.[Date]
        ,dtTo = d2.[Date]
    from dones d1
    full join dones d2 on d2.nn = d1.nn + 1

)
    select 
        ranges.[Record]
        ,DateBegin  = MIN(tt.[date]) 
        ,DateEnd = ranges.dtTo
        ,[#Actions] = COUNT(*)
        ,[#Escalations] = SUM(case when tt.Action = 'Escalated' then 1 else 0 end)
    from YourTable tt
    inner join ranges 
        on (ranges.dtFrom is null or tt.[date] > ranges.dtFrom ) 
        and (ranges.dtTo is null or tt.[date] <= ranges.dtTo)
    group by ranges.[Record], ranges.dtTo;


Source: making subgroup aggregations in sql based on a column value