SQL Server : how many days each item was in each s

2019-09-17 21:38发布

问题:

Given a table that stores every revision for every item.

For example:

+--------+----------+---------------+--------+---------------------+
| ItemId | Revision | PreviousState | State  |     DateChanged     |
+--------+----------+---------------+--------+---------------------+
|      1 |        1 | NULL          | New    | 2014-11-13 10:00:00 |
|      1 |        2 | New           | Active | 2014-11-15 10:00:00 |
|      1 |        3 | Active        | New    | 2014-11-17 10:00:00 |
|      1 |        4 | New           | Active | 2014-11-19 10:00:00 |
|      1 |        5 | New           | Active | 2014-11-20 10:00:00 |
|      1 |        6 | Active        | Closed | 2014-11-22 10:00:00 |
|      2 |        1 | NULL          | New    | 2014-11-13 10:00:00 |
|      2 |        2 | New           | Active | 2014-11-16 10:00:00 |
|      2 |        3 | Active        | Closed | 2014-11-17 10:00:00 |
|      2 |        4 | Closed        | Active | 2014-11-19 10:00:00 |
|      2 |        5 | Active        | Closed | 2014-11-21 10:00:00 |
+--------+----------+---------------+--------+---------------------+

I need to calculate how many days each item was in each state (except 'Close').

Result should be like this:

+--------+-----+--------+
| ItemId | New | Active |
+--------+-----+--------+
|      1 |   4 |      5 |
|      2 |   3 |      3 |
+--------+-----+--------+

I tried to use two approaches - GROUP BY and nested cursors.

Using cursors (especially nested cursors) is a bad practice. And the are very slow.

GROUP BY also won't work because there is no strict order of states (New -> Active -> Closed). It could be chaotic New -> Active -> Closed -> Active -> Closed -> New -> Closed.

I don't see any other way to calculate it without iterating all the records and comparing states.

Is there any solution?

Thanks in advance.

回答1:

This gives you the same results you're asking for, in a slightly different format (but you can easily find PIVOT solutions if you need the exact same result set):

declare @t table (ItemId int,Revision int,State varchar(19),DateChanged datetime2)
insert into @t(ItemId,Revision,State,DateChanged) values
(1,1,'New',   '2014-11-13T10:00:00'),
(1,2,'Active','2014-11-15T10:00:00'),
(1,3,'New',   '2014-11-17T10:00:00'),
(1,4,'Active','2014-11-19T10:00:00'),
(1,5,'Active','2014-11-20T10:00:00'),
(1,6,'Closed','2014-11-22T10:00:00'),
(2,1,'New',   '2014-11-13T10:00:00'),
(2,2,'Active','2014-11-16T10:00:00'),
(2,3,'Closed','2014-11-17T10:00:00'),
(2,4,'Active','2014-11-19T10:00:00'),
(2,5,'Closed','2014-11-21T10:00:00')

;With Joined as (
    select t1.ItemId,t1.State,DATEDIFF(day,t1.DateChanged,t2.DateChanged) as Days
    from
        @t t1
            inner join
        @t t2
            on
                t1.ItemId = t2.ItemId and
                t1.Revision = t2.Revision -1
    )
select ItemId,State,SUM(Days)
from Joined
where State <> 'Closed'
group by ItemId,State

Result:

ItemId      State               
----------- ------------------- -----------
1           Active              5
1           New                 4
2           Active              3
2           New                 3

Note that I'm ignoring the PreviousState column from your question and am instead constructing Joined because what really matters is when the next state came into effect.


Issues not dealt with because you've not described them in your question: 1) What to do if the current final state isn't Closed - i.e. do we ignore that, or count until today?, and 2) What to do if the time of day for each DateChanged isn't the same - do we have to handle partial days?



回答2:

Personally I like the CTE from [Damien_The_Unbeliever], I need to use them more often. Using inner joins I basically do the same thing the add a pivot wrapper around the results to get what you are looking for: (replace @t for your real table name)

SELECT ItemId , [New],[Active]
FROM
(
SELECT 
    ItemId , LASTSTATE,  DATEDIFF(D, LASTDATE, DateChanged) AS D 
FROM 
    @T AS T 
    INNER JOIN 
        (SELECT 
            ItemId as ItemLink, 
            Revision + 1 AS RevLink , 
            DateChanged AS LASTDATE , 
            State AS LASTSTATE from @t
        ) AS L ON T.ItemId = L.ItemLink AND T.Revision = L.RevLink
) AS P PIVOT ( SUM(D) FOR LASTSTATE IN ([New],[Active],[Closed])) AS DATA