Missing gaps in recurring series within a group

2019-03-05 05:45发布

问题:

We have a table with following data

Id,ItemId,SeqNumber;DateTimeTrx
1,100,254,2011-12-01 09:00:00
2,100,1,2011-12-01 09:10:00
3,200,7,2011-12-02 11:00:00
4,200,5,2011-12-02 10:00:00
5,100,255,2011-12-01 09:05:00
6,200,3,2011-12-02 09:00:00
7,300,0,2011-12-03 10:00:00
8,300,255,2011-12-03 11:00:00
9,300,1,2011-12-03 10:30:00

Id is an identity column. The sequence for an ItemId starts from 0 and goes till 255 and then resets to 0. All this information is stored in a table called Item. The order of sequence number is determined by the DateTimeTrx but such data can enter any time into the system. The expected output is as shown below-

ItemId,PrevorNext,SeqNumber,DateTimeTrx,MissingNumber
100,Previous,255,2011-12-01 09:05:00,0
100,Next,1,2011-12-01 09:10:00,0
200,Previous,3,2011-12-02 09:00:00,4
200,Next,5,2011-12-02 10:00:00,4
200,Previous,5,2011-12-02 10:00:00,6
200,Next,7,2011-12-02 11:00:00,6
300,Previous,1,2011-12-03 10:30:00,2
300,Next,255,2011-12-03 16:30:00,2

We need to get those rows one before and one after the missing sequence. In the above example for ItemId 300 - the record with sequence 1 has entered first (2011-12-03 10:30:00) and then 255(2011-12-03 16:30:00), hence the missing number here is 2. So 1 is previous and 255 is next and 2 is the first missing number. Coming to ItemId 100, the record with sequence 255 has entered first (2011-12-02 09:05:00) and then 1 (2011-12-02 09:10:00), hence 255 is previous and then 1, hence 0 is the first missing number.

In the above expected result, MissingNumber column is the first occuring missing number just to illustrate the example.

We will not have a case where we would have a complete series reset at one time i.e. it can be either a series rundown from 255 to 0 as in for itemid 100 or 0 to 255 as in ItemId 300. Hence we need to identify sequence missing when in ascending order (0,1,...255) or either in descending order (254,254,0,2) etc.

How can we accomplish this in a t-sql?

回答1:

Could work like this:

;WITH b AS (
   SELECT *
         ,row_number() OVER (ORDER BY ItemId, DateTimeTrx, SeqNumber) AS rn
   FROM   tbl
   ), x AS (
   SELECT
       b.Id
      ,b.ItemId    AS prev_Itm
      ,b.SeqNumber AS prev_Seq
      ,c.ItemId    AS next_Itm
      ,c.SeqNumber AS next_Seq
   FROM   b
   JOIN   b c ON c.rn = b.rn + 1                -- next row
   WHERE  c.ItemId = b.ItemId                   -- only with same ItemId
   AND    c.SeqNumber <> (b.SeqNumber + 1)%256  -- Seq cycles modulo 256
   )
SELECT Id, prev_Itm, 'Previous' AS PrevNext, prev_Seq
FROM   x
UNION  ALL
SELECT Id, next_Itm ,'Next', next_Seq
FROM   x
ORDER  BY Id, PrevNext DESC

Produces exactly the requested result.
See a complete working demo on data.SE.

This solution takes gaps in the Id column into consideration, as there is no mention of a gapless sequence of Ids in the question.


Edit2: Answer to updated question:

I updated the CTE in the query above to match your latest verstion - or so I think.

Use those columns that define the sequence of rows. Add as many columns to your ORDER BY clause as necessary to break ties.

The explanation to your latest update is not entirely clear to me, but I think you only need to squeeze in DateTimeTrx to achieve what you want. I have SeqNumber in the ORDER BY additionally to break ties left by identical DateTimeTrx. I edited the query above.



标签: sql tsql