I am using SQL Server 2012.
I want to create a row_number based on whether the index in subsequent rows are increasing by 1 or more. For example, say I have a table that looks like:
event row_index
1 24
2 25
3 26
4 30
5 31
6 42
7 43
8 44
9 45
Then what I want to do is create a column at the end, called seq_ID:
event row_index seq_id
1 24 1
2 25 1
3 26 1
4 30 2
5 31 2
6 42 3
7 43 3
8 44 3
9 45 3
basically, the seq_id only chances if the difference between subsequent row indexes is > 1. I have tried to use:
rank() over (partition by 'not sure what to put here' )
Try this:
;with cte as
(select *,
coalesce(row_index - (lag(row_index) over (order by event)),1) diff
from tbl
),
cte2 as
(select *,
(select max(diff)
from cte c
where c.row_index <= d.row_index
) minri
from cte d
)
select event, row_index, minri,
dense_rank() over (order by minri) rn
from cte2
- The first CTE gets the differences using the
lag
function
(available from SQL Server 2012 onwards).
- The next CTE calculates
when the difference exceeds 1 and assigns all records after that
point to a 'group', until the next difference <> 1 is found. This is
the key step in grouping.
- The last step is to use
dense_rank
over
the indicator calculated in the previous step to get the row numbers
as required.
This solution does have a limitation in that it will fail if the differences are not in increasing order i.e. if you have two more values in the sample data such as 52 and 53, it will classify them into group 3 instead of creating a new group.
Demo
Update: The below approach can overcome the above limitation:
;with cte as
(select *,
coalesce(row_index - (lag(row_index) over (order by event)),1) diff
from tbl)
,cte2 as
(select *,
diff - coalesce(diff - (lag(diff) over (order by event)),0) tmp
from cte d)
select event,row_index,
1 + sum(case when tmp >= diff then 0 else 1 end) over (order by event) risum
from cte2
Again the first step remains the same. But in step 2, we check only for transition to a different value of the difference between successive values, instead of using a min/max function. The ranking then uses a conditional sum to assign a group for each value in the original data.
Demo
This can be further simplified to:
select event, row_index,
sum(case when diff <= 1 then 0 else 1 end) over (order by event) as rb
from
(select *,
row_index - (lag(row_index) over (order by event)) diff
from tbl
) s
With a huge number of records it may performs badly, but give it a try:
;with T as (
select event, rowindex, rowindex-row_number() over (order by event) as rn from YourTableName
)
select event, (select count(distinct rn)+1 from T where rn<TT.rn) as seq_id
from T TT order by event