Detect gaps over 30 min in timestamp column

I have read up on and attempted using the standard method of gaps and island detection in a series with no success because I need to be able to ignore any gaps less than 30 minutes. I can not use a cursor due to performance issues.

Everytime there is a gap of at least 30 min, I need a new row with the start and end. If there are no gaps of at least 30, result would be one row with the min and max of the timestamps. If there is 1 gap of at least 30, there would be 2 rows - from the start of the series to the gap and from the gap to the end. If there are more gaps, we get rows for each interval between the gaps, etc.

Input:

timestamp 

2015-07-15 15:01:21 
2015-07-15 15:17:44 
2015-07-15 15:17:53 
2015-07-15 15:18:34 
2015-07-15 15:21:41 
2015-07-15 15:58:12 
2015-07-15 15:59:12 
2015-07-15 16:05:12
2015-07-15 17:02:12

Desired output :

from | to

2015-07-15 15:01:21 | 2015-07-15 15:21:41 
2015-07-15 15:58:12 | 2015-07-15 16:05:12
2015-07-15 17:02:12 | 2015-07-15 17:02:12

标签： sql sql-server sql-server-2008 tsql

4条回答

神经病院院长

2楼-- · 2019-09-20 05:31

Easy solution using common table expression. Compare with cursor performance if you have at least 1000 rows.

create table #tmp (Dt datetime)

insert into #tmp values 
('2015-07-15 15:01:21'),
('2015-07-15 15:17:44'), 
('2015-07-15 15:17:53'), 
('2015-07-15 15:18:34'), 
('2015-07-15 15:21:41'), 
('2015-07-15 15:58:12'), 
('2015-07-15 15:59:12'), 
('2015-07-15 16:05:12'),
('2015-07-15 17:02:12')

;with tbl as (
select dt, row_number() over(order by dt) rn
from #tmp
)
select t1.dt [from],t2.dt [to], datediff(minute,t1.dt,t2.dt) gap
from tbl t1 
inner join tbl t2 on t1.rn+1 = t2.rn
where datediff(minute,t1.dt,t2.dt) >30

0人赞添加讨论(0) 举报

霸刀☆藐视天下

3楼-- · 2019-09-20 05:35

;with cc as (
  select Dt 
    , (select top 1 Dt from #tmp where Dt > ot.Dt order by Dt) as NextDt
  from #tmp ot
)
select Dt AS [FROM], NextDt AS [TO]
from cc
where DATEDIFF(minute, Dt, NextDt) >= 30
order by Dt

Test

create table #tmp (Dt datetime)

insert into #tmp values 
('2015-07-15 15:01:21'),
('2015-07-15 15:17:44'), 
('2015-07-15 15:17:53'), 
('2015-07-15 15:18:34'), 
('2015-07-15 15:21:41'), 
('2015-07-15 15:58:12'), 
('2015-07-15 15:59:12'), 
('2015-07-15 16:05:12'),
('2015-07-15 17:02:12')

Result

FROM                           TO
2015-07-15 15:21:41.000        2015-07-15 15:58:12.000
2015-07-15 16:05:12.000        2015-07-15 17:02:12.000

_{If a post answers your question, please Mark it as answer}

0人赞添加讨论(0) 举报

聊天终结者

4楼-- · 2019-09-20 05:39

;with Boundaries as (
    select
        "timestamp" as Stamp,
        coalesce(
            case when datediff(second, prev_timestamp, "timestamp") >= 1800 
                then 1 else 0 end,
            1
        ) as IsBoundary
    from
        T t cross apply (
            select max(t2."timestamp") as prev_timestamp from T t2
            where t2."timestamp" < t."timestamp"
        ) as n
), Blocks as (
    select Stamp, sum(IsBoundary) over (order by Stamp) as BlockNum
    from Boundaries
)
select min(Stamp) as "from", max(Stamp) as "to"
from Blocks
group by BlockNum

Remember that datediff() just counts interval boundaries so you need to take some care with the time difference. I'm using 1800 seconds here.

If you had a later edition of SQL Server you could use lead()/lag() to look for the gaps. But instead of an inner join the cross apply will hopefully be significantly faster.

I think your title "find gaps over 30 minutes" as well as a comment to "ignore gaps less than 30 minutes" has misled some posters who thought that you were only seeking to find the rows that bounded a gap of 30 minutes. This is the only one that addresses the problem according to your desired output. (Test here.)

Instead of using analytic sum() over (order by...) it's a simple matter of replacing it with a scalar subquery.

...
), Blocks as (
    select
      Stamp,
      (select sum(b2.IsBoundary) from Boundaries b2 where b2.Stamp <= b.Stamp) as BlockNum
    from Boundaries b
) ...

0人赞添加讨论(0) 举报

欢心

5楼-- · 2019-09-20 05:44

An approach like the one below would seem to fit the bill assuming the performance is okay. I used the sys.all_tables catalog view to simulate a log table like the one in your example. You can vary the first argument to datediff to get different results.

WITH [Sequenced_Entries] AS
(
    SELECT
        row_number() OVER(ORDER BY [modify_date] ASC) AS 'sequence',
        [modify_date] AS 'event_date'
    FROM
        sys.all_objects
)
SELECT
    f.[sequence] AS 'from_event_sequence',
    f.[event_date] AS 'from_event_date',
    t.[sequence] AS 'to_event_sequence',
    t.[event_date] AS 'to_event_date'
FROM
    [Sequenced_Entries] AS f
        INNER JOIN
    [Sequenced_Entries] AS t
        ON (f.[sequence] = t.[sequence] - 1)
WHERE
    datediff(second, f.[event_date], t.[event_date]) < 30
ORDER BY
    f.[sequence] ASC

Don't forget the statement before this one in the batch must be terminated with a ; or this statement must start with one.

0人赞添加讨论(0) 举报

Detect gaps over 30 min in timestamp column

采纳回答

编辑标签

举报内容

检举类型

检举原因

检举说明(必填)

打开微信“扫一扫”，打开网页后点击屏幕右上角分享按钮

付费偷看金额在0.1-10元之间