Find Total Minutes Ignoring Overlap (Convert Curso

2020-03-26 04:18发布

问题:

There is an existing question that asked how to find how many minutes there are in multiple date ranges, ignoring overlaps.

The example data given is (userID isn't particularly relevant)

--Available--
ID  userID  availStart          availEnd
1   456     '2012-11-19 16:00'  '2012-11-19 17:00'
2   456     '2012-11-19 16:00'  '2012-11-19 16:50'
3   456     '2012-11-19 18:00'  '2012-11-19 18:30'
4   456     '2012-11-19 17:30'  '2012-11-19 18:10'
5   456     '2012-11-19 16:00'  '2012-11-19 17:10'
6   456     '2012-11-19 16:00'  '2012-11-19 16:50'

I can solve the problem using a cursor, but I think it should be adaptable to a CTE, however I can't figure out how to do it.

The method is to arrange each range by start time Then we build a range that coalesces ranges in order, until we find a range that doesn't overlap our coalesced range. We then calculate how many minutes are in the coalesced range, and remember this We carry on with the next ranges, again coalesing any that overlap. We accumulate minutes each time we get a non overlapping start point At the end we add the accumulated minutes onto the length of the last range

It's fairly easy to see that because of the order, once a range is distinct from what's gone before then no further ranges could overlap what's gone before, as their start dates are all greater.

Declare
  @UserID int = 456,
  @CurStart datetime, -- our current coalesced range start
  @CurEnd datetime, -- our current coalesced range end
  @AvailStart datetime, -- start or range for our next row of data
  @AvailEnd datetime, -- end of range for our next row of data
  @AccumMinutes int = 0 -- how many minutes so far accumulated by distinct ranges

Declare MinCursor Cursor Fast_Forward For
Select
  AvailStart, AvailEnd
From
  dbo.Available
Where
  UserID = @UserID
Order By
  AvailStart

Open MinCursor

Fetch Next From MinCursor Into @AvailStart, @AvailEnd
Set @CurStart = @AvailStart
Set @CurEnd = @AvailEnd

While @@Fetch_Status = 0
Begin
  If @AvailStart <= @CurEnd -- Ranges Overlap, so coalesce and continue
    Begin
    If @AvailEnd > @CurEnd 
      Set @CurEnd = @AvailEnd
    End
  Else -- Distinct range, coalesce minutes from previous range
  Begin
    Set @AccumMinutes = @AccumMinutes + DateDiff(Minute, @CurStart, @CurEnd)
    Set @CurStart = @AvailStart -- Start coalescing a new range
    Set @CurEnd = @AvailEnd
  End
  Fetch Next From MinCursor Into @AvailStart, @AvailEnd
End

Select @AccumMinutes + DateDiff(Minute, @CurStart, @CurEnd) As TotalMinutes

Close MinCursor
Deallocate MinCursor;

Got the CTE working, was just a silly error in the recursion. The query plan explosion is quite impressive:

With OrderedRanges as (
  Select
    Row_Number() Over (Partition By UserID Order By AvailStart) AS RN,
    AvailStart,
    AvailEnd
  From
    dbo.Available
  Where
    UserID = 456
),
AccumulateMinutes (RN, Accum, CurStart, CurEnd) as (
  Select
    RN, 0, AvailStart, AvailEnd
  From
    OrderedRanges
  Where 
    RN = 1
  Union All
  Select
    o.RN, 
    a.Accum + Case When o.AvailStart <= a.CurEnd Then
        0
      Else 
        DateDiff(Minute, a.CurStart, a.CurEnd)
      End,
    Case When o.AvailStart <= a.CurEnd Then 
        a.CurStart
      Else
        o.AvailStart
      End,
    Case When o.AvailStart <= a.CurEnd Then
        Case When a.CurEnd > o.AvailEnd Then a.CurEnd Else o.AvailEnd End
      Else
        o.AvailEnd
      End
  From
    AccumulateMinutes a
        Inner Join 
    OrderedRanges o On 
        a.RN = o.RN - 1
)

Select Max(Accum + datediff(Minute, CurStart, CurEnd)) From AccumulateMinutes 

Is this adaptable to a CTE, and is there a general pattern for accumulating over a list in thie way?

http://sqlfiddle.com/#!6/ac021/2

回答1:

The following query finds the periods in the data, according to your definition. It uses correlated subqueries first to determine whether a record is the start of a period (that is, no overlap with earlier time periods). It then assigns the "periodStart" as the most recent start that is the beginning of a non-overlapping period.

The following (untested) query takes this approach:

with TimeWithOverlap as (
     select t.*,
            (case when exists (select * from dbo.Available tbefore where t.availStart > tbefore.availStart and tbefore.availEnd >= t.availStart)
                  then 0
                  else 1
             end) as IsPeriodStart
     from dbo.Available t 
    ),
    TimeWithPeriodStart as (
     select two.*,
            (select MAX(two1.AvailStart) from TimeWithOverlap two1 where IsPeriodStart = 1 and two1.AvailStart <= two.AvailStart
            ) as periodStart
     from TimeWithOverlap two
    )
select periodStart, MAX(AvailEnd) as periodEnd
from TimeWithPeriodStart twps
group by periodStart;

http://sqlfiddle.com/#!6/3483c/20 (Second Query)

If two periods both start at the same time, then it still works, because the AvailStart values are the same. Because of the correlated subqueries, this might not perform very well on even medium sized data sets.

There are other methods for approaching this. For instance, if you had SQL Server 2012, you would be able to use cumulative sum functions, which offer a simpler method.



回答2:

I solved that (well, in a way) very efficiently by creating a dumb table having the date and time (accurate be the minute) in one column (PK) and a bit in the second. A '1' meant, the user is available and 0 meant, he/she's not.

The rest is dead simple. I was sick of having to write endless complicated queries in trying to get the minutes in partly overlapping time ranges.

In fact, this was for computing machine efficiency.

I know this is not the real deal but the most simple solution I came up with. You might create a function/SP which creates that table..