可以将文章内容翻译成中文,广告屏蔽插件可能会导致该功能失效(如失效,请关闭广告屏蔽插件后再试):
问题:
The scenario is users specify when they are available, these specified times can overlap each other. I'm trying to get the total time they are available for. Example with SQL Fiddle:
--Available--
ID userID availStart availEnd
1 456 '2012-11-19 16:00' '2012-11-19 17:00'
2 456 '2012-11-19 16:00' '2012-11-19 16:50'
3 456 '2012-11-19 18:00' '2012-11-19 18:30'
4 456 '2012-11-19 17:30' '2012-11-19 18:10'
5 456 '2012-11-19 16:00' '2012-11-19 17:10'
6 456 '2012-11-19 16:00' '2012-11-19 16:50'
The output should be 130 minutes:
1: 60
2: 0 as falls inside 1
3: 30
4: 30 as the last 10 mins is covered by 3
5: 10 as first 60 mins is covered by 1
6: 0 as falls inside 1
I can get the total overlapping minutes, however this is more than the SUM of the available minutes:
SQL Fiddle
Any ideas how I can achieve this?
EDIT 21st Nov 12: Thanks so far for everyone's solutions - in a way I'm happy to see this wasn't an 'easy' query to write.
EDIT 23rd Nov 12: This is all great work. Internally here we're thinking it might be best to ensure users cannot enter overlapping times (eg forcing them to amend an existing entry)!
回答1:
Gordon Linoff has a CTE based answer
I've done some performance analysis on all the working algorithms
Blank values mean it took too long. This is tested on a single Core i7 X920 @2GHz chip, backed by a couple of SSDs. The only index created was a cluster on UserID, AvailStart. If you think you can improve any of the performance, let me know.
This CTE version was worse than linear, SQL Server can't do the RN = RN + 1 join in an efficient way. I rectified this with a hybrid approach below, where I save and index the first CTE into a table variable. This still takes ten times as much IO as the cursor based approach.
With OrderedRanges as (
Select
Row_Number() Over (Partition By UserID Order By AvailStart) AS RN,
AvailStart,
AvailEnd
From
dbo.Available
Where
UserID = 456
),
AccumulateMinutes (RN, Accum, CurStart, CurEnd) as (
Select
RN, 0, AvailStart, AvailEnd
From
OrderedRanges
Where
RN = 1
Union All
Select
o.RN,
a.Accum + Case When o.AvailStart <= a.CurEnd Then
0
Else
DateDiff(Minute, a.CurStart, a.CurEnd)
End,
Case When o.AvailStart <= a.CurEnd Then
a.CurStart
Else
o.AvailStart
End,
Case When o.AvailStart <= a.CurEnd Then
Case When a.CurEnd > o.AvailEnd Then a.CurEnd Else o.AvailEnd End
Else
o.AvailEnd
End
From
AccumulateMinutes a
Inner Join
OrderedRanges o On
a.RN = o.RN - 1
)
Select Max(Accum + datediff(Minute, CurStart, CurEnd)) From AccumulateMinutes
http://sqlfiddle.com/#!6/ac021/2
After doing some performance analysis, here's a hybrid CTE/table variable version that performs better than anything except the cursor based approach
Create Function dbo.AvailMinutesHybrid(@UserID int) Returns Int As
Begin
Declare @UserRanges Table (
RN int not null primary key,
AvailStart datetime,
AvailEnd datetime
)
Declare @Ret int = Null
;With OrderedRanges as (
Select
Row_Number() Over (Partition By UserID Order By AvailStart) AS RN,
AvailStart,
AvailEnd
From
dbo.Available
Where
UserID = @UserID
)
Insert Into @UserRanges Select * From OrderedRanges
;With AccumulateMinutes (RN,Accum, CurStart, CurEnd) as (
Select
RN, 0, AvailStart, AvailEnd
From
@UserRanges
Where
RN = 1
Union All
Select
o.RN,
a.Accum + Case When o.AvailStart <= a.CurEnd Then
0
Else
DateDiff(Minute, a.CurStart, a.CurEnd)
End,
Case When o.AvailStart <= a.CurEnd Then
a.CurStart
Else
o.AvailStart
End,
Case When o.AvailStart <= a.CurEnd Then
Case When a.CurEnd > o.AvailEnd Then a.CurEnd Else o.AvailEnd End
Else
o.AvailEnd
End
From
AccumulateMinutes a
Inner Join
@UserRanges o On
a.RN + 1 = o.RN
)
Select
@Ret = Max(Accum + datediff(Minute, CurStart, CurEnd))
From
AccumulateMinutes
Option
(MaxRecursion 0)
Return @Ret
End
http://sqlfiddle.com/#!6/bfd94
回答2:
Here's another way of doing it with a cursor. I feel this techinique should be adaptable to a CTE, but I can't figure out how to do it
The method is to arrange each range by start time
Then we build a range that coalesces ranges in order, until we find
a range that doesn't overlap our coalesced range.
We then calculate how many minutes are in the coalesced range, and remember this
We carry on with the next ranges, again coalesing any that overlap.
We accumulate minutes each time we get a non overlapping start point
At the end we add the accumulated minutes onto the length of the last range
It's fairly easy to see that because of the order, once a range is distinct from
what's gone before then no further ranges could overlap what's gone before, as their
start dates are all greater.
Declare
@UserID int = 456,
@CurStart datetime, -- our current coalesced range start
@CurEnd datetime, -- our current coalsced range end
@AvailStart datetime, -- start or range for our next row of data
@AvailEnd datetime, -- end of range for our next row of data
@AccumMinutes int = 0 -- how many minutes so far accumulated by distinct ranges
Declare MinCursor Cursor Fast_Forward For
Select
AvailStart, AvailEnd
From
dbo.Available
Where
UserID = @UserID
Order By
AvailStart
Open MinCursor
Fetch Next From MinCursor Into @AvailStart, @AvailEnd
Set @CurStart = @AvailStart
Set @CurEnd = @AvailEnd
While @@Fetch_Status = 0
Begin
If @AvailStart <= @CurEnd -- Ranges Overlap, so coalesce and continue
Begin
If @AvailEnd > @CurEnd
Set @CurEnd = @AvailEnd
End
Else -- Distinct range, coalesce minutes from previous range
Begin
Set @AccumMinutes = @AccumMinutes + DateDiff(Minute, @CurStart, @CurEnd)
Set @CurStart = @AvailStart -- Start coalescing a new range
Set @CurEnd = @AvailEnd
End
Fetch Next From MinCursor Into @AvailStart, @AvailEnd
End
Select @AccumMinutes + DateDiff(Minute, @CurStart, @CurEnd) As TotalMinutes
Close MinCursor
Deallocate MinCursor;
http://sqlfiddle.com/#!6/3483c/15
回答3:
The main problem is that you can have chains of overlapping entries, so you need to combine an indefinite amount of times to remove all the overlap - this is more suited to a procedural method than SQL. But if you would prefer to not use temporary tables, here's a CTE method - keep in mind that CTEs can only recurse a given number of times, so if you have any particularly long chains, it will fail.
WITH MergedAvailable
AS
(
SELECT Available.UserID, Available.AvailStart, MAX(Available.AvailEnd) AS AvailEnd
FROM Available
WHERE (
SELECT COUNT(*)
FROM Available AS InnerAvailable
WHERE InnerAvailable.AvailStart < Available.AvailStart
AND
InnerAvailable.AvailEnd >= Available.AvailStart
) = 0
GROUP BY Available.UserID, Available.AvailStart
UNION ALL
SELECT MergedAvailable.UserID, MergedAvailable.AvailStart,
LongestExtensionToAvailableInterval.NewIntervalEnd
FROM MergedAvailable
CROSS APPLY GetLongestExtensionToAvailableInterval(MergedAvailable.UserID,
MergedAvailable.AvailStart,
MergedAvailable.AvailEnd) AS LongestExtensionToAvailableInterval
WHERE LongestExtensionToAvailableInterval.NewIntervalEnd IS NOT NULL
)
SELECT SUM(DATEDIFF(MINUTE,
FinalAvailable.AvailStart,
FinalAvailable.AvailEnd)) AS MinsAvailable
FROM (
SELECT MergedAvailable.UserID, MergedAvailable.AvailStart,
MAX(MergedAvailable.AvailEnd) AS AvailEnd
FROM MergedAvailable
GROUP BY MergedAvailable.UserID, MergedAvailable.AvailStart
) AS FinalAvailable
This table function is required:
CREATE FUNCTION GetLongestExtensionToAvailableInterval
(
@UserID int,
@CurrentIntervalStart datetime,
@CurrentIntervalEnd datetime
)
RETURNS TABLE
AS
RETURN
SELECT MAX(Available.AvailEnd) AS NewIntervalEnd
FROM Available
WHERE Available.UserID = @UserID
AND
Available.AvailStart > @CurrentIntervalStart
AND
Available.AvailStart <= @CurrentIntervalEnd
AND
Available.AvailEnd > @CurrentIntervalEnd
The general idea is that it starts from all ranges where the start of the range isn't overlapping anything, and then with every recursion it extends the current range to the furthest extent of the currently overlapping ranges. The table function is needed to determine the furthest extent, as recursing sections of CTEs are not allowed to included plain aggregates.
With the data you've provided, the starting rows are:
456 2012-11-19 16:00 2012-11-19 17:10
456 2012-11-19 17:30 2012-11-19 18:10
The only row which ends up being added via the recursion is:
456 2012-11-19 17:30 2012-11-19 18:30
For the sake of the example, say you had a row with ID 7 which went from 18:20 to 19:20. Then there would be a second recursion which brought back the row:
456 2012-11-19 17:30 2012-11-19 19:20
So while the query will get to the start and end of each overlapping range, it will also be bringing back all the intermediate stages. This is why we need to take the aggregate maximum end date for each start date after the CTE, to remove them.
回答4:
Condition t1.availStart > t2.availEnd OR t1.availEnd < t2.availStart check period which will never be crossed. If it is crossed then minimum availStart or maximum availEnd esle availStart or availEnd.
Probably more than one crossing period.
In your case it
16:00:00 - 17:10:00 includes the ranges:16:00:00 - 16:50:00,
16:00:00 - 16:50:00,
16:00:00 - 17:00:00,
16:00:00 - 17:10:00
17:30:00 - 18:30:00 includes the ranges:17:30:00 - 18:10:00,
18:00:00 - 18:30:00
UPDATE 21.11.2012; 30.11.2012; 04.01.2013
CREATE FUNCTION dbo.Overlap
(
@availStart datetime,
@availEnd datetime,
@availStart2 datetime,
@availEnd2 datetime
)
RETURNS TABLE
RETURN
SELECT CASE WHEN @availStart >= @availEnd2 OR @availEnd <= @availStart2
THEN @availStart ELSE
CASE WHEN @availStart > @availStart2 THEN @availStart2 ELSE @availStart END
END AS availStart,
CASE WHEN @availStart >= @availEnd2 OR @availEnd <= @availStart2
THEN @availEnd ELSE
CASE WHEN @availEnd > @availEnd2 THEN @availEnd ELSE @availEnd2 END
END AS availEnd
;WITH cte AS
(
SELECT userID, availStart, availEnd, ROW_NUMBER() OVER (PARTITION BY UserID ORDER BY AvailStart) AS Id
FROM dbo.test53
), cte2 AS
(
SELECT Id, availStart, availEnd
FROM cte
WHERE Id = 1
UNION ALL
SELECT c.Id, o.availStart, o.availEnd
FROM cte c JOIN cte2 ct ON c.Id = ct.Id + 1
CROSS APPLY dbo.Overlap(c.availStart, c.availEnd, ct.availStart, ct.availEnd) AS o
)
SELECT TOP 1 SUM(DATEDIFF(minute, availStart, MAX(availEnd))) OVER()
FROM cte2
GROUP BY availStart
Demo on SQLFiddle
回答5:
Create Table #Available (
ID int not null primary key,
UserID int not null,
AvailStart datetime not null,
AvailEnd datetime not null
)
Insert Into #Available (ID,UserID, AvailStart, AvailEnd) Values
(1,456, '2012-11-19 16:00', '2012-11-19 17:00'),
(2,456, '2012-11-19 16:00', '2012-11-19 16:50'),
(3,456, '2012-11-19 18:00', '2012-11-19 18:30'),
(4,456, '2012-11-19 17:30', '2012-11-19 18:10'),
(5,456, '2012-11-19 16:00', '2012-11-19 17:10'),
(6,456, '2012-11-19 16:00', '2012-11-19 16:50'),
(7,457, '2012-11-19 16:00', '2012-11-19 17:10'),
(8,457, '2012-11-19 16:00', '2012-11-19 16:50');
Select Distinct UserID
into #users
from #Available
Create Table #mins(UserID int,atime datetime,aset tinyint )
Declare @start Datetime
Declare @end Datetime
Select @start=min(AvailStart),@end=max(AvailEnd) from #Available
While @start<@end
begin
insert into #mins(UserID,atime)
Select UserID ,@Start from #users
Select @start=DateAdd(mi,1,@start)
end
update #mins set aset=1
from #Available
where atime>=AvailStart and atime<Availend and #mins.UserID = #Available.UserID
select UserID,SUM(aset) as [Minutes]
from #mins
Group by UserID
Drop table #Available
Drop table #mins
Drop table #users