We have a table used for assigning trainers to planned customer installations. In the table, there is a record for each trainer, for each day of the year. (We can, and sometimes do, work weekends.) I am building a search tool that allows our schedulers to search for a trainer that is available X number of days between dates Y and Z.
Table mySchedule
Trainer Date Dirty (Bit)
------------------------------------------------
Joe 06/01/2013 0
Jessica 06/01/2013 0
Alan 06/01/2013 0
Heather 06/01/2013 0
Joe 06/02/2013 1
Jessica 06/02/2013 1
Alan 06/02/2013 0
Heather 06/02/2013 0
Joe 06/03/2013 1
Jessica 06/03/2013 1
Alan 06/03/2013 1
Heather 06/03/2013 0
This is a simplified version of my table, covering four trainers over 3 days. If they have something scheduled, Dirty = 1. If they are free to schedule, Dirty = 0.
What I would like to build is a query that allows for the following:
Define a Beginning and Ending Date that the work needs to occur.
Define the number of consecutive days that the trainer will be required.
Return each Trainer that matches, along with the first date they are available for a period of time at least equal to the number of days being requested.
Plain text example:
The customer asks for a trainer to be onsite for two days anytime in June. The query should return:
Alan, 06/01/2013
Heather, 06/01/2013
If the customer changed the request to three days in June, the query would return:
Heather, 06/01/2013
I've been searching for a few days now, and I've found some things that seemed close, but ultimately, I couldn't get them to work. In most cases, the failure has been in the form of insanely long execution times. Here are a few that seemed promising, and perhaps can be adapted by someone with stronger SQL-Fu than I am packing:
- How to find N Consecutive records in a table using SQL
- Microsoft T-SQL Counting Consecutive Records
- How to find date ranges in records with consecutive dates and duplicate data
not sure how this will perform against a larger data set, but it gets the right results for the data set provided. Missing data points are assumed to be available.
declare @startDate datetime, @endDate datetime, @days int
select @startDate = '6/1/2013', @endDate='6/3/2013', @days=2
select trainer, min(date)
from
(
select trainer,date,
(select top 1 date
from mySchedule sInner
where sInner.date > sOuter.date
and sInner.trainer = sOuter.trainer
and sInner.Dirty = 1
and sInner.date between @startDate and @endDate
order by sInner.date) as nextDirtyDate
from mySchedule sOuter
where sOuter.dirty=0
and sOuter.date between @startDate and @endDate
) sub
group by trainer, nextDirtyDate
having dateDiff(d, min(date), isNull(nextDirtyDate,dateAdd(d,1,@endDate))) >= @days
Assuming that a missing schedule record defaults to "available", something like this should work:
WITH cteRawData As
(
SELECT
S.Trainer,
S.Date,
-- If there are no later busy dates within the period,
-- assume they're free until the end of the period:
IsNull(ED.EndDate, DateAdd(day, 1, @EndDate)) As EndDate
FROM
mySchedule As S
CROSS APPLY
(
SELECT
Min(Date) As EndDate
FROM
mySchedule As S2
WHERE
S2.Trainer = S.Trainer
And
S2.Date > S.Date
And
S2.Date <= @EndDate
And
S2.Dirty = 1
) As ED
WHERE
S.Date Between @StartDate And @EndDate
And
S.Dirty = 0
),
cteData As
(
SELECT
Trainer,
Date,
DateDiff(day, Date, EndDate) As NumberOfDays
FROM
cteRawData
)
SELECT
Trainer,
Min(Date) As EarliestStartDate
FROM
cteData
WHERE
NumberOfDays >= @NumberOfDays
GROUP BY
Trainer
;
http://www.sqlfiddle.com/#!3/7b3e2/17
Note: I dont know how good the performance is,as it uses a running total.
Also it plays with dirty flag to get a positive number when dirty = 0
e.g. CASE WHEN B.Dirty = 0 THEN 1 ELSE -1 END
It uses the number of days at 2 places in the query
S.Date - NNN + 1 AS StartDate
HAVING Sum(CASE WHEN B.Dirty = 0 THEN 1 ELSE -1 END) = NNN
-- Note the need to put the value of 3 to be put at 2 places
SELECT S.Trainer, S.Date - 3 + 1 AS StartDate,
Sum(CASE WHEN B.Dirty = 0 THEN 1 ELSE -1 END) AS RunningAvl
FROM Schedule S INNER JOIN
Schedule B
ON S.Trainer = B.Trainer
WHERE B.Date <= S.Date
AND S.Date BETWEEN '2013-06-01' AND '2013-06-30'
AND B.Date BETWEEN '2013-06-01' AND '2013-06-30'
GROUP BY S.Trainer, S.Date
HAVING Sum(CASE WHEN B.Dirty = 0 THEN 1 ELSE -1 END) = 3
ORDER BY S.Trainer, S.Date ;
-- Note the need to put the value of 2 to be put at 2 places
SELECT S.Trainer, S.Date - 2 + 1 AS StartDate,
Sum(CASE WHEN B.Dirty = 0 THEN 1 ELSE -1 END) AS RunningAvl
FROM Schedule S INNER JOIN
Schedule B
ON S.Trainer = B.Trainer
WHERE B.Date <= S.Date
AND S.Date BETWEEN '2013-06-01' AND '2013-06-30'
AND B.Date BETWEEN '2013-06-01' AND '2013-06-30'
GROUP BY S.Trainer, S.Date
HAVING Sum(CASE WHEN B.Dirty = 0 THEN 1 ELSE -1 END) = 2
ORDER BY S.Trainer, S.Date
http://www.sqlfiddle.com/#!3/99f2d/1