T-SQL script - logic issues with timeline

2019-04-09 06:11发布

问题:

Two temp tables are created and then loaded...Here's the schema.

Create table #SH ([date] datetime,
        sched_id int,
        agent_id int)

Create table #SD (sched_id int,
        start_minute int,
        length int,
        exception_code int)

(Schema and design is something I can't change unfortunately, both temp tables are loaded from flat files. I can introduce and load new temp tables if needed).

A little background - The #SH header table holds a persons schedule as 'Start_minute' and goes for 'schedule_length' in minutes. For example, if start minute and schedule length were both 480, that would read as 8am (8am = 480th minute) and goes until 4pm (480 minutes later, 4pm = 960th minute)

The #SD table holds exceptions to the header. In the example above, the person would likely have a lunch exception which would be start_minute = 720 and length of 30 (12:00 - 12:30).

Date and agent_id is the only thing I'm interested out of #SH, the exception info in #sd is what I'm interested in.

This query works:

Select [date],#sd.start_minute,#sd.length,#sd.start_minute + #sd.length as 'end_minute',agent_id
from #SH 
inner join #SD on #SD.sched_id = #sh.sched_id

*end_minute is ultimately a calculated value of start+length = end

This returns something like:

   Date     Start  length   end

1 2010-11-11 600    30  630

2 2010-11-11 630    40  670

3 2010-11-11 750    15  765

4 2010-11-11 800    40  840

Now I wish I could say this was over and walk away...but data entry issues exist. In line 1 and 2, the end time of line 1 lines up with the start time in line 2 and should be combined so my results look like this :

Date     Start  length     end

1 2010-11-11 600    70  670

2 2010-11-11 750    15  765

3 2010-11-11 800    40  840

Any idea's on how to build this logic so I get the 3 lines back instead of 4? I'm working on joining the table to itself on #sd1.start + #sd1.length = #sd2.start for now.

And to further complicate...the example above was 2 lines that needed combined. I've come across a record that had 30 1 minute entries in succession that I will need to make into a single record. Fortunately they cannot overlap (you won't have 2 records occupying the same minutes), but I don't think the join statement I'm considering above will work for that.

回答1:

No need for a CTE, all you need is a helper table. Create it once, like so:

Create Table DayMinute(Minute Integer)
Declare @M Integer
Set @M = 1
While (@M <= 24*60)
Begin
  Insert Into DayMinute(Minute) Values(@M)
  Set @M = @M + 1
End

Then, all you need is a bit of tricksiness:

Select 
  DM.Minute,
  SD.Sched_ID
Into #MinutesWithException
From 
  DayMinute As DM
  Inner Join #SD As SD
    On DM.Minute Between SD.Start_Minute And SD.Start_Minute + Length

Select
  MWE.Sched_ID,
  SH.[Date],
  SH.Agent_ID,
  [Start_Minute] = MWE.Minute,
  [End_Minute] = (Select Min(Last.Minute) -- First one to have no successor
                  From #MinutesWithException As Last
                  Where Last.Sched_ID = MWE.Sched_ID
                    And Last.Minute > MWE.Minute
                    And Not Exists(Select *
                                   From #MinutesWithException As Next
                                   Where Next.Sched_ID = MWE.Sched_iD
                                     And Next.Minute = Last.Minute + 1))
From 
  #MinutesWithException As MWE
  Inner Join #SH As SH
    On MWE.Sched_ID = SH.Sched_ID
Where
  Not Exists(Select * -- All those without predecessor
             From #MinutesWithException As Previous
             Where Previous.Sched_ID = MWE.Sched_ID
               And Previous.Minute = MWE.Minute - 1)

Remember, a lot of SQL problems can be solved by rephrasing them. Don't ask "which ranges do not have a gap", ask "which minutes have an interval". The rest follows from there.



回答2:

If you use a recursive CTE to combine the results of the query you have above, you could chain up to 32767 records together. You might consider that approach if you don't think you'll ever get near that amount.

I created a working example, because I wasn't sure. Your grouping would be different, but this is the general idea:

CREATE TABLE times
(
[Date] datetime,
[start] int,
[length] int,
[end] int
)
INSERT INTO times([Date], [Start], [length], [End])
VALUES ('11/11/2010',600,30,630)
INSERT INTO times([Date], [Start], [length], [End])
VALUES ('11/11/2010',630,40,670)
INSERT INTO times([Date], [Start], [length], [End])
VALUES ('11/11/2010',750,15,765)
INSERT INTO times([Date], [Start], [length], [End])
VALUES ('11/11/2010',800,40,840)

;WITH chaintimes AS
(
    SELECT t1.Date, t1.start, t1.length, t1.[end]
    FROM times t1 LEFT JOIN times t2 ON t1.start = t2.[end]
    WHERE t2.[end] IS NULL
    UNION ALL
    SELECT times.Date, chaintimes.start, chaintimes.length + times.length AS length, times.[end]
    FROM times INNER JOIN chaintimes ON times.start = chaintimes.[end]
)
, start_maxlength AS
(
    SELECT date, start, max(length) AS maxlength
    FROM chaintimes
    group by date, start
)
SELECT * FROM chaintimes ct
INNER JOIN start_maxlength ml
ON ct.Date = ml.Date AND ct.start = ml.start AND ct.length = ml.maxlength