Group consecutive rows of same value using time sp

2019-01-25 11:53发布

问题:

Sorry for the vague title (I just don't know how to describe this conundrum)

Give the following schedule table for a classroom:

╔═══════════╦════════════╦═══════════╦═══════════╦═════════╗
║ Classroom ║ CourseName ║  Lesson   ║ StartTime ║ EndTime ║
╠═══════════╬════════════╬═══════════╬═══════════╬═════════╣
║      1001 ║ Course 1   ║ Lesson 1  ║      0800 ║    0900 ║
║      1001 ║ Course 1   ║ Lesson 2  ║      0900 ║    1000 ║
║      1001 ║ Course 1   ║ Lesson 3  ║      1000 ║    1100 ║
║      1001 ║ Course 2   ║ Lesson 10 ║      1100 ║    1200 ║
║      1001 ║ Course 2   ║ Lesson 11 ║      1200 ║    1300 ║
║      1001 ║ Course 1   ║ Lesson 4  ║      1300 ║    1400 ║
║      1001 ║ Course 1   ║ Lesson 5  ║      1400 ║    1500 ║
╚═══════════╩════════════╩═══════════╩═══════════╩═════════╝

I would like to group the table to display this:

╔═══════════╦════════════╦═══════════╦═════════╗
║ Classroom ║ CourseName ║ StartTime ║ EndTime ║
╠═══════════╬════════════╬═══════════╬═════════╣
║      1001 ║ Course 1   ║      0800 ║    1100 ║
║      1001 ║ Course 2   ║      1100 ║    1300 ║
║      1001 ║ Course 1   ║      1300 ║    1500 ║
╚═══════════╩════════════╩═══════════╩═════════╝

Basically we are looking at a schedule that show which crouse is using what classroom during a certain timespan...

My initial thought was: Group by Classroom and CourseName and take Max and Min for start\end time but that will not give me the time spans it will show as if Course 1 is using the Classroom from 08:00 - 16:00 with no break in the middle.

回答1:

The query determines each rows EndTime by using NOT EXISTS to make sure no other class or course of a different type is scheduled between a course range's StartTime and EndTime and then uses MIN and GROUP BY to find the StartTime.

The NOT EXISTS part ensures that there aren't "breaks" between the StartTime and EndTime ranges by searching for any rows that have an EndTime between StartTime and EndTime but belong to a different CourseName or CourseRoom.

SELECT    
    t0.ClassRoom,
    t0.CourseName,
    MIN(t0.StartTime),
    t0.EndTime
FROM (
    SELECT 
    t1.ClassRoom,
    t1.CourseName,
    t1.StartTime,
    (
        SELECT MAX(t2.EndTime)
        FROM tableA t2
        WHERE t2.CourseName = t1.CourseName
        AND t2.ClassRoom = t1.ClassRoom
        AND NOT EXISTS (SELECT 1 FROM tableA t3
            WHERE t3.EndTime < t2.EndTime 
            AND t3.EndTime > t1.EndTime
            AND (t3.CourseName <> t2.CourseName 
            OR t3.ClassRoom <> t2.ClassRoom)
        )
    ) EndTime
    FROM tableA t1
) t0 GROUP BY t0.ClassRoom, t0.CourseName, t0.EndTime

http://www.sqlfiddle.com/#!6/39d4b/9



回答2:

If you're using SQLServer 2012 or better you can use LAG to get the previous value of a column, then SUM() OVER (ORDER BY ...) to create a rolling sum, in this case one that count the change of the CourseName, that can be used as the GROUP BY anchor

With A AS (
  SELECT ClassRoom
       , CourseName
       , StartTime
       , EndTime
       , PrevCourse = LAG(CourseName, 1, CourseName) OVER (ORDER BY StartTime)
  FROM   Table1
), B AS (
  SELECT ClassRoom
       , CourseName
       , StartTime
       , EndTime
       , Ranker = SUM(CASE WHEN CourseName = PrevCourse THEN 0 ELSE 1 END)
                OVER (ORDER BY StartTime, CourseName)
  FROM   A
)
SELECT ClassRoom
     , CourseName
     , MIN(StartTime) StartTime
     , MAX(EndTime) EndTime
FROM   B
GROUP BY ClassRoom, CourseName, Ranker
ORDER BY StartTime

SQLFiddle demo



回答3:

     CREATE TABLE Classroom(Classroom VARCHAR(100), CourseName  VARCHAR(100),  Lesson    VARCHAR(100), StartTime  VARCHAR(100), EndTime  VARCHAR(100))
 INSERT INTO Classroom
SELECT '1001','Course 1','Lesson 1 ','0800','0900'
UNION SELECT '1001','Course 1','Lesson 2 ','0900','1000'
UNION SELECT '1001','Course 1','Lesson 3 ','1000','1100'
UNION SELECT '1001','Course 2','Lesson 10','1100','1200'
UNION SELECT '1001','Course 2','Lesson 11','1200','1300'
UNION SELECT '1001','Course 1','Lesson 4 ','1300','1400'
UNION SELECT '1001','Course 1','Lesson 5 ','1400','1500'

SELECT * FROM Classroom

;WITH CTE_ClassRooms AS (
SELECT *,ROW_NUMBER() over(partition by classroom,CourseName order by StartTime) AS R FROM Classroom A 
WHERE NOT EXISTS(SELECT 1 FROM Classroom B WHERE B.Classroom = A.Classroom AND B.CourseName = A.CourseName AND B.StartTime = A.EndTime)
UNION ALL
SELECT B.*,R fROM CTE_ClassRooms A JOIN Classroom B ON B.Classroom = A.Classroom AND B.CourseName = A.CourseName AND A.StartTime = B.EndTime
)

--SELECT * FROM CTE_ClassRooms order by Classroom,CourseName,R

SELECT Classroom,CourseName,MIN(StartTime),MAX(EndTime)
FROM CTE_ClassRooms
GROUP BY Classroom,CourseName,R