SQL: Group By on Consecutive Records

2019-01-14 17:53发布

问题:

A slightly tricky SQL question (we are running SQL server 2000).

I have the following table, StoreCount -

WeekEndDate    StoreCount
2010-07-25     359
2010-07-18     359
2010-07-11     358
2010-07-04     358
2010-06-27     358
2010-06-20     358
2010-06-13     358
2010-06-06     359
2010-05-30     360
2010-05-23     360
2010-05-16     360

I want to turn this into the following output -

StartDate    EndDate       StoreCount
2010-07-18   2010-07-25    359
2010-06-13   2010-07-11    358
2010-06-06   2010-06-06    359
2010-05-16   2010-05-30    360

As you can see, I'm wanting to group the store counts, by only as they run in sequence together.

回答1:

Here's a kick at the can, only it may have syntax not available in SS2k. It was actually written on Oracle as I don't have that version of SS around anymore. The only catch might be the the select of a select...(it's been a while since I've used SS2k, so it's hard to remember what features weren't available back then.)

select min(weekenddate) as start_date, end_date, storecount
from (
select s1.weekenddate
     , (select max(weekenddate)
          from store_stats s2
         where s2.storecount = s1.storecount
           and not exists (select null
                             from store_stats s3
                            where s3.weekenddate < s2.weekenddate
                              and s3.weekenddate > s1.weekenddate
                              and s3.storecount <> s1.storecount)
       ) as end_date
     , s1.storecount
from store_stats s1
) result
group by end_date, storecount
order by 1 desc


START_DATE END_DATE   STORECOUNT
---------- ---------- ----------
2010-07-18 2010-07-25        359
2010-06-13 2010-07-11        358
2010-06-06 2010-06-06        359
2010-05-16 2010-05-30        360


回答2:

Use cursor. I don't know how to do it in sql2k by using query.

DECLARE @w datetime
DECLARE @s int
DECLARE @prev_s int
DECLARE @start_w datetime
DECLARE @end_w datetime

CREATE TABLE #zz(start datetime, [end] datetime, StoreCount int)

DECLARE a_cursor CURSOR
FOR SELECT WeekEndDate, StoreCount FROM Line ORDER BY WeekEndDate DESC, StoreCount
OPEN a_cursor
FETCH NEXT FROM a_cursor INTO @w, @s
WHILE @@FETCH_STATUS = 0
BEGIN

    IF @end_w IS NULL 
    BEGIN

        SET @end_w = @w
        SET @start_w = @w     
        SET @prev_s = @s
    END 
    ELSE IF @prev_s <> @s
    BEGIN
       INSERT INTO #zz values(@start_w,  @end_w, @prev_s)

       SET @end_w = @w  
       SET @start_w = @w  
       SET @prev_s = @s 
    END ELSE
        SET @start_w = @w 

    FETCH NEXT FROM a_cursor INTO @w, @s
END
-- add last one
INSERT INTO #zz values(@start_w, @end_w, @prev_s)

CLOSE a_cursor
DEALLOCATE a_cursor

SELECT * FROM #zz ORDER BY 1 DESC
DROP TABLE #zz


回答3:

Ok, here's my go at it.

DECLARE @curDate DATETIME = (SELECT MIN(WeekEndDate) FROM Table_1);
    DECLARE @curCount INT = (SELECT StoreCount FROM Table_1 WHERE WeekEndDate = @curDate);
    DECLARE @sDate DATETIME = GETDATE()
    DECLARE @eDate DATETIME = 0


    WHILE @eDate < (SELECT MAX(WeekEndDate) FROM Table_1)
    BEGIN
        SELECT @sDate = (SELECT WeekEndDate AS StartDate FROM Table_1 WHERE WeekEndDate = @curDate) -- SELECT START DATE

        -- NOW GET THE END DATE IN THIS GROUP
        DECLARE @d1 DATETIME = @curDate
        DECLARE @d2 DATETIME = @curDate
        DECLARE @loop INT = 1
        WHILE @loop = 1
            BEGIN
                IF ((SELECT StoreCount FROM Table_1 WHERE WeekEndDate = @d1) <> @curCount OR @d1 = (SELECT MAX(WeekEndDate) FROM Table_1)) BEGIN
                    SELECT @eDate = (SELECT TOP(1) WeekEndDate FROM Table_1 WHERE StoreCount = @curCount AND WeekEndDate = @d2 ORDER BY WeekEndDate DESC)
                    SELECT @loop = 0 END
                ELSE BEGIN
                    SELECT @d2 = @d1 
                    SELECT @d1 = (SELECT TOP(1) WeekEndDate FROM Table_1 WHERE WeekEndDate > @d1 ORDER BY WeekEndDate) END
            END


        SELECT @sDate AS StartDate, @eDate AS EndDate, @curCount AS StoreCount   -- DO QHATEVER YOU NEED TO DO WITH THE RECORDS HERE

        SELECT TOP(1) @curDate = WeekEndDate, @curCount = StoreCount
        FROM Table_1
        WHERE WeekEndDate > @eDate
        GROUP BY WeekEndDate, StoreCount
        ORDER BY WeekEndDate ASC

    END


回答4:

I'm not sure how to explain this, but it seems to give the desired result for the small dataset given. In essence, it detects the points in the series where the values change.

I haven't looked at the query plan, might be painful.

Tried on a Sybase server, so syntax should be compatible with SQL Server 2K.

SELECT  x.StartDate
        , MIN( y.EndDate ) AS EndDate
        , x.StoreCount
FROM
( SELECT
        wed1.WeekEndDate AS StartDate
        , wed1.StoreCount
FROM
        wed wed1
LEFT JOIN
        wed wed2
ON      wed1.WeekEndDate = DATEADD( DAY, 7, wed2.WeekEndDate )
WHERE
        wed1.StoreCount != ISNULL( wed2.StoreCount, wed1.StoreCount - 1 )
) x,
( SELECT
        wed1.WeekEndDate AS EndDate
FROM
        wed wed1
LEFT JOIN
        wed wed2
ON      wed1.WeekEndDate = DATEADD( DAY, -7, wed2.WeekEndDate )
WHERE
        wed1.StoreCount != ISNULL( wed2.StoreCount, wed1.StoreCount - 1 )
) y
WHERE
        y.EndDate >= x.StartDate
GROUP BY
        x.StartDate
HAVING
        x.StartDate = MIN( x.StartDate )
ORDER BY
        1 DESC

StartDate    EndDate      StoreCount
------------ ------------ -----------
 Jul 18 2010  Jul 25 2010         359
 Jun 13 2010  Jul 11 2010         358
 Jun  6 2010  Jun  6 2010         359
 May 16 2010  May 30 2010         360


回答5:

try this simple solution:

create table x  (weekEndDate char(10), storeCount int);
insert into x values
('2010-07-25',359),
('2010-07-18',359),
('2010-07-11',358),
('2010-07-04',358),
('2010-06-27',358),
('2010-06-20',358),
('2010-06-13',358),
('2010-06-06',359),
('2010-05-30',360),
('2010-05-23',360),
('2010-05-16',360);
select min(weekenddate) as startdate, max(weekenddate) as enddate, min(storecount) as storecount 
from 
(select weekenddate, storecount, concat(row_number() over (order by weekenddate) -row_number() over (partition by storecount order by weekenddate),'|',storecount) as groupkey from x) w
group by groupkey order by startdate desc;

sqlfiddle