Merging data in a single SQL table without a Curso

2019-03-09 03:21发布

问题:

I have a table with an ID column and another column with a number. One ID can have multiple numbers. For example

ID | Number
 1 |  25
 1 |  26
 1 |  30
 1 |  24
 2 |  4
 2 |  8
 2 |  5

Now based of this data, in a new table, I want to have this

ID | Low | High 
1  |  24 |  26
1  |  30 |  30
2  |  4  |  5
2  |  8  |  8

As you can see, I want to merge any data where the numbers are consecutive, like 24, 25, 26. So now the low was 24, the high was 26, and then 30 is still a separate range. I am dealing with large amounts of data, so I would prefer to not use a cursor for performance sake (which is what I was previously doing, and was slowing things down quite a bit)...What is the best way to achieve this? I'm no SQL pro, so I'm not sure if there is a function available that could make this easier, or what the fastest way to accomplish this would be.

Thanks for the help.

回答1:

The key observation is that a sequence of numbers minus another sequence is a constant. We can generate another sequence using row_number. This identifies all the groups:

select id, MIN(number) as low, MAX(number) as high
from (select t.*,
             (number - ROW_NUMBER() over (partition by id order by number) ) as groupnum
      from t
     ) t
group by id, groupnum

The rest is just aggregation.



回答2:

Solution with CTE and recursion:

WITH CTE AS (
  SELECT T.ID, T.NUMBER, T.NUMBER AS GRP
  FROM T 
  LEFT OUTER JOIN T T2 ON T.ID = T2.ID AND T.NUMBER -1 = T2.NUMBER 
  WHERE T2.ID IS NULL
  UNION  ALL
  SELECT T.ID, T.NUMBER, GRP
  FROM CTE 
  INNER JOIN T
  ON T.ID = CTE.ID AND T.NUMBER  = CTE.NUMBER + 1
)
SELECT ID, MAX( NUMBER ), MIN(NUMBER)
FROM CTE
GROUP BY ID, GRP

Results at fiddlesql



回答3:

I'd suggest using a WHILE loop structure with a table variable instead of the cursor.

For example,

DECLARE @TableVariable TABLE
(
    MyID int IDENTITY (1, 1) PRIMARY KEY NOT NULL,
    [ID] int,
    [Number] int
)

DECLARE @Count int, @Max int

INSERT INTO @TableVariable (ID, Number)
SELECT ID, Number
FROM YourSourceTable

SELECT @Count = 1, @Max = MAX(MyID)
FROM @TableVariable

WHILE @Count <= @Max
BEGIN

    ...do your processing here...


    SET @Count = @Count + 1

END


回答4:

CREATE TABLE Table1
    ([ID] int, [Number] int)
;

INSERT INTO Table1
    ([ID], [Number])
VALUES
    (1, 25),
    (1, 26),
    (1, 30),
    (1, 24),
    (2, 4),
    (2, 8),
    (2, 5)
;

    select ID, 
           MIN(Number)
         ,(SELECT MIN(Number) 
                  FROM (SELECT TOP 2 Number from Table1 WHERE ID =
                  T1.Id ORDER BY Number DESC) as DT)
    from Table1 as T1
    GROUP BY ID
    UNION 
    SELECT ID, MAX(Number), MAX(Number)
    FROM Table1 as T1
    GROUP BY ID;

Live Example