SQL - Merging rows if dates connect

Posted 2019-05-16 22:41

Question:

I have a table with the columns clientid, startdate and enddate. Date ranges can't overlap for the same clientid. I would like to merge rows for each client where the dates connect, i.e. where one row's enddate equals the next row's startdate.

The table looks like this:

clientid  startdate      enddate
1         10.10.2017     12.10.2017
1         12.10.2017     13.10.2017
1         13.10.2017     17.10.2017
1         10.11.2017     17.11.2017
1         17.11.2017     23.11.2017
1         12.12.2017     14.12.2017
2         10.11.2017     15.11.2017
2         01.12.2017     02.12.2017
2         02.12.2017     05.12.2017

The final table should look like this:

clientid  startdate      enddate
1         10.10.2017     17.10.2017
1         10.11.2017     23.11.2017
1         12.12.2017     14.12.2017
2         10.11.2017     15.11.2017
2         01.12.2017     05.12.2017

Thank you for your help.

Answer 1:

You can use logic like the following, combining the sum aggregate with the lag window function:

select clientid, min(startdate) as startdate, max(enddate) as enddate
  from
(
select tt.*, sum(grp) over (order by clientid, startdate) sm 
  from
(
  with t(clientid, startdate, enddate) as
  (
   select 1, date'2017-10-10', date'2017-10-12' from dual union all
   select 1, date'2017-10-12', date'2017-10-13' from dual union all
   select 1, date'2017-10-13', date'2017-10-17' from dual union all  
   select 1, date'2017-11-10', date'2017-11-17' from dual union all  
   select 1, date'2017-11-17', date'2017-11-23' from dual union all  
   select 1, date'2017-12-12', date'2017-12-14' from dual union all
   select 2, date'2017-11-10', date'2017-11-15' from dual union all  
   select 2, date'2017-12-01', date'2017-12-02' from dual union all  
   select 2, date'2017-12-02', date'2017-12-05' from dual
  )
 select clientid, 
        decode(nvl(lag(enddate) over 
                   (partition by clientid order by enddate),startdate),startdate,0,1) 
                   as grp, -- 0 if this client's previous enddate equals startdate (or there is no previous row), else 1
        row_number() over (order by clientid, enddate) as rn, startdate, enddate
    from t
) tt
order by rn
) 
group by clientid, sm 
order by clientid, enddate;

CLIENTID    STARTDATE   ENDDATE
----------  ----------  ----------
1           10.10.2017  17.10.2017
1           10.11.2017  23.11.2017
1           12.12.2017  14.12.2017
2           10.11.2017  15.11.2017
2           01.12.2017  05.12.2017
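
To make the intermediate steps easier to follow, here is a rough Python sketch of the same idea: flag each row with 0 or 1 (the grp column), take a running sum of the flags (the sm column), then take min/max per (clientid, sm). It assumes lag is evaluated per client and that dates are ISO strings so they compare correctly; the function name merge_connected is just made up for illustration.

```python
# Sample rows from the question: (clientid, startdate, enddate) as ISO date strings.
rows = [
    (1, "2017-10-10", "2017-10-12"),
    (1, "2017-10-12", "2017-10-13"),
    (1, "2017-10-13", "2017-10-17"),
    (1, "2017-11-10", "2017-11-17"),
    (1, "2017-11-17", "2017-11-23"),
    (1, "2017-12-12", "2017-12-14"),
    (2, "2017-11-10", "2017-11-15"),
    (2, "2017-12-01", "2017-12-02"),
    (2, "2017-12-02", "2017-12-05"),
]


def merge_connected(rows):
    """Hypothetical helper mimicking the query: flag, running sum, aggregate."""
    ordered = sorted(rows, key=lambda r: (r[0], r[1]))  # order by clientid, startdate
    trace = []
    sm = 0
    prev_client, prev_end = None, None
    for clientid, start, end in ordered:
        # lag(enddate) per client: a client's first row has no previous enddate
        lagged = prev_end if clientid == prev_client else None
        # decode(nvl(lag, startdate), startdate, 0, 1)
        grp = 0 if (lagged is None or lagged == start) else 1
        sm += grp  # sum(grp) over (order by clientid, startdate)
        trace.append((clientid, start, end, grp, sm))
        prev_client, prev_end = clientid, end

    # group by clientid, sm -> min(startdate), max(enddate)
    groups = {}
    for clientid, start, end, _grp, sm_value in trace:
        lo, hi = groups.get((clientid, sm_value), (start, end))
        groups[(clientid, sm_value)] = (min(lo, start), max(hi, end))
    merged = [(key[0], lo, hi) for key, (lo, hi) in sorted(groups.items())]
    return trace, merged


trace, merged = merge_connected(rows)
for line in trace:
    print(line)    # clientid, startdate, enddate, grp, sm
for line in merged:
    print(line)    # clientid, min(startdate), max(enddate)
```

Note that the running sum is not reset per client, so the first row of client 2 shares an sm value with the last island of client 1. That is why the grouping has to be on clientid and sm together, not on sm alone.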


Answer 2:

This is SQL Server syntax, using the same method as Barbaros's answer. In an attempt to be as purist as possible, I tried a self-join instead of LAG, but that admittedly makes the query harder to read.

SELECT clientid, MIN(startdate) AS startdate, MAX(enddate) AS enddate
FROM (SELECT *, SUM(CASE WHEN a.enddate_prev = a.startdate THEN 0 ELSE 1 END) OVER (ORDER BY clientid, startdate) sm
      FROM (SELECT clientid, startdate, enddate,  
                   LAG(enddate, 1, NULL) OVER (PARTITION BY clientid ORDER BY clientid, enddate) enddate_prev
            FROM client_dates) a) b
GROUP BY clientid, sm
ORDER BY clientid, startdate;

Set up the table like this:

CREATE TABLE client_dates (clientid INT NOT NULL, startdate DATE NOT NULL, enddate DATE NOT NULL);

INSERT INTO client_dates VALUES (1, TRY_PARSE('10.10.2017' AS datetime USING 'en-GB'), TRY_PARSE('12.10.2017' AS datetime USING 'en-GB'));
INSERT INTO client_dates VALUES (1, TRY_PARSE('12.10.2017' AS datetime USING 'en-GB'), TRY_PARSE('13.10.2017' AS datetime USING 'en-GB'));
INSERT INTO client_dates VALUES (1, TRY_PARSE('13.10.2017' AS datetime USING 'en-GB'), TRY_PARSE('17.10.2017' AS datetime USING 'en-GB'));
INSERT INTO client_dates VALUES (1, TRY_PARSE('10.11.2017' AS datetime USING 'en-GB'), TRY_PARSE('17.11.2017' AS datetime USING 'en-GB'));
INSERT INTO client_dates VALUES (1, TRY_PARSE('17.11.2017' AS datetime USING 'en-GB'), TRY_PARSE('23.11.2017' AS datetime USING 'en-GB'));
INSERT INTO client_dates VALUES (1, TRY_PARSE('12.12.2017' AS datetime USING 'en-GB'), TRY_PARSE('14.12.2017' AS datetime USING 'en-GB'));
INSERT INTO client_dates VALUES (2, TRY_PARSE('10.11.2017' AS datetime USING 'en-GB'), TRY_PARSE('15.11.2017' AS datetime USING 'en-GB'));
INSERT INTO client_dates VALUES (2, TRY_PARSE('01.12.2017' AS datetime USING 'en-GB'), TRY_PARSE('02.12.2017' AS datetime USING 'en-GB'));
INSERT INTO client_dates VALUES (2, TRY_PARSE('02.12.2017' AS datetime USING 'en-GB'), TRY_PARSE('05.12.2017' AS datetime USING 'en-GB'));
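
The self-join variant mentioned above is not shown in this answer, so here is one possible way it might look, not necessarily what the author had in mind. The idea: a row starts an island if no row of the same client ends on its startdate, a row ends an island if no row of the same client starts on its enddate, and each start is paired with the nearest later end. To keep it runnable without a SQL Server instance, the sketch goes through Python's sqlite3 module with dates stored as ISO text; both of those are assumptions for the example, as are the CTE names island_starts and island_ends. It also assumes startdate < enddate on every row.

```python
import sqlite3

# Sample rows from the question, stored as ISO text so string order = date order.
rows = [
    (1, "2017-10-10", "2017-10-12"),
    (1, "2017-10-12", "2017-10-13"),
    (1, "2017-10-13", "2017-10-17"),
    (1, "2017-11-10", "2017-11-17"),
    (1, "2017-11-17", "2017-11-23"),
    (1, "2017-12-12", "2017-12-14"),
    (2, "2017-11-10", "2017-11-15"),
    (2, "2017-12-01", "2017-12-02"),
    (2, "2017-12-02", "2017-12-05"),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE client_dates ("
    "clientid INTEGER NOT NULL, startdate TEXT NOT NULL, enddate TEXT NOT NULL)"
)
conn.executemany("INSERT INTO client_dates VALUES (?, ?, ?)", rows)

SELF_JOIN_QUERY = """
WITH island_starts AS (      -- rows with no same-client row ending on their startdate
    SELECT cur.clientid, cur.startdate
    FROM client_dates cur
    LEFT JOIN client_dates prev
      ON prev.clientid = cur.clientid AND prev.enddate = cur.startdate
    WHERE prev.clientid IS NULL
),
island_ends AS (             -- rows with no same-client row starting on their enddate
    SELECT cur.clientid, cur.enddate
    FROM client_dates cur
    LEFT JOIN client_dates nxt
      ON nxt.clientid = cur.clientid AND nxt.startdate = cur.enddate
    WHERE nxt.clientid IS NULL
)
SELECT s.clientid, s.startdate, MIN(e.enddate) AS enddate   -- nearest later end
FROM island_starts s
JOIN island_ends e
  ON e.clientid = s.clientid AND e.enddate > s.startdate
GROUP BY s.clientid, s.startdate
ORDER BY s.clientid, s.startdate
"""

merged = conn.execute(SELF_JOIN_QUERY).fetchall()
for row in merged:
    print(row)
```

This avoids window functions entirely, at the cost of two extra joins, which is probably part of why the LAG version reads more easily.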