I have a table with the columns clientid, startdate and enddate. Date ranges can't overlap for the same clientid.
I would like to merge rows for each client when the dates connect, i.e. when one row's enddate equals the next row's startdate.
The table looks like this:
clientid startdate enddate
1 10.10.2017 12.10.2017
1 12.10.2017 13.10.2017
1 13.10.2017 17.10.2017
1 10.11.2017 17.11.2017
1 17.11.2017 23.11.2017
1 12.12.2017 14.12.2017
2 10.11.2017 15.11.2017
2 01.12.2017 02.12.2017
2 02.12.2017 05.12.2017
The final table should look like this:
clientid startdate enddate
1 10.10.2017 17.10.2017
1 10.11.2017 23.11.2017
1 12.12.2017 14.12.2017
2 10.11.2017 15.11.2017
2 01.12.2017 05.12.2017
Thank you for your help.
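You can use the following logic, combining the lag window function (to flag rows that do not continue the previous range) with sum as an analytic running total (to turn those flags into group numbers):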
select clientid, min(startdate) as startdate, max(enddate) as enddate
  from
  (
    select tt.*,
           sum(grp) over (order by clientid, startdate) sm -- running total of the flags = group number
      from
      (
        with t(clientid, startdate, enddate) as
        (
          select 1, date'2017-10-10', date'2017-10-12' from dual union all
          select 1, date'2017-10-12', date'2017-10-13' from dual union all
          select 1, date'2017-10-13', date'2017-10-17' from dual union all
          select 1, date'2017-11-10', date'2017-11-17' from dual union all
          select 1, date'2017-11-17', date'2017-11-23' from dual union all
          select 1, date'2017-12-12', date'2017-12-14' from dual union all
          select 2, date'2017-11-10', date'2017-11-15' from dual union all
          select 2, date'2017-12-01', date'2017-12-02' from dual union all
          select 2, date'2017-12-02', date'2017-12-05' from dual
        )
        select clientid,
               -- grp = 0 when this row starts on the same client's previous enddate, else 1.
               -- The lag is partitioned by clientid so another client's enddate is never compared.
               -- nvl gives each client's first row 0, which is fine because we also group by clientid.
               decode(nvl(lag(enddate) over
                            (partition by clientid order by startdate), startdate), startdate, 0, 1)
                 as grp,
               row_number() over (order by clientid, enddate) as rn, startdate, enddate
          from t
      ) tt
     order by rn
  )
 group by clientid, sm
 order by clientid, enddate;
CLIENTID STARTDATE ENDDATE
---------- ---------- ----------
1 10.10.2017 17.10.2017
1 10.11.2017 23.11.2017
1 12.12.2017 14.12.2017
2 10.11.2017 15.11.2017
2 01.12.2017 05.12.2017
Rextester Demo
Step by Step Query Execution for better understanding
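For readers who find the nested query hard to follow, the same flag-and-running-total idea can be sketched outside SQL. This is only an illustration of the logic, not part of the original answer: the function name `merge_connected` and the `rows` list are hypothetical, and it assumes each row is a `(clientid, startdate, enddate)` tuple with no overlapping ranges per client.

```python
from datetime import date
from itertools import groupby

# Sample data from the question (hypothetical variable name).
rows = [
    (1, date(2017, 10, 10), date(2017, 10, 12)),
    (1, date(2017, 10, 12), date(2017, 10, 13)),
    (1, date(2017, 10, 13), date(2017, 10, 17)),
    (1, date(2017, 11, 10), date(2017, 11, 17)),
    (1, date(2017, 11, 17), date(2017, 11, 23)),
    (1, date(2017, 12, 12), date(2017, 12, 14)),
    (2, date(2017, 11, 10), date(2017, 11, 15)),
    (2, date(2017, 12, 1), date(2017, 12, 2)),
    (2, date(2017, 12, 2), date(2017, 12, 5)),
]


def merge_connected(rows):
    """Merge rows whose enddate equals the same client's next startdate."""
    tagged = []
    prev = None  # previous row in (clientid, startdate) order, like lag()
    sm = 0       # running total of the flags, like sum(grp) over (...)
    for clientid, start, end in sorted(rows):
        continues = prev is not None and prev[0] == clientid and prev[2] == start
        grp = 0 if continues else 1  # 1 marks the start of a new group
        sm += grp
        tagged.append((clientid, sm, start, end))
        prev = (clientid, start, end)

    merged = []
    # Equivalent of: group by clientid, sm with min(startdate), max(enddate).
    for (clientid, _sm), group in groupby(tagged, key=lambda r: (r[0], r[1])):
        group = list(group)
        merged.append((clientid,
                       min(g[2] for g in group),
                       max(g[3] for g in group)))
    return merged


for row in merge_connected(rows):
    print(row)
```

Each row that does not start where the same client's previous row ended bumps the running total, so every run of connected rows shares one group number; the final step simply takes the earliest start and latest end per group.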
This is SQL Server syntax, using the same methodology as Barbaros's answer. In an attempt to be as purist as possible I tried a self-join instead of LAG, but that admittedly made the query harder to read, so the version below uses LAG.
SELECT clientid, MIN(startdate) AS startdate, MAX(enddate) AS enddate
FROM (SELECT *,
             SUM(CASE WHEN a.enddate_prev = a.startdate THEN 0 ELSE 1 END)
                 OVER (ORDER BY clientid, startdate) sm
      FROM (SELECT clientid, startdate, enddate,
                   LAG(enddate, 1, NULL) OVER (PARTITION BY clientid ORDER BY clientid, enddate) enddate_prev
            FROM client_dates) a) b
GROUP BY clientid, sm
Set up the table like this:
CREATE TABLE client_dates (clientid INT NOT NULL, startdate DATE NOT NULL, enddate DATE NOT NULL);
INSERT INTO client_dates VALUES (1, TRY_PARSE('10.10.2017' AS datetime USING 'en-GB'), TRY_PARSE('12.10.2017' AS datetime USING 'en-GB'));
INSERT INTO client_dates VALUES (1, TRY_PARSE('12.10.2017' AS datetime USING 'en-GB'), TRY_PARSE('13.10.2017' AS datetime USING 'en-GB'));
INSERT INTO client_dates VALUES (1, TRY_PARSE('13.10.2017' AS datetime USING 'en-GB'), TRY_PARSE('17.10.2017' AS datetime USING 'en-GB'));
INSERT INTO client_dates VALUES (1, TRY_PARSE('10.11.2017' AS datetime USING 'en-GB'), TRY_PARSE('17.11.2017' AS datetime USING 'en-GB'));
INSERT INTO client_dates VALUES (1, TRY_PARSE('17.11.2017' AS datetime USING 'en-GB'), TRY_PARSE('23.11.2017' AS datetime USING 'en-GB'));
INSERT INTO client_dates VALUES (1, TRY_PARSE('12.12.2017' AS datetime USING 'en-GB'), TRY_PARSE('14.12.2017' AS datetime USING 'en-GB'));
INSERT INTO client_dates VALUES (2, TRY_PARSE('10.11.2017' AS datetime USING 'en-GB'), TRY_PARSE('15.11.2017' AS datetime USING 'en-GB'));
INSERT INTO client_dates VALUES (2, TRY_PARSE('01.12.2017' AS datetime USING 'en-GB'), TRY_PARSE('02.12.2017' AS datetime USING 'en-GB'));
INSERT INTO client_dates VALUES (2, TRY_PARSE('02.12.2017' AS datetime USING 'en-GB'), TRY_PARSE('05.12.2017' AS datetime USING 'en-GB'));