I just can't seem to get this query figured out. I need to combine rows of time-consecutive states into a single state.
This question is similar to the question found here except I am working with Oracle 10 not SQL Server: Combine rows when the end time of one is the start time of another
Example data:
name start_inst end_inst code subcode
Person1 9/12/2011 10:55 9/12/2011 11:49 161 50
Person1 9/12/2011 11:49 9/12/2011 11:55 107 28
Person1 9/12/2011 11:55 9/12/2011 12:07 161 50
Person1 9/12/2011 12:07 9/12/2011 12:26 161 50
Person1 9/12/2011 12:26 9/12/2011 12:57 161 71
Person1 9/12/2011 12:57 9/12/2011 13:07 161 71
Person1 9/12/2011 13:07 9/12/2011 13:20 52 50
And I would like to get the following output:
name start_inst end_inst code subcode
Person1 9/12/2011 10:55 9/12/2011 11:49 161 50
Person1 9/12/2011 11:49 9/12/2011 11:55 107 28
Person1 9/12/2011 11:55 9/12/2011 12:26 161 50
Person1 9/12/2011 12:26 9/12/2011 13:07 161 71
Person1 9/12/2011 13:07 9/12/2011 13:20 52 50
Here is example SQL:
CREATE TABLE Data (
name varchar2(132 BYTE) not null,
start_inst DATE not null,
end_inst DATE not null,
code number(3) not null,
subcode number(3) not null
);
INSERT INTO Data(name,start_inst,end_inst, code, code2) VALUES('Person1','9/12/2011 10:55','9/12/2011 11:49',161, 50);
INSERT INTO Data(name,start_inst,end_inst, code, code2) VALUES('Person1','9/12/2011 11:49','9/12/2011 11:55',107,28);
INSERT INTO Data(name,start_inst,end_inst, code, code2) VALUES('Person1','9/12/2011 11:55','9/12/2011 12:07',161,50);
INSERT INTO Data(name,start_inst,end_inst, code, code2) VALUES('Person1','9/12/2011 12:07','9/12/2011 12:26',161,50);
INSERT INTO Data(name,start_inst,end_inst, code, code2) VALUES('Person1','9/12/2011 12:26','9/12/2011 12:57',161,71);
INSERT INTO Data(name,start_inst,end_inst, code, code2) VALUES('Person1','9/12/2011 12:57','9/12/2011 13:07',161,71);
INSERT INTO Data(name,start_inst,end_inst, code, code2) VALUES('Person1','9/12/2011 13:07','9/12/2011 13:20',52,50);
Thanks in advance!
Maybe this? (I don't have a SQL machine to run it on)
Here is a solution using a recursive query instead of analytic functions (as suggested by @wildplasser):
The
connect by
clause in the innermost query causes the data to be returned in a hierarchical fashion.connect_by_root
gives us the value at the root of each branch. Because we don't have a good candidate for astart with
clause, we'll get all child rows (whereend_inst
equals another row'sstart_inst
and all other columns are the same) multiple times: once as a root and once (or more) as a branch. Taking themin
of the root eliminates these extra rows while giving us a value to group on in the outer query.In the outer query, we perform another
group by
to consolidate the rows. The difference is that, in this case, we haveroot_start
there as well to identify which rows are consecutive and therefore need to be consolidated.You can do that with a recursive query (something with CONNECT BY / PRIOR in oracle, IIRC) I did the same thing for Postgres in this thread : Get total time interval from multiple rows if sequence not broken
It might need a bit of reworking to make it fit into the oracle syntax.
Here's another approach:
Basically:
For each row, subquery A finds the previous end time for the given name, code, and subcode.
For each row, subquery B calculates the "group number" -- a running count of preceeding rows (in order of start_inst, name, code, and subcode) where the previous end time calculated in Step 1 is not equal to the start time.
The outer query aggregates by group number.
For better or worse, this approach, unlike @stevo's, will create a new "group" if there's a "gap" between the end time of one record and the start time of the next. For example, if you were to create a gap between 12:57 and 13:00 like this...
...the query above would return two rows like this...
...whereas @stevo's query would return one row like this...
Hope this helps.
adapting desm's query, I think this should work