(PostgreSQL 8.4) Continuing with my previous example, I wish to further my understanding of gaps-and-islands processing with Window-functions. Consider the following table and data:
CREATE TABLE T1
(
id SERIAL PRIMARY KEY,
val INT, -- some device
status INT -- 0=OFF, 1=ON
);
INSERT INTO T1 (val, status) VALUES (10, 0);
INSERT INTO T1 (val, status) VALUES (11, 0);
INSERT INTO T1 (val, status) VALUES (11, 1);
INSERT INTO T1 (val, status) VALUES (10, 1);
INSERT INTO T1 (val, status) VALUES (11, 0);
INSERT INTO T1 (val, status) VALUES (10, 0);
As previously explained, the devices turn ON and OFF and this time I wish to extract a specific sequence:
- show all new
ON
status records that aren't duplicate (same device twice in a row) - show appropriate
OFF
status from currentlyON
device
The closest I could get is this:
SELECT * FROM (
SELECT *
,lag(val, 1, 0) OVER (PARTITION BY status ORDER BY id) last_val
,lag(status, 1, -1) OVER (PARTITION BY val ORDER BY id) last_status
FROM t1
) x
WHERE (last_val <> val OR last_status <> status)
AND (status = 1 OR last_status <> -1)
ORDER BY id
This filters out more bogus data that the sample doesn't include but essentially it's about taking out subsequent duplicates (regardless of status) and the top OFF
records which don't match. Records 3
, 4
, 5
and 6
are returned, but I don't want the fifth, it's an OFF
which came after a new ON
. So I need to jump that gap and look for the next proper OFF
for the currently active device.
- 10 turns OFF -- bogus in this context, but messes up the lag()
- 11 turns OFF -- bogus in this context, but messes up the lag()
- 11 turns ON -- OK, new sequence, include in SELECT
- 10 turns ON -- OK, new sequence, include in SELECT
- 11 turns OFF -- message came in late, need to ignore gap
- 10 turns OFF -- OK, proper OFF to row 4, need to include in SELECT
Once filtered properly, I would like to use lead()
on top of it all to get the next row's id (imagine a timestamp) as well as filter out all records that aren't ON
statuses. I imagine this will require three embedded SELECT statements. This would get me a clear understanding of how long a device was active, until the condition of either another ON
or a proper turn OFF
.