(PostgreSQL 8.4) Table "trackingMessages" stores tracking events between mobile devices (tm_nl_mobileid) and fixed devices (tm_nl_fixedId).
CREATE TABLE trackingMessages
(
tm_id SERIAL PRIMARY KEY, -- PK
tm_nl_mobileId INTEGER, -- FK to mobile
tm_nl_fixedId INTEGER, -- FK to fixed
tm_date INTEGER, -- Network time
tm_messageType INTEGER, -- 0=disconnect, 1=connect
CONSTRAINT tm_unique_row
UNIQUE (tm_nl_mobileId, tm_nl_fixedId, tm_date, tm_messageType)
);
Problem here is that it's possible that the same mobile will connect to the same fixed twice (or more times) subsequently. I don't want to see the subsequent ones, but it's OK to see a mobile connected to a same fixed at a later date, provided there was a connection to a different fixed in between.
I think I'm close but not quite. I've been using the following CTE (found here on Stack Overflow)
WITH cte AS
(
SELECT tm_nl_fixedid, tm_date, Row_number() OVER (
partition BY tm_nl_fixedid
ORDER BY tm_date ASC
) RN
FROM trackingMessages
)
SELECT * FROM cte
WHERE tm_nl_mobileid = 150 AND tm_messagetype = 1
ORDER BY tm_date;
Gives me the following results
32;1316538756;1
21;1316539069;1
32;1316539194;2
32;1316539221;3
21;1316539235;2
The problem here is that the last column should be 1, 1, 1, 2, 1, because that third "32" is in fact a duplicate tracking event (twice in a row at the same fixed) and that last connection to "21" is OK because "32" was in between.
Please don't suggest a cursor, this is what I am currently trying to move away from. The cursor solution does work, but it's too slow given the amount of records I have to deal with. I'd much rather fix the CTE and only select where RN = 1
... unless you have a better idea!
Well, you're not that close because
row_number()
cannot track sequences by two groups at the same time.PARTITION BY tm_nl_fixedid ORDER BY date RESTART ON GAP
does not exist, there's no such thing.Itzik Ben-Gan has a solution for the islands and gaps problem you are facing (several solutions, actually). The idea is to order rows by the main criteria (date) and then by partitioning criteria + main criteria. Difference between ordinals will remain the same as they belong to the same partitioning criteria and date series.
Here is Sql Fiddle with example.
And here is chapter 5 of Sql Server MVP Deep Dives with several solutions to islands and gaps problem.
This should be simpler with the window function
lag()
:Explain
In the CTE,
lag()
gets the last fixed device to which a mobile connected (NULL
for the first row per mobile - that's why I useIS DISTINCT FROM
later, see a different approach here).Then simply exclude all rows where the last fixed device was the same as this one, thereby excluding all "subsequent ones". All done.