I have a dataset that looks like this:
I need to identify the rows that have Linked set to 1 but ONLY where they are together when sorted by ToDate descending as in the picture.
In other words I want to be able to identify these records (EDITED):
This is a simplified dataset, in fact there will be many more records...
The logic that defines whether a record is linked is if the FromDate of a record is within 8 weeks of the ToDate of the preceeding date...but this is testData so may not be perfect
What's the best way to do that please?
You can use LAG()
and LEAD()
analytic functions:
SELECT * FROM (
SELECT t.*,
LAG(t.linked,1,0) OVER(ORDER BY t.FromDate DESC) as rnk_1, --Next one
LEAD(t.linked,1,0) OVER(ORDER BY t.FromDate DESC) as rnk_2, -- Last one,
LEAD(t.linked,2,0) OVER(ORDER BY t.FromDate DESC) as rnk_3 -- Last two,
FROM YourTable t) s
WHERE ((s.rnk_1 = 1 OR s.rnk_2 = 1) AND s.linked = 1) OR
(s.rnk_2 = 1 and s.rnk_3 = 1 and s.linked = 0)
ORDER BY s.FromDate DESC
This will result in records that have linked = 1
and the previous/next record is also 1.
Using LAG
and LEAD
functions you can examine the previous/next row values given a sort criteria.
You can achieve your required dataset using the following DDL:
;
WITH CTE_LagLead
AS (
SELECT FromDate,
ToDate,
NoOfDays,
Weeks,
Linked,
LAG(Linked, 1, 0) OVER (ORDER BY ToDate DESC) LinkedLag,
LEAD(Linked, 1, 0) OVER (ORDER BY ToDate DESC) LinkedLead
FROM @table
)
SELECT FromDate,
ToDate,
NoOfDays,
Weeks,
Linked
FROM CTE_LagLead
WHERE Linked = 1 AND
(LinkedLag = 1 OR
LinkedLead = 1)
ORDER BY ToDate DESC;
See working example
here is the answer I came up with:
Select
*
from
#tmpAbsences
where
idcol between 1 AND (
Select TOP 1 idcol from #tmpAbsences where Linked=0)
this includes the row 7 in the below picture: