I have the following table ordered by Id, Year DESC
Id Year Valid
1 2011 1
1 2010 1
1 2009 0
1 2002 1
4 2013 1
4 2012 1
4 2011 1
etc.
What I would like to have is an extra rank field like:
Id Year Valid Rank
1 2011 1 1
1 2010 1 1
1 2009 0 2
1 2002 1 3
4 2013 1 1
4 2012 1 1
4 2011 1 1
etc.
Basically per Id alternating ranks for every change in the Valid field. In such a way I can query on the rank=1 field to have all Valid=1 records for each and every Id up to the first Valid=0. Or is there an easier way to select the first top number of records matching a certain condition (For Id=1 only the first two records). I already played with ROW_NUMBER(), RANK() and PARTITION BY but I can't seem to get it to work. Must avoid nesting of queries since the actual query is run against a large database.
Anyone any ideas?
Thanks and cheers,
Nyquist
yes, using Left JOIN we can do that.
see the below code and result.
first image is actual data inserted and second image is expected result.
DECLARE @t TABLE
(
id INT
,_YEAR INT
,valid TINYINT
)
INSERT INTO @t( id, [_YEAR], valid )
SELECT 1,2011,1
UNION ALL SELECT 1,2010,1
UNION ALL SELECT 1,2009,0
UNION ALL SELECT 1,2002,1
UNION ALL SELECT 4,2013,1
UNION ALL SELECT 4,2012,1
UNION ALL SELECT 4,2011,1
UNION ALL SELECT 5,2013,0
UNION ALL SELECT 5,2011,1
UNION ALL SELECT 5,2010,1
UNION ALL SELECT 6,2010,1
UNION ALL SELECT 6,2011,0
UNION ALL SELECT 6,2014,1
SELECT q1.*
FROM @t q1
LEFT JOIN
(
SELECT id,MAX(_YEAR) ZeroYear
FROM @t
WHERE valid = 0
GROUP BY id
)q2
ON q1.id=q2.id
WHERE
(q2.ID IS NULL)
OR
(q2.id IS NOT NULL AND q1.id IS NOT NULL AND q1.id=q2.id AND q1.[_YEAR] > q2.ZeroYear)
Edit-1:
In above query for the column ZeroYear, previously i did MIN(_YEAR) but as you can see in the comment from "Andriy M" instead of MIN right function is MAX.
This is somewhat similar to @Anup Shah's suggestion but doesn't use a join and instead uses a window aggregate function:
WITH derived AS (
SELECT
Id,
Year,
Valid,
LatestInvalidYear = ISNULL(
MAX(CASE Valid WHEN 0 THEN Year END) OVER (PARTITION BY Id),
0
)
FROM atable
)
SELECT
Id,
Year,
Valid
FROM derived
WHERE Year > LatestInvalidYear
;
Basically, the window MAX calculates the latest Valid = 0
year per Id
. If no such year is found, MAX results in a NULL, which is replaced with a 0 by ISNULL. So, for your example, the derived
set would be returned as this:
Id Year Valid LatestInvalidYear
-- ---- ----- -----------------
1 2011 1 2009
1 2010 1 2009
1 2009 0 2009
1 2002 1 2009
4 2013 1 0
4 2012 1 0
4 2011 1 0
Obviously, you can now easily apply the filter Year > LatestInvalidYear
to get the required rows, and that is exacly what the main SELECT does.
If you're using SQL 2012, you can use lag
select id, year, valid,
case when ch = 0 then 1 else lag(ch,1,0) over (order by id, year desc) + 2 end rank
from
(
select
* ,
abs(valid - lag(valid,1,1) over (order by id, year desc)) as ch
from YourTable
) t