Eliminate row that has null value using CTE

2019-09-05 04:00发布

I have some data as

RowIdentifier   ID  RowID   Position    Data    Rn
1             1      1       a1          A1      1
2             1      2       a2          A2      1
3             1      3       a3          NULL    1
4             1      4       a3          A3      2
5             1      1       b1          B1      1
6             1      2       b2          NULL    1
7             1      3       b2          B2      2
8             1      4       b3          B3      1

The desired output being

ID  RowID  Position  Data
1     1     a1       A1
1     1     b1       B1
1     2     a2       A2
1     2     b2       B2
1     3     a3       A3
1     3     b3       B3

I need to eliminate those rows where the Positions are duplicate and whose data are null. i.e. in the example, in RowIdentifier 3 and 4, the value in Position column is a3 but the thired RowIdentifier record will not appear in the final output as it has null in Data column.

The ddl is as under

Declare @t table(RowIdentifier int identity,ID int,RowID int,Position varchar(10),Data varchar(10),Rn int)
    Insert into @t
    Select 1,1,'a1','A1',1 union all
    Select 1,2,'a2','A2',1 union all
    Select 1,3,'a3',null,1 union all
    Select 1,4,'a3','A3',2 union all
    Select 1,1,'b1','B1',1 union all
    Select 1,2,'b2',null,1 union all
    Select 1,3,'b2','B2',2 union all
    Select 1,4,'b3','B3',1 

    Select * from @t

My approach is as under

;with cte as(
Select ID,RowID,Position,Position as p2,Data,RowIdentifier from @t
union all
select c4.ID,c4.RowID,c4.Position,c5.Position , c4.Data,c4.RowIdentifier
from cte c5
join @t c4 on c4.Position = c5.Position
where c5.RowIdentifier < c4.RowIdentifier
 )
 ,
 cte2 as(
 select * , rn = Row_Number() over(PARTITION by position order by RowIdentifier)
 from cte where Data is not null)

select ID,RowID,Position,Data from cte2 where rn =1

But not working as per the expected output. My output is

ID  RowID   Position    Data
1   1   a1  A1
1   2   a2  A2
1   4   a3  A3
1   1   b1  B1
1   3   b2  B2
1   4   b3  B3

Help needed

Thanks

2条回答
狗以群分
2楼-- · 2019-09-05 04:58

It's a simple aggregate after a filter

SELECT
   ID, MIN(RowID) AS RowID, POSITION, Data
FROM
   @t
WHERE
   Data IS NOT NULL
GROUP BY
   ID, RowID, POSITION, Data
ORDER BY
   POSITION, RowID

This preserves RowID from the original dataset which would be more correct

3             1      3       a3          NULL    1
4             1      4       a3          A3      2   --take this row
查看更多
太酷不给撩
3楼-- · 2019-09-05 05:06

Try this code

Select 
    ID,
    dense_rank() over(order by substring(data,2,len(data))*1) as rowid,
    position,
    data 
from 
    @t 
where 
    data is not null
group by 
    ID,RowID,position,data
查看更多
登录 后发表回答