SQL Server Duplicate Records

Posted 2019-10-19 07:33

Hello, I have written the following query:

UPDATE [dbo].[TestData]
SET Duplicate = 'Duplicate within'
WHERE exists 
(SELECT telephone, COUNT(telephone)
FROM [dbo].[TestData]
GROUP BY telephone
HAVING (COUNT (telephone)>1))

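The table actually contains 9 duplicated telephone numbers.

This query stamps the entire Duplicate column with 'Duplicate within' instead of only the duplicate records.

I also wrote a second query to unstamp 9 of the 18 duplicate records: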

UPDATE [dbo].[TestData]
SET Duplicate = 'NO'
WHERE ID IN (SELECT MIN(ID) FROM [dbo].[TestData] GROUP BY telephone)

This query does not work either. Can anyone please show me where I am going wrong?

Answer 1:

You can do this with WHERE EXISTS, but it is easier to read and write with IN, and the performance difference is likely to be minimal.

update TestData set 
    Duplicate = 'Duplicate within'
where 
    Telephone in (
        select Telephone 
        from TestData 
        group by Telephone 
        having count(*) > 1
    )
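
Before running the update, it can help to preview which rows it would touch. The following is a minimal sketch, assuming the table has the ID and Telephone columns used in the question:

```sql
-- Preview only: lists every row whose telephone number occurs more than once.
-- Assumes dbo.TestData has ID and Telephone columns, as in the question.
select ID, Telephone
from dbo.TestData
where Telephone in (
    select Telephone
    from dbo.TestData
    group by Telephone
    having count(*) > 1
)
order by Telephone, ID;
```

If the data matches the question's description, this should list the 18 rows that share a telephone number with another row.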

To leave the first record for each telephone number alone and mark only the subsequent records with the same number, use a CTE like this:

;with NumberedDupes as (
    select
        Telephone,
        Duplicate,
        -- order by ID so that "first" means the lowest ID within each number
        row_number() over (partition by Telephone order by ID) seq
    from TestData
)
update NumberedDupes set Duplicate = 'Duplicate within' where seq > 1
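
To check the CTE approach without touching real data, here is a small self-contained sketch. The table variable and the sample telephone numbers are hypothetical, and the rows are ordered by ID so that the "first" record in each group is the one with the lowest ID:

```sql
-- Hypothetical sample data in a table variable; column names mirror the question.
declare @TestData table (
    ID int primary key,
    Telephone varchar(20),
    Duplicate varchar(20) null
);

insert into @TestData (ID, Telephone)
values (1, '555-0100'), (2, '555-0100'),
       (3, '555-0101'),
       (4, '555-0102'), (5, '555-0102'), (6, '555-0102');

-- Number the rows within each telephone number, lowest ID first.
;with NumberedDupes as (
    select
        Duplicate,
        row_number() over (partition by Telephone order by ID) as seq
    from @TestData
)
update NumberedDupes set Duplicate = 'Duplicate within' where seq > 1;

-- Rows 2, 5 and 6 are marked; rows 1, 3 and 4 keep Duplicate = NULL.
select ID, Telephone, Duplicate from @TestData order by ID;
```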


Answer 2:

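The problem is that the EXISTS subquery is not correlated with the outer query, so it is true for every row. It needs to be filtered by each row's telephone number: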

UPDATE t
SET Duplicate = 'Duplicate within'
FROM [dbo].[TestData] t
WHERE EXISTS (
    SELECT telephone, COUNT(telephone)
    FROM [dbo].[TestData]
    WHERE telephone = t.telephone
    GROUP BY telephone
    HAVING (COUNT (telephone)>1)
)
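
The difference between the two forms can be seen with plain SELECT statements. This is only an illustrative sketch using a hypothetical table variable and sample numbers:

```sql
-- Hypothetical sample data: one duplicated number and one unique number.
declare @TestData table (ID int primary key, Telephone varchar(20));

insert into @TestData (ID, Telephone)
values (1, '555-0100'), (2, '555-0100'), (3, '555-0101');

-- Uncorrelated: the subquery finds a duplicated number somewhere in the table,
-- so EXISTS is true for every outer row. Returns IDs 1, 2 and 3.
select ID
from @TestData
where exists (
    select Telephone
    from @TestData
    group by Telephone
    having count(*) > 1
);

-- Correlated: the subquery only counts rows with the outer row's number.
-- Returns IDs 1 and 2.
select t.ID
from @TestData t
where exists (
    select 1
    from @TestData d
    where d.Telephone = t.Telephone
    group by d.Telephone
    having count(*) > 1
);
```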


Answer 3:

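If you just want to flag the duplicate, you need to pick only one of the two records, as the subselect below does. The EXISTS approach will actually update both rows, because that is what it tests for.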

    UPDATE [dbo].[TestData]
    SET Duplicate = 'Duplicate within'
    WHERE Id IN  
    (SELECT MAX(Id)
    FROM [dbo].[TestData]
    GROUP BY telephone
    HAVING (COUNT (telephone)>1))
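
Note that MAX(Id) selects a single row per duplicated number, so a number that appears three or more times would have only one of its rows marked. If that matters, one possible variant (a sketch, not part of the original answer) marks every row except the lowest Id in each group:

```sql
-- Marks all rows except the first (lowest Id) for each telephone number.
-- Assumes Id is unique and never NULL. Rows with a NULL telephone are
-- grouped together by GROUP BY, so they are treated as one number here.
UPDATE [dbo].[TestData]
SET Duplicate = 'Duplicate within'
WHERE Id NOT IN
    (SELECT MIN(Id)
     FROM [dbo].[TestData]
     GROUP BY telephone);
```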


Source: SQL Server Duplicate Records