How to delete duplicate rows in sql server?

How can I delete duplicate rows where no unique row id exists?

My table is

col1  col2 col3 col4 col5 col6 col7
john  1    1    1    1    1    1 
john  1    1    1    1    1    1
sally 2    2    2    2    2    2
sally 2    2    2    2    2    2

I want to be left with the following after the duplicate removal:

john  1    1    1    1    1    1
sally 2    2    2    2    2    2

I've tried a few queries but i think they depend on a row id as I don't get desired result. For example:

DELETE FROM table WHERE col1 IN (
    SELECT id FROM table GROUP BY id HAVING ( COUNT(col1) > 1 )
)

标签： sql sql-server-2008 tsql duplicates sql-delete

15条回答

孤独寂梦人

2楼-- · 2018-12-31 10:16

DELETE from search
where id not in (
   select min(id) from search
   group by url
   having count(*)=1

   union

   SELECT min(id) FROM search
   group by url
   having count(*) > 1
)

0人赞添加讨论(0) 举报

不流泪的眼

3楼-- · 2018-12-31 10:16

Please see the below way of deletion too.

Declare @table table
(col1 varchar(10),col2 int,col3 int, col4 int, col5 int, col6 int, col7 int)
Insert into @table values 
('john',1,1,1,1,1,1),
('john',1,1,1,1,1,1),
('sally',2,2,2,2,2,2),
('sally',2,2,2,2,2,2)

Created a sample table named @table and loaded it with given data.

Delete  aliasName from (
Select  *,
        ROW_NUMBER() over (Partition by col1,col2,col3,col4,col5,col6,col7 order by col1) as rowNumber
From    @table) aliasName 
Where   rowNumber > 1

Select * from @table

Note: If you are giving all columns in the Partition by part, then order by do not have much significance.

I know, the question is asked three years ago, and my answer is another version of what Tim has posted, But posting just incase it is helpful for anyone.

0人赞添加讨论(0) 举报

笑指拈花

4楼-- · 2018-12-31 10:17

Oh wow, i feel so stupid by ready all this answers, they are like experts' answer with all CTE and temp table and etc.

And all I did to get it working was simply aggregated the ID column by using MAX.

DELETE FROM table WHERE col1 IN (
    SELECT MAX(id) FROM table GROUP BY id HAVING ( COUNT(col1) > 1 )
)

NOTE: you might need to run it multiple time to remove duplicate as this will only delete one set of duplicate rows at a time.

0人赞添加讨论(0) 举报

深知你不懂我心

5楼-- · 2018-12-31 10:20

If you can find number of duplicate rows, for instance you have n duplicate row, then use this command

SET rowcount n-1
DELETE FROM your_table
WHERE (spacial condition)

for more info I suggest this

0人赞添加讨论(0) 举报

若你有天会懂

6楼-- · 2018-12-31 10:27

Another way of removing dublicated rows without loosing information in one step is like following:

delete from dublicated_table t1 (nolock)
join (
    select t2.dublicated_field
    , min(len(t2.field_kept)) as min_field_kept
    from dublicated_table t2 (nolock)
    group by t2.dublicated_field having COUNT(*)>1
) t3 
on t1.dublicated_field=t3.dublicated_field 
    and len(t1.field_kept)=t3.min_field_kept

0人赞添加讨论(0) 举报

姐姐魅力值爆表

7楼-- · 2018-12-31 10:28

I like CTEs and ROW_NUMBER as the two combined allow us to see which rows are deleted (or updated), therefore just change the DELETE FROM CTE... to SELECT * FROM CTE:

WITH CTE AS(
   SELECT [col1], [col2], [col3], [col4], [col5], [col6], [col7],
       RN = ROW_NUMBER()OVER(PARTITION BY col1 ORDER BY col1)
   FROM dbo.Table1
)
DELETE FROM CTE WHERE RN > 1

DEMO (result is different; I assume that it's due to a typo on your part)

COL1    COL2    COL3    COL4    COL5    COL6    COL7
john    1        1       1       1       1       1
sally   2        2       2       2       2       2

This example determines duplicates by a single column col1 because of the PARTITION BY col1. If you want to include multiple columns simply add them to the PARTITION BY:

ROW_NUMBER()OVER(PARTITION BY Col1, Col2, ... ORDER BY OrderColumn)

0人赞添加讨论(0) 举报

1 2 3 下一页

How to delete duplicate rows in sql server?

采纳回答

编辑标签

举报内容

检举类型

检举原因

检举说明(必填)

打开微信“扫一扫”，打开网页后点击屏幕右上角分享按钮

付费偷看金额在0.1-10元之间