Delete multiple duplicate rows in table

I have multiple groups of duplicates in one table (3 records for one value, 2 for another, etc.), i.e. several values for which more than one row exists.

Below is what I came up with to delete them, but I have to run the script for however many duplicates there are:

set rowcount 1
delete from Table
where code in (
  select code from Table 
  group by code
  having (count(code) > 1)
)
set rowcount 0

This works well to a degree. I need to run this for every group of duplicates, and then it only deletes 1 (which is all I need right now).

Tags: sql sql-server tsql

4 Answers

混吃等死

#2 · 2020-07-17 06:32

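-- Limit each DELETE to a single row
SET ROWCOUNT 1
DELETE a
FROM Table a
WHERE (SELECT COUNT(*) FROM Table b WHERE b.Code = a.Code) > 1
-- Repeat until no Code value has more than one row left
WHILE @@ROWCOUNT > 0
  DELETE a
  FROM Table a
  WHERE (SELECT COUNT(*) FROM Table b WHERE b.Code = a.Code) > 1
-- Restore the default (no row limit)
SET ROWCOUNT 0

This deletes duplicate rows one at a time until each Code value has a single row left. You can add more columns to the comparison if duplicates should be matched on them as well.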
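Microsoft documents SET ROWCOUNT as deprecated for DELETE, INSERT, and UPDATE, so the same loop can also be sketched with DELETE TOP (1). This is only an illustration: #Demo and its rows are made up here to stand in for the question's table, and Code is the column from the question.

```sql
-- Assumption: #Demo is a hypothetical stand-in for the question's table.
CREATE TABLE #Demo (Code varchar(10) NOT NULL);
INSERT INTO #Demo (Code) VALUES ('A'), ('A'), ('A'), ('B'), ('B'), ('C');

WHILE 1 = 1
BEGIN
    -- Remove a single row whose Code still occurs more than once
    DELETE TOP (1) a
    FROM #Demo AS a
    WHERE (SELECT COUNT(*) FROM #Demo AS b WHERE b.Code = a.Code) > 1;

    -- Stop once a pass finds nothing left to delete
    IF @@ROWCOUNT = 0 BREAK;
END;

-- Each Code value now appears exactly once
SELECT Code, COUNT(*) AS Cnt FROM #Demo GROUP BY Code;

DROP TABLE #Demo;
```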


Emotional °昔

#3 · 2020-07-17 06:34

It is frequently more efficient to copy the unique rows into a temporary table, drop the source table, and rename the temporary table back to the original name.

I reused the definition and data of #TempTable, here called SrcTable instead, since it is impossible to rename a temporary table into a regular one.

create table SrcTable
(
    ID int identity(1,1) not null primary key,
    SomeData varchar(100) not null
)

insert into SrcTable(SomeData) values('someData1')
insert into SrcTable(SomeData) values('someData1')
insert into SrcTable(SomeData) values('someData2')
insert into SrcTable(SomeData) values('someData2')
insert into SrcTable(SomeData) values('someData2')
insert into SrcTable(SomeData) values('someData3')
insert into SrcTable(SomeData) values('someData4')

(as posted by John Sansom in a previous answer)

-- cloning "unique" part
SELECT * INTO TempTable
FROM SrcTable --original table
WHERE ID IN
(SELECT MAX(ID) AS ID
FROM SrcTable
GROUP BY SomeData);
GO

DROP TABLE SrcTable;
GO

-- rename the clone back to the original name
EXEC sys.sp_rename 'TempTable', 'SrcTable';
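One thing to keep in mind with this approach: SELECT ... INTO does not copy the primary key constraint or any indexes to the new table, so they need to be recreated after the rename. A minimal sketch, assuming the SrcTable definition above; the constraint name PK_SrcTable is made up for illustration:

```sql
-- Recreate the primary key that SELECT ... INTO did not carry over.
-- PK_SrcTable is a hypothetical constraint name.
ALTER TABLE SrcTable
    ADD CONSTRAINT PK_SrcTable PRIMARY KEY (ID);
```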


劫难

#4 · 2020-07-17 06:39

If you have a key column on the table, then you can use this to uniquely identify the "distinct" rows in your table.

Just use a subquery to identify the IDs of the rows to keep, and then delete everything outside of this set. Something along these lines:

create table #TempTable
(
    ID int identity(1,1) not null primary key,
    SomeData varchar(100) not null
)

insert into #TempTable(SomeData) values('someData1')
insert into #TempTable(SomeData) values('someData1')
insert into #TempTable(SomeData) values('someData2')
insert into #TempTable(SomeData) values('someData2')
insert into #TempTable(SomeData) values('someData2')
insert into #TempTable(SomeData) values('someData3')
insert into #TempTable(SomeData) values('someData4')

select * from #TempTable

--Records to be deleted
SELECT ID
FROM #TempTable
WHERE ID NOT IN
(
    select MAX(ID)
    from #TempTable
    group by SomeData
)

--Delete them
DELETE
FROM #TempTable
WHERE ID NOT IN
(
    select MAX(ID)
    from #TempTable
    group by SomeData
)

--Final Result Set
select * from #TempTable

drop table #TempTable;

Alternatively, you could use a CTE, for example:

WITH UniqueRecords AS
(
    select MAX(ID) AS ID
    from #TempTable
    group by SomeData
)
DELETE A
FROM #TempTable A
    LEFT outer join UniqueRecords B on
        A.ID = B.ID
WHERE B.ID IS NULL


Viruses.

#5 · 2020-07-17 06:40

Alternatively, you can use the ROW_NUMBER() function to filter out duplicates:

;WITH [CTE_DUPLICATES] AS 
(
SELECT RN = ROW_NUMBER() OVER (PARTITION BY SomeData ORDER BY SomeData)
FROM #TempTable
) 
DELETE FROM [CTE_DUPLICATES] WHERE RN > 1
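Because the ORDER BY above uses the same column as the PARTITION BY, which row of each group ends up with RN = 1 is not defined. If it matters which copy survives, ordering by the key column makes that explicit. A sketch, assuming the #TempTable definition and sample data from the earlier answer, that keeps the row with the highest ID in each group:

```sql
CREATE TABLE #TempTable
(
    ID int IDENTITY(1,1) NOT NULL PRIMARY KEY,
    SomeData varchar(100) NOT NULL
);

INSERT INTO #TempTable (SomeData)
VALUES ('someData1'), ('someData1'), ('someData2'), ('someData2'),
       ('someData2'), ('someData3'), ('someData4');

-- Number the rows of each SomeData group, highest ID first
WITH CTE_DUPLICATES AS
(
    SELECT RN = ROW_NUMBER() OVER (PARTITION BY SomeData ORDER BY ID DESC)
    FROM #TempTable
)
-- Deleting through the CTE removes the underlying rows with RN > 1
DELETE FROM CTE_DUPLICATES WHERE RN > 1;

-- With the sample data above, IDs 2, 5, 6 and 7 remain
SELECT ID, SomeData FROM #TempTable ORDER BY ID;

DROP TABLE #TempTable;
```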

