How do I delete duplicate data from a SQL table?

I am loading data from a third-party source into my database and updating existing records. Unfortunately, the third-party data contains many duplicate records.

I looked at a few questions here on SO but all of them seem to be cases where there is an ID column which differentiates one row from the other.

In my case there is no ID column. For example:

State   City    SubDiv  Pincode Locality Lat    Long
Orissa  Koraput Jeypore 764001  B.D.Pur 18.7743 82.5693
Orissa  Koraput Jeypore 764001  Jeypore 18.7743 82.5693
Orissa  Koraput Jeypore 764001  Jeypore 18.7743 82.5693
Orissa  Koraput Jeypore 764001  Jeypore 18.7743 82.5693
Orissa  Koraput Jeypore 764001  Jeypore 18.7743 82.5693

Is there a simple query which I can run to delete all duplicate records and keep one record as the original? So in the above case I want to delete rows 3,4,5 from the table.

I am not sure whether this can be done with simple SQL statements, but I would like to hear others' opinions on how to do it.

Tags: sql sql-server sql-server-2005 tsql sql-server-2008

5 Answers

我只想做你的唯一

#2 · 2019-09-18 16:43

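Try this:

-- Add a temporary identity column so that each row can be addressed individually.
-- "go" is the batch separator used by SSMS/sqlcmd; the new column is only visible to later batches.
alter table mytable add id int identity(1,1)
go

-- Number the rows within each group of identical values and delete every row after the first.
delete mytable where id in (
select id from (select id, ROW_NUMBER() over (partition by State, City, SubDiv, Pincode, Locality, Lat, Long order by id) rn
from mytable) t where rn != 1)
go

-- Remove the helper column again.
alter table mytable drop column id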

Ridiculous、

#3 · 2019-09-18 16:45

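-- rn numbers the rows within each group of fully identical rows.
;with cte as(
select State, City, SubDiv, Pincode, Locality, Lat, Long,
row_number() over (partition by State, City, SubDiv, Pincode, Locality, Lat, Long order by City) rn
from yourtable
)
-- Deleting through the CTE removes the underlying rows; rn = 1 is kept.
delete cte where rn > 1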

三岁会撩人

#4 · 2019-09-18 16:49

You can use the ROW_NUMBER() function; see "SQL SERVER – 2005 – 2008 – Delete Duplicate Rows".
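
Before deleting anything, the same ROW_NUMBER() numbering can be run as a plain SELECT to preview which rows would be removed. This is only a sketch: yourtable is a placeholder name, and the column list is taken from the sample data in the question.

```sql
-- Preview only: nothing is deleted here.
-- yourtable is a placeholder; columns come from the question's sample data.
WITH numbered AS (
    SELECT State, City, SubDiv, Pincode, Locality, Lat, Long,
           ROW_NUMBER() OVER (
               PARTITION BY State, City, SubDiv, Pincode, Locality, Lat, Long
               ORDER BY State
           ) AS rn
    FROM yourtable
)
-- Rows with rn > 1 are the extra copies a later DELETE would remove.
SELECT *
FROM numbered
WHERE rn > 1;
```

If the result matches the rows you expect to lose, the SELECT can be swapped for a DELETE against the same CTE.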

混吃等死

#5 · 2019-09-18 16:51

I would insert the third-party data into a temporary table and then:

insert into
  target_table
select distinct
  *
from
  temporary_table

and finally drop the temporary table.

Only distinct (unique) rows will be inserted into the target table.
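
A possible end-to-end version of this staging flow is sketched below. It goes one step beyond the answer by using EXCEPT, so rows that already exist in target_table are skipped as well. The name #staging is hypothetical, and the sketch assumes target_table has no identity column (as in the question) and no text/ntext/image columns.

```sql
-- Assumption: target_table has no identity column, as in the question.
-- Create an empty staging table with the same columns as target_table.
SELECT * INTO #staging FROM target_table WHERE 1 = 0;

-- ... load the third-party data into #staging here ...

-- EXCEPT returns the distinct rows of #staging that are not already
-- present in target_table, so both kinds of duplicate are filtered out.
INSERT INTO target_table
SELECT * FROM #staging
EXCEPT
SELECT * FROM target_table;

-- Clean up the staging table.
DROP TABLE #staging;
```

EXCEPT is available from SQL Server 2005 onward and treats NULLs as equal when comparing rows, which is usually what you want for de-duplication.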

萌系小妹纸

#6 · 2019-09-18 16:52

One of:

- Add a column to de-duplicate on, and leave it in place.
- Run SELECT DISTINCT * INTO ANewTable FROM OldTable, then rename the tables.
- Use t-clausen.dk's CTE approach (the ROW_NUMBER() CTE answer above).

Then add a unique index on the desired columns.
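
The SELECT DISTINCT ... INTO option followed by a unique index might look roughly like this. The names OldTable_backup and UX_OldTable_location are made up for illustration, and the index columns assume that the whole row from the question's sample defines a duplicate.

```sql
-- Copy only the distinct rows into a new table.
-- Note: SELECT ... INTO does not copy indexes, constraints, or triggers.
SELECT DISTINCT * INTO ANewTable FROM OldTable;
GO

-- Swap the names, keeping the original as a backup (hypothetical name).
EXEC sp_rename 'OldTable', 'OldTable_backup';
EXEC sp_rename 'ANewTable', 'OldTable';
GO

-- Guard against future duplicates (hypothetical index name).
-- Assumption: the combined key columns fit within SQL Server's
-- index key size limit (900 bytes on 2005/2008).
CREATE UNIQUE INDEX UX_OldTable_location
    ON OldTable (State, City, SubDiv, Pincode, Locality, Lat, Long);
```

Once the index is in place, an insert that would create a duplicate fails with an error instead of silently adding the row, so the loading process needs to filter duplicates first or handle that error.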
