Delete duplicate records from a SQL table without

2019-01-04 22:45发布

I have the below table with the below records in it

create table employee
(
 EmpId number,
 EmpName varchar2(10),
 EmpSSN varchar2(11)
);

insert into employee values(1, 'Jack', '555-55-5555');
insert into employee values (2, 'Joe', '555-56-5555');
insert into employee values (3, 'Fred', '555-57-5555');
insert into employee values (4, 'Mike', '555-58-5555');
insert into employee values (5, 'Cathy', '555-59-5555');
insert into employee values (6, 'Lisa', '555-70-5555');
insert into employee values (1, 'Jack', '555-55-5555');
insert into employee values (4, 'Mike', '555-58-5555');
insert into employee values (5, 'Cathy', '555-59-5555');
insert into employee values (6 ,'Lisa', '555-70-5555');
insert into employee values (5, 'Cathy', '555-59-5555');
insert into employee values (6, 'Lisa', '555-70-5555');

I dont have any primary key in this table .But i have the above records in my table already. I want to remove the duplicate records which has the same value in EmpId and EmpSSN fields.

Ex : Emp id 5

Can any one help me to frame a query to delete those duplicate records

Thanks in advance

18条回答
叛逆
2楼-- · 2019-01-04 23:19

I'm not an SQL expert so bear with me. I'm sure you'll get a better answer soon enough. Here's how you can find the duplicate records.

select t1.empid, t1.empssn, count(*)
from employee as t1 
inner join employee as t2 on (t1.empid=t2.empid and t1.empssn = t2.empssn)
group by t1.empid, t1.empssn
having count(*) > 1

Deleting them will be more tricky because there is nothing in the data that you could use in a delete statement to differentiate the duplicates. I suspect the answer will involve row_number() or adding an identity column.

查看更多
smile是对你的礼貌
3楼-- · 2019-01-04 23:23
select t1.* from employee t1, employee t2 where t1.empid=t2.empid and t1.empname = t2.empname and t1.salary = t2.salary
group by t1.empid, t1.empname,t1.salary having count(*) > 1
查看更多
放我归山
4楼-- · 2019-01-04 23:24

Add a Primary Key (code below)

Run the correct delete (code below)

Consider WHY you woudln't want to keep that primary key.


Assuming MSSQL or compatible:

ALTER TABLE Employee ADD EmployeeID int identity(1,1) PRIMARY KEY;

WHILE EXISTS (SELECT COUNT(*) FROM Employee GROUP BY EmpID, EmpSSN HAVING COUNT(*) > 1)
BEGIN
    DELETE FROM Employee WHERE EmployeeID IN 
    (
        SELECT MIN(EmployeeID) as [DeleteID]
        FROM Employee
        GROUP BY EmpID, EmpSSN
        HAVING COUNT(*) > 1
    )
END
查看更多
再贱就再见
5楼-- · 2019-01-04 23:31

It is very simple. I tried in SQL Server 2008

DELETE SUB FROM
(SELECT ROW_NUMBER() OVER (PARTITION BY EmpId, EmpName, EmpSSN ORDER BY EmpId) cnt
 FROM Employee) SUB
WHERE SUB.cnt > 1
查看更多
不美不萌又怎样
6楼-- · 2019-01-04 23:31
create unique clustered index Employee_idx
on Employee ( EmpId,EmpSSN )
with ignore_dup_key

You can drop the index if you don't need it.

查看更多
家丑人穷心不美
7楼-- · 2019-01-04 23:35

there are two columns in the a table ID and name where names are repeating with different IDs so for that you may use this query: . .

DELETE FROM dbo.tbl1
WHERE id NOT IN (
     Select MIN(Id) AS namecount FROM tbl1
     GROUP BY Name
)
查看更多
登录 后发表回答