How to delete duplicates on a MySQL table?

Posted 2018-12-31 04:58

I need to DELETE duplicated rows for specified sid on a MySQL table.

How can I do this with an SQL query?

DELETE (DUPLICATED TITLES) FROM table WHERE SID = "1"

Something like this, but I don't know how to do it.

22 answers
人间绝色
#2 · 2018-12-31 05:24

After running into this issue myself, on a huge database, I wasn't completely impressed with the performance of any of the other answers. I want to keep only the latest duplicate row, and delete the rest.

As a single statement, without a temp table, this worked best for me:

DELETE e.*
FROM employee e
WHERE id IN
 (SELECT id
   FROM (SELECT MIN(id) as id
          FROM employee e2
          GROUP BY first_name, last_name
          HAVING COUNT(*) > 1) x);

The only caveat is that I have to run the query multiple times, because each pass removes just one row (the lowest id) from every duplicated group. Even so, I found it worked better for me than the other options.
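If repeated runs are a problem, a self-join can remove all the older copies in one pass. This is only a sketch, assuming the same `employee` table with a unique, increasing `id`; it is a different technique from the query above, and I have not compared its speed on a large table.

```sql
-- Delete every row for which a newer row (higher id) with the same
-- first_name and last_name exists, so only the latest copy survives.
DELETE e1
FROM employee e1
JOIN employee e2
  ON e1.first_name = e2.first_name
 AND e1.last_name  = e2.last_name
 AND e1.id < e2.id;
```

Note that `=` never matches NULL, so rows with a NULL name are left alone here, whereas GROUP BY would have put them in one group. If that matters, the NULL-safe operator `<=>` can be used in the join conditions instead.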

不再属于我。
#3 · 2018-12-31 05:24

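The following works for any table, as long as the duplicate rows are identical in every column (with an auto-increment id, no two rows are fully identical, so DISTINCT * removes nothing):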

CREATE TABLE `noDup` LIKE `Dup` ;
INSERT `noDup` SELECT DISTINCT * FROM `Dup` ;
DROP TABLE `Dup` ;
ALTER TABLE `noDup` RENAME `Dup` ;
只若初见
#4 · 2018-12-31 05:26

Deleting duplicates from MySQL tables is a common problem that usually comes with specific needs. In case anyone is interested, in "Remove duplicate rows in MySQL" I explain how to use a temporary table to delete MySQL duplicates reliably and quickly, in a way that also works for large data sources (with examples for different use cases).

Ali, in your case, you can run something like this:

-- create a new temporary table
CREATE TABLE tmp_table1 LIKE table1;

-- add a unique constraint    
ALTER TABLE tmp_table1 ADD UNIQUE(sid, title);

-- scan over the table to insert entries
INSERT IGNORE INTO tmp_table1 SELECT * FROM table1 ORDER BY sid;

-- rename tables
RENAME TABLE table1 TO backup_table1, tmp_table1 TO table1;
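
Before dropping the backup, it may be worth checking that nothing slipped through. A minimal sketch, assuming the same `table1` with the `sid` and `title` columns used above:

```sql
-- List any (sid, title) pair that still occurs more than once.
-- An empty result means the unique constraint did its job.
SELECT sid, title, COUNT(*) AS copies
FROM table1
GROUP BY sid, title
HAVING COUNT(*) > 1;
```

Running the same query against `backup_table1` shows how many duplicates the original table contained.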
素衣白纱
#5 · 2018-12-31 05:26

You could just use a DISTINCT clause to select the "cleaned up" list (and here is a very easy example on how to do that).
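
As a rough illustration, assuming the question's table is called `table1` and has `sid` and `title` columns (names borrowed from another answer, not confirmed by the asker):

```sql
-- Return each title once for the given sid; the table itself is not modified.
SELECT DISTINCT sid, title
FROM table1
WHERE sid = 1;
```

This only hides the duplicates in the result set. To actually remove them, the output still has to be written to a new table, or one of the DELETE approaches used.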

何处买醉
#6 · 2018-12-31 05:30

Here is a simple answer:

-- Keep the row with the highest id_field per value; delete all others.
DELETE a
FROM target_table a
LEFT JOIN (SELECT MAX(id_field) AS id_field, field_being_repeated
           FROM target_table
           GROUP BY field_being_repeated) b
  ON a.field_being_repeated = b.field_being_repeated
 AND a.id_field = b.id_field
WHERE b.id_field IS NULL;
骚的不知所云
#7 · 2018-12-31 05:31

Another easy way... using UPDATE IGNORE:

You need an index on one or more columns (type index). Create a new temporary reference column (not part of the index). In this column, you mark the unique rows by updating it with the IGNORE clause. Step by step:

Add a temporary reference column to mark the uniques:

ALTER TABLE `yourtable` ADD `unique` VARCHAR(3) NOT NULL AFTER `lastcolname`;

=> this will add a column to your table.

Update the table, trying to mark everything as unique but ignoring possible errors due to duplicate key issues (those records will be skipped):

UPDATE IGNORE `yourtable` SET `unique` = 'Yes' WHERE 1;

=> you will find your duplicate records will not be marked as unique = 'Yes', in other words only one of each set of duplicate records will be marked as unique.

Delete everything that's not unique:

DELETE FROM `yourtable` WHERE `unique` <> 'Yes';

=> This will remove all duplicate records.

Drop the column...

ALTER TABLE `yourtable` DROP `unique`;