How to delete duplicates on a MySQL table?

2018-12-31 04:58发布

I need to DELETE duplicated rows for specified sid on a MySQL table.

How can I do this with an SQL query?

DELETE (DUPLICATED TITLES) FROM table WHERE SID = "1"

Something like this, but I don't know how to do it.

22条回答
伤终究还是伤i
2楼-- · 2018-12-31 05:37

Love @eric's answer but it doesn't seem to work if you have a really big table (I'm getting The SELECT would examine more than MAX_JOIN_SIZE rows; check your WHERE and use SET SQL_BIG_SELECTS=1 or SET MAX_JOIN_SIZE=# if the SELECT is okay when I try to run it). So I limited the join query to only consider the duplicate rows and I ended up with:

DELETE a FROM penguins a
    LEFT JOIN (SELECT COUNT(baz) AS num, MIN(baz) AS keepBaz, foo
        FROM penguins
        GROUP BY deviceId HAVING num > 1) b
        ON a.baz != b.keepBaz
        AND a.foo = b.foo
    WHERE b.foo IS NOT NULL

The WHERE clause in this case allows MySQL to ignore any row that doesn't have a duplicate and will also ignore if this is the first instance of the duplicate so only subsequent duplicates will be ignored. Change MIN(baz) to MAX(baz) to keep the last instance instead of the first.

查看更多
孤独寂梦人
3楼-- · 2018-12-31 05:38

Following remove duplicates for all SID-s, not only single one.

With temp table

CREATE TABLE table_temp AS
SELECT * FROM table GROUP BY title, SID;

DROP TABLE table;
RENAME TABLE table_temp TO table;

Since temp_table is freshly created it has no indexes. You'll need to recreate them after removing duplicates. You can check what indexes you have in the table with SHOW INDEXES IN table

Without temp table:

DELETE FROM `table` WHERE id IN (
  SELECT all_duplicates.id FROM (
    SELECT id FROM `table` WHERE (`title`, `SID`) IN (
      SELECT `title`, `SID` FROM `table` GROUP BY `title`, `SID` having count(*) > 1
    )
  ) AS all_duplicates 
  LEFT JOIN (
    SELECT id FROM `table` GROUP BY `title`, `SID` having count(*) > 1
  ) AS grouped_duplicates 
  ON all_duplicates.id = grouped_duplicates.id 
  WHERE grouped_duplicates.id IS NULL
)
查看更多
与君花间醉酒
4楼-- · 2018-12-31 05:39

Could it work if you count them, and then add a limit to your delete query leaving just one?

For example, if you have two or more, write your query like this:

DELETE FROM table WHERE SID = 1 LIMIT 1;
查看更多
孤独总比滥情好
5楼-- · 2018-12-31 05:40

this removes duplicates in place, without making a new table

ALTER IGNORE TABLE `table_name` ADD UNIQUE (title, SID)

note: only works well if index fits in memory

查看更多
大哥的爱人
6楼-- · 2018-12-31 05:43

I think this will work by basically copying the table and emptying it then putting only the distinct values back into it but please double check it before doing it on large amounts of data.

Creates a carbon copy of your table

create table temp_table like oldtablename; insert temp_table select * from oldtablename;

Empties your original table

DELETE * from oldtablename;

Copies all distinct values from the copied table back to your original table

INSERT oldtablename SELECT * from temp_table group by firstname,lastname,dob

Deletes your temp table.

Drop Table temp_table

You need to group by aLL fields that you want to keep distinct.

查看更多
妖精总统
7楼-- · 2018-12-31 05:46

Suppose you have a table employee, with the following columns:

employee (first_name, last_name, start_date)

In order to delete the rows with a duplicate first_name column:

delete
from employee using employee,
    employee e1
where employee.id > e1.id
    and employee.first_name = e1.first_name  
查看更多
登录 后发表回答