Deleting duplicate rows with sql

2020-03-31 07:19发布

问题:

I am trying to delete duplicate rows from my mysql table. I've tried multiple queries but I am keep on getting this error: #1093 - You can't specify target table 'usa_city' for update in FROM clause

The table looks like this:

usa_city
--------
id(pk)
id_state
city_name

And the queries I have tired were:

DELETE FROM usa_city
WHERE id NOT IN
(
SELECT MIN(id)
FROM usa_city
GROUP BY city_name, id_state
)

And:

DELETE
FROM usa_city
WHERE usa_city.id IN

-- List 1 - all rows that have duplicates
(SELECT F.id
FROM usa_city AS F
WHERE Exists (SELECT city_name, id_state, Count(id)
FROM usa_city
WHERE usa_city.city_name = F.city_name
   AND usa_city.id_state = F.id_state
GROUP BY usa_city.city_name, usa_city.id_state
HAVING Count(usa_city.id) > 1))
AND usa_city.id NOT IN

-- List 2 - one row from each set of duplicate
(SELECT Min(id)
FROM usa_city AS F
WHERE Exists (SELECT city_name, id_state, Count(id)
FROM usa_city
WHERE usa_city.city_name = F.city_name
   AND usa_city.id_state = F.id_state
GROUP BY usa_city.city_name, usa_city.id_state
HAVING Count(usa_city.id) > 1)
GROUP BY city_name, id_state);

Thanks in advance.

回答1:

Try to select the duplicates first, the delete them

DELETE FROM usa_city WHERE city_id IN
(
SELECT city_id FROM usa_city
GROUP BY city_name, id_state
HAVING count(city_id) > 1
)

Hope it helps!!!

MODIFIED: Based on the comment, if you want to keep one record, you can make a join and keep the lowest value

DELETE c1 FROM usa_city c1, usa_city c2 WHERE c1.id < c2.id AND 
(c1.city_name= c2.city_name AND c1.id_state = c2.id_state)

Be sure to make a backup before executing the query above...



回答2:

from mysql documentation:

"Currently, you cannot delete from a table and select from the same table in a subquery."

but here is a workaround for update, should work for delete too.

also, you could select rows, and then in php for example delete them in loop



回答3:

You may found here an answer to your problem: How to delete duplicate records in mysql database?

You should improve your database by using keyfields to prevent duplicate rows, so you dont need to clear in future.



回答4:

Edit : This solution is also found if you follow the link posted by BloodyWorld, so if it works please go and upvote DMin's post here

Found this browsing the internet (#1 google result for mysql delete duplicate rows), have you tried it?

delete from table1
USING table1, table1 as vtable
WHERE (NOT table1.ID=vtable.ID)
AND (table1.field_name=vtable.field_name)


回答5:

Judging from your examples, when you say "duplicate", you mean "having the same combination of id_state and city_name", correct? If so after you have done removing the duplictes, I strongly suggest creating a UNIQUE constraint on {id_state, city_name}.

To actually remove the duplicates, it is not enough to just identify the set of duplicates, you must also decide which of the identified duplicates to keep. Assuming you want to keep the ones with the smallest id, the following piece of SQL will do the job:

CREATE TEMPORARY TABLE usa_city_to_delete AS
    SELECT id FROM usa_city T1
    WHERE EXISTS (
        SELECT * FROM usa_city T2
        WHERE
            T1.id_state = T2.id_state 
            AND T1.city_name = T2.city_name
            AND T1.id > T2.id
    );

DELETE FROM usa_city
WHERE id IN (SELECT id FROM usa_city_to_delete);

DROP TEMPORARY TABLE usa_city_to_delete;

Unfortunately, MySQL does not allow the correlated subqueries in DELETE, otherwise we could have done that in a single statement, without the temporary table.

--- EDIT ---

You can't have a correlated subquery but you can have JOIN, as illustrated by Carlos Quijano answer. Also, the temporary table can be created implicitly, as suggested by Kokers.

So it is possible to do it in a single statement, contrary to what I wrote above...