It's easy to find duplicates
with one field:
SELECT name, COUNT(email)
FROM users
GROUP BY email
HAVING COUNT(email) > 1
So if we have a table
ID NAME EMAIL
1 John asd@asd.com
2 Sam asd@asd.com
3 Tom asd@asd.com
4 Bob bob@asd.com
5 Tom asd@asd.com
This query will give us John, Sam, Tom, Tom because they all have the same email
.
However, what I want is to get duplicates with the same email
and name
.
That is, I want to get "Tom", "Tom".
The reason I need this: I made a mistake, and allowed to insert duplicate name
and email
values. Now I need to remove/change the duplicates, so I need to find them first.
If you want to find duplicate data (by one or several criterias) and select the actual rows.
http://developer.azurewebsites.net/2014/09/better-sql-group-by-find-duplicate-data/
We can use having here which work on aggregate functions as shown below
Here as two fields id_account and data are used with Count(*). So, it will give all the records which has more than one times same values in both columns.
We some reason mistakely we had missed to add any constraints in SQL server table and the records has been inserted duplicate in all columns with front-end application. Then we can use below query to delete duplicate query from table.
Here we have taken all the distinct records of the orignal table and deleted the records of original table. Again we inserted all the distinct values from new table to the original table and then deleted new table.
SELECT column_name,COUNT(*) FROM TABLE_NAME GROUP BY column1, HAVING COUNT(*) > 1;
If you want to delete the duplicates, here's a much simpler way to do it than having to find even/odd rows into a triple sub-select:
And so to delete:
Much more easier to read and understand IMHO
Note: The only issue is that you have to execute the request until there is no rows deleted, since you delete only 1 of each duplicate each time