I have a SQL Server database of organizations, and there are many duplicate rows. I want to run a select statement to grab all of these and the amount of dupes, but also return the ids that are associated with each organization.
A statement like:
SELECT orgName, COUNT(*) AS dupes
FROM organizations
GROUP BY orgName
HAVING (COUNT(*) > 1)
Will return something like
orgName | dupes
ABC Corp | 7
Foo Federation | 5
Widget Company | 2
But I'd also like to grab the IDs of them. Is there any way to do this? Maybe like a
orgName | dupeCount | id
ABC Corp | 1 | 34
ABC Corp | 2 | 5
...
Widget Company | 1 | 10
Widget Company | 2 | 2
The reason being that there is also a separate table of users that link to these organizations, and I would like to unify them (therefore remove dupes so the users link to the same organization instead of dupe orgs). But I would like part manually so I don't screw anything up, but I would still need a statement returning the IDs of all the dupe orgs so I can go through the list of users.
The solution marked as correct didn't work for me, but I found this answer that worked just great: Get list of duplicate rows in MySql
You can run the following query and find the duplicates with
max(id)
and delete those rows.But you'll have to run this query a few times.
i think i know what you need i needed to mix between the answers and i think i got the solution he wanted:
having the max id will give you the id of the dublicate and the one of the original which is what he asked for:
only sad thing you get it put out in this form
hope it still helps
You can do it like this:
If you want to return just the records that can be deleted (leaving one of each), you can use:
Edit: SQL Server 2000 doesn't have the ROW_NUMBER() function. Instead, you can use:
If you want to delete duplicates: