I want to pull out duplicate records in a MySQL Database. This can be done with:
SELECT address, count(id) as cnt FROM list
GROUP BY address HAVING cnt > 1
Which results in:
100 MAIN ST 2
I would like to pull it so that it shows each row that is a duplicate. Something like:
JIM JONES 100 MAIN ST
JOHN SMITH 100 MAIN ST
Any thoughts on how this can be done? I'm trying to avoid doing the first one then looking up the duplicates with a second query in the code.
we can found the duplicates depends on more then one fields also.For those cases you can use below format.
The key is to rewrite this query so that it can be used as a subquery.
Isn't this easier :
?
I tried the best answer chosen for this question, but it confused me somewhat. I actually needed that just on a single field from my table. The following example from this link worked out very well for me:
Replace city with your Table. Replace name with your field name
Finding duplicate addresses is much more complex than it seems, especially if you require accuracy. A MySQL query is not enough in this case...
I work at SmartyStreets, where we do address validation and de-duplication and other stuff, and I've seen a lot of diverse challenges with similar problems.
There are several third-party services which will flag duplicates in a list for you. Doing this solely with a MySQL subquery will not account for differences in address formats and standards. The USPS (for US address) has certain guidelines to make these standard, but only a handful of vendors are certified to perform such operations.
So, I would recommend the best answer for you is to export the table into a CSV file, for instance, and submit it to a capable list processor. One such is LiveAddress which will have it done for you in a few seconds to a few minutes automatically. It will flag duplicate rows with a new field called "Duplicate" and a value of
Y
in it.