Find duplicate records in MySQL

2018-12-31 03:39发布

I want to pull out duplicate records in a MySQL Database. This can be done with:

SELECT address, count(id) as cnt FROM list
GROUP BY address HAVING cnt > 1

Which results in:

100 MAIN ST    2

I would like to pull it so that it shows each row that is a duplicate. Something like:

JIM    JONES    100 MAIN ST
JOHN   SMITH    100 MAIN ST

Any thoughts on how this can be done? I'm trying to avoid doing the first one then looking up the duplicates with a second query in the code.

22条回答
看风景的人
2楼-- · 2018-12-31 03:52

we can found the duplicates depends on more then one fields also.For those cases you can use below format.

SELECT COUNT(*), column1, column2 
FROM tablename
GROUP BY column1, column2
HAVING COUNT(*)>1;
查看更多
梦醉为红颜
3楼-- · 2018-12-31 03:53

The key is to rewrite this query so that it can be used as a subquery.

SELECT firstname, 
   lastname, 
   list.address 
FROM list
   INNER JOIN (SELECT address
               FROM   list
               GROUP  BY address
               HAVING COUNT(id) > 1) dup
           ON list.address = dup.address;
查看更多
君临天下
4楼-- · 2018-12-31 03:53

Isn't this easier :

SELECT *
FROM tc_tariff_groups
GROUP BY group_id
HAVING COUNT(group_id) >1

?

查看更多
萌妹纸的霸气范
5楼-- · 2018-12-31 03:55

I tried the best answer chosen for this question, but it confused me somewhat. I actually needed that just on a single field from my table. The following example from this link worked out very well for me:

SELECT COUNT(*) c,title FROM `data` GROUP BY title HAVING c > 1;
查看更多
梦醉为红颜
6楼-- · 2018-12-31 03:56
SELECT t.*,(select count(*) from city as tt where tt.name=t.name) as count FROM `city` as t where (select count(*) from city as tt where tt.name=t.name) > 1 order by count desc

Replace city with your Table. Replace name with your field name

查看更多
步步皆殇っ
7楼-- · 2018-12-31 03:58

Finding duplicate addresses is much more complex than it seems, especially if you require accuracy. A MySQL query is not enough in this case...

I work at SmartyStreets, where we do address validation and de-duplication and other stuff, and I've seen a lot of diverse challenges with similar problems.

There are several third-party services which will flag duplicates in a list for you. Doing this solely with a MySQL subquery will not account for differences in address formats and standards. The USPS (for US address) has certain guidelines to make these standard, but only a handful of vendors are certified to perform such operations.

So, I would recommend the best answer for you is to export the table into a CSV file, for instance, and submit it to a capable list processor. One such is LiveAddress which will have it done for you in a few seconds to a few minutes automatically. It will flag duplicate rows with a new field called "Duplicate" and a value of Y in it.

查看更多
登录 后发表回答