Find duplicate records in MySQL

2018-12-31 03:39发布

I want to pull out duplicate records in a MySQL Database. This can be done with:

SELECT address, count(id) as cnt FROM list
GROUP BY address HAVING cnt > 1

Which results in:

100 MAIN ST    2

I would like to pull it so that it shows each row that is a duplicate. Something like:

JIM    JONES    100 MAIN ST
JOHN   SMITH    100 MAIN ST

Any thoughts on how this can be done? I'm trying to avoid doing the first one then looking up the duplicates with a second query in the code.

22条回答
梦该遗忘
2楼-- · 2018-12-31 04:05

Why not just INNER JOIN the table with itself?

SELECT a.firstname, a.lastname, a.address
FROM list a
INNER JOIN list b ON a.address = b.address
WHERE a.id <> b.id

A DISTINCT is needed if the address could exist more than two times.

查看更多
十年一品温如言
3楼-- · 2018-12-31 04:06

This will select duplicates in one table pass, no subqueries.

SELECT  *
FROM    (
        SELECT  ao.*, (@r := @r + 1) AS rn
        FROM    (
                SELECT  @_address := 'N'
                ) vars,
                (
                SELECT  *
                FROM
                        list a
                ORDER BY
                        address, id
                ) ao
        WHERE   CASE WHEN @_address <> address THEN @r := 0 ELSE 0 END IS NOT NULL
                AND (@_address := address ) IS NOT NULL
        ) aoo
WHERE   rn > 1

This query actially emulates ROW_NUMBER() present in Oracle and SQL Server

See the article in my blog for details:

查看更多
弹指情弦暗扣
4楼-- · 2018-12-31 04:09
SELECT date FROM logs group by date having count(*) >= 2
查看更多
后来的你喜欢了谁
5楼-- · 2018-12-31 04:10
    SELECT *
    FROM (SELECT  address, COUNT(id) AS cnt
    FROM list
    GROUP BY address
    HAVING ( COUNT(id) > 1 ))
查看更多
美炸的是我
6楼-- · 2018-12-31 04:12

Another solution would be to use table aliases, like so:

SELECT p1.id, p2.id, p1.address
FROM list AS p1, list AS p2
WHERE p1.address = p2.address
AND p1.id != p2.id

All you're really doing in this case is taking the original list table, creating two pretend tables -- p1 and p2 -- out of that, and then performing a join on the address column (line 3). The 4th line makes sure that the same record doesn't show up multiple times in your set of results ("duplicate duplicates").

查看更多
孤独寂梦人
7楼-- · 2018-12-31 04:16
 SELECT firstname, lastname, address FROM list
 WHERE 
 Address in 
 (SELECT address FROM list
 GROUP BY address
 HAVING count(*) > 1)
查看更多
登录 后发表回答