How can you find the rows with equal columns?

If I have a table with important 2 columns,

CREATE TABLE foo (id INT, a INT, b INT, KEY a, KEY b);

How can I find all the rows that have both a and b being the same in both rows? For example, in this data set

id | a | b
----------
1  | 1 | 2
2  | 5 | 42
3  | 1 | 42
4  | 1 | 2 
5  | 1 | 2
6  | 1 | 42

I want to get back all rows except for id=2 since it is unique in (a,b). Basically, I want to find all offending rows that would stop a

ALTER TABLE foo ADD UNIQUE (a, b);

Something better than an n^2 for loop would be nice since my table has 10M rows.

For bonus points : How do I removed all but one of the rows (I don't care which ones, as long as one is left)

标签： sql mysql aggregate

8条回答

贼婆χ

2楼-- · 2019-06-24 14:28

select * from foo where a = b

Or am I missing something?

===

Update for clarity:

select * from 
foo as a
inner join foo as b
on a.a = b.a AND b.a = b.b
and a.id != b.id

++++++++++ After 3rd clarity edit:

select f1.id
FROM foo as f1
INNER JOIN foo as f2
ON f1.a = f2.a AND f1.b=f2.b AND f1.id != f2.id

But I'm shot, so check it yourself.

0人赞添加讨论(0) 举报

狗以群分

3楼-- · 2019-06-24 14:32

SELECT * 
FROM foo first
JOIN foo second
  ON ( first.a = second.a
       AND first.b = second.b ) 
  AND (first.id <> second.id )

Should come up with all the rows where more that one row has the same combination of a and b.

Just hope you have an index on columns a and b.

0人赞添加讨论(0) 举报

我欲成王，谁敢阻挡

4楼-- · 2019-06-24 14:32

shouldn't this work?

SELECT * FROM foo WHERE a = b

=== edit ===

the how about

SELECT a, b FROM foo GROUP BY a, b HAVING COUNT(*) > 1

=== final re-edit before i give up on this question ===

SELECT foo.* FROM foo, (
   SELECT a, b FROM foo GROUP BY a, b HAVING COUNT(*) > 1
) foo2
WHERE foo.a = foo2.a AND foo.b = foo2.b

0人赞添加讨论(0) 举报

Luminary・发光体

5楼-- · 2019-06-24 14:36

Try this:

    With s as (Select a,b from foo group by a,b having Count(1)>1)
Select foo.* from foo,s where foo.a=s.a and foo.b=s.b

This query should show duplicate rows in the table foo.

0人赞添加讨论(0) 举报

对你真心纯属浪费

6楼-- · 2019-06-24 14:38

Your stated goal is to remove all duplicate combination of (a,b). For that, you can use a multi-table DELETE:

DELETE t1
  FROM foo t1
  JOIN foo t2 USING (a, b)
 WHERE t2.id > t1.id

Before you run it, you can check which rows will be removed with:

SELECT DISTINCT t1.id
  FROM foo t1
  JOIN foo t2 USING (a, b)
 WHERE t2.id > t1.id

The WHERE clause being t2.id > t1.id it will remove all but the one with the highest value for id. In your case, only the rows with id equal to 2, 5 or 6 would remain.

0人赞添加讨论(0) 举报

放我归山

7楼-- · 2019-06-24 14:40

here's another approach

select * from foo f1 where exists(
  select * from foo f2 where
    f1.id != f2.id and
    f1.a = f2.a and
    f1.b = f2.b )

anyway, even though I find it a bit more readable, if you have such a huge table, you should check the execution plan, subqueries have a bad reputation involving performance...

you should also consider creating the index (without the unique clause, obviously) to speed up the query... for huge operations, sometimes it's better to spend the time creating the index, perform the update and then drop the index... in this case, I guess an index on (a, b) should certainly help a lot...

0人赞添加讨论(0) 举报

1 2 下一页

How can you find the rows with equal columns?

采纳回答

编辑标签

举报内容

检举类型

检举原因

检举说明(必填)

打开微信“扫一扫”，打开网页后点击屏幕右上角分享按钮

付费偷看金额在0.1-10元之间