MySQL select where not in table

Posted 2019-01-07 06:18

I have 2 tables (A and B) with the same primary keys. I want to select all rows that are in A and not in B. The following works:

select * from A where not exists (select * from B where A.pk=B.pk);

However, it seems quite slow (~2 sec with only 100k rows in A and 3-10k fewer in B).

Is there a better way to run this? Perhaps as a left join?

select * from A left join B on A.x=B.y where B.y is null;

On my data this seems to run slightly faster (~10%), but what about in general?

5 answers
姐就是有狂的资本
#2 · 2019-01-07 06:32

I use queries in the format of your second example. A join is usually more scalable than a correlated subquery.
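
To check which form scales better on your own data, one option is to compare the execution plans with EXPLAIN. This is a minimal sketch, assuming the tables A and B and the key column pk from the question; the left-join form is written against the same pk column so the two plans are comparable.

```sql
-- Plan for the correlated-subquery form from the question
EXPLAIN
SELECT * FROM A
WHERE NOT EXISTS (SELECT * FROM B WHERE A.pk = B.pk);

-- Plan for the left-join (anti-join) form on the same key column
EXPLAIN
SELECT A.* FROM A
LEFT JOIN B ON A.pk = B.pk
WHERE B.pk IS NULL;
```

If both plans show the same access type and similar row estimates for B, the two forms will likely perform about the same on that data.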

来,给爷笑一个
#3 · 2019-01-07 06:38

This helped me a lot. Joins are often faster than subqueries in MySQL, though this is not guaranteed. For example:

SELECT t1.id FROM tbl1 t1
LEFT OUTER JOIN tbl2 t2 ON t1.id = t2.id
WHERE t1.id >= 100 AND t2.id IS NULL;
淡お忘
#4 · 2019-01-07 06:41

I also use left joins with a "WHERE table2.id IS NULL" type of criterion.

It certainly seems to be more efficient than the nested query option.

男人必须洒脱
#5 · 2019-01-07 06:41

Joins are generally faster (in MySQL), but you should also consider your indexing scheme if you find that the query is still slow. Generally, any field set up as a foreign key (using InnoDB) will already have an index. If you're using MyISAM, make sure that any columns in the ON clause are indexed, and consider also adding any columns in the WHERE clause to the end of the index to make it a covering index. This gives the engine access to all the data it needs in the index, removing the need for a second round-trip back to the original data. Keep in mind that this will slow down inserts, updates, and deletes, but it can significantly speed up the query.
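
As a rough illustration of the indexing advice above, here is a sketch using the A.x = B.y join from the question. The index name idx_b_y is hypothetical, and the index is only worth adding if B.y is not already indexed.

```sql
-- Hypothetical index name; only needed if B.y is not already indexed
-- (a primary key or an InnoDB foreign key column already has an index).
CREATE INDEX idx_b_y ON B (y);

-- Re-check the plan afterwards. "Using index" in the Extra column means the
-- lookup on B is served from the index alone, without reading the table rows.
EXPLAIN
SELECT A.*
FROM A
LEFT JOIN B ON A.x = B.y
WHERE B.y IS NULL;
```

Since only B.y is read from B in this query, a single-column index on it should already act as a covering index for the join.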

混吃等死
#6 · 2019-01-07 06:49

I think your last statement is the best way. You can also try selecting only the columns from A:

SELECT A.*
FROM A
LEFT JOIN B ON A.x = B.y
WHERE B.y IS NULL;