Why this mysql query (with is null check) is so sl

2019-06-02 11:23发布

I getting in doubt with "IS NULL" MySQL check. I have this 2 queries. The first one runs in about 300 seconds. The second one run less then 1 second!

Slow query:

SELECT count(distinct(u.id))
FROM ips_usuario AS u 
JOIN ips_fatura AS f
    ON ((u.id = f.ips_usuario_id) OR
       (u.ips_usuario_id_titular IS NOT NULL AND
        u.ips_usuario_id_titular = f.ips_usuario_id));

enter image description here

Fast query:

SELECT count(distinct(u.id))
FROM ips_usuario AS u 
JOIN ips_fatura AS f
    ON ((u.id = f.ips_usuario_id) OR
       (u.ips_usuario_id_titular = f.ips_usuario_id));

enter image description here

All join conditions use foreign keys indexed columns. The table ips_usuario have about 20.000 records and the table ips_fatura have about 500.000 records.

1条回答
The star\"
2楼-- · 2019-06-02 11:42

I am surprised that either is fast. I would suggest replacing them with exists:

SELECT COUNT(*)
FROM ips_usuario u  
WHERE EXISTS (SELECT 1 FROM ips_fatura f WHERE u.id = f.ips_usuario_id) OR
      EXISTS (SELECT 1 FROM ips_fatura f WHERE u.ips_usuario_id_titular = f.ips_usuario_id);

And for the second:

SELECT COUNT(*)
FROM ips_usuario u  
WHERE EXISTS (SELECT 1 FROM ips_fatura f WHERE u.id = f.ips_usuario_id) OR
      (u.ips_usuario_id_titular IS NOT NULL AND
       EXISTS (SELECT 1 FROM ips_fatura f WHERE u.ips_usuario_id_titular = f.ips_usuario_id)
      )

For both these, you want two indexes: ips_fatura(ips_usuario_id) and ips_fatura(ips_usuario_id_titular). You can check the explain to be sure that EXISTS is using the index. If not, the newer releases of MySQL use indexes for IN:

SELECT COUNT(*)
FROM ips_usuario u  
WHERE u.id IN (SELECT f.ips_usuario_id FROM ips_fatura f) OR
      u.ips_usuario_id_titular IN (SELECT f.ips_usuario_id FROM ips_fatura f);

In either case (EXISTS or IN) the goal is to do a "semi-join". That is, to only fine the first row with a match rather than all matches. This is an important efficiency, because it allows the query to avoid duplication removal.

I would speculate that the issue is the optimization of the or -- usually this results in inefficient JOIN algorithms. However, perhaps MySQL is smart in your first case. But the addition of the IS NULL to the outer table throws it off.

查看更多
登录 后发表回答