I have a table foo
with (among 20 others) columns bar
, baz
and quux
with indexes on baz
and quux
. The table has ~500k rows.
Why do the following to queries differ so much in speed? Query A takes 0.3s, while query B takes 28s.
Query A
select baz from foo
where bar = :bar
and quux = (select quux from foo where bar = :bar order by quux desc limit 1)
Explain
id select_type table type possible_keys key key_len ref rows Extra
1 PRIMARY foo ref quuxIdx quuxIdx 9 const 2 "Using where"
2 SUBQUERY foo index NULL quuxIdx 9 NULL 1 "Using where"
Query B
select baz from foo
where bar = :bar
and quux = (select MAX(quux) from foo where bar = :bar)
Explain
id select_type table type possible_keys key key_len ref rows Extra
1 PRIMARY foo ref quuxIdx quuxIdx 9 const 2 "Using where"
2 SUBQUERY foo ALL NULL NULL NULL NULL 448060 "Using where"
I use MySQL 5.1.34.
You should add an index on (bar, quux)
.
Without this index, MySQL can't see how to perform the query efficiently so it has to choose from various inefficient query plans.
In the first example it scans the quux
index and for each row found, looks up the corresponding value of bar
in the original table. This takes twice as long to check each row, but it gets lucky that a row that has the correct value of bar
is near the beginning of its scan and so it can stop. This could be because the value of bar
you are searching for occurs frequently, so the chance of being lucky is very high. As a result it may only have to examine a handful of rows before it finds a match, so even though it takes twice as long to check each row, the fact that only a few rows are checked gives a massive overall saving. Since you have no index on bar
, MySQL doesn't know in advance that the value :bar
occurs frequently so it cannot know that this query will be fast.
In the second example it uses a different plan where it always scans the whole table. Each row is read directly from the table, without usage of an index. This means that each row read is fast, but because you have a lot of rows, it is slow overall. If none of the rows matched on :bar
this would be the faster query plan. But if roughly 1% of rows have the desired value of bar
, it will be (very) approximately 100 times slower to use this query plan compared to the above plan. Since you have no index on bar
, MySQL doesn't know this in advance.
You could also just add the missing index and then both queries will go much faster.