I have a MySQL table called devicelog with it's PK on id
, but multiple indices on device_id
(INT), field_id
(INT), and unixtime
(BIGINT). They are just the default InnoDB indices.
I'm trying to get the ID next to a certain time, I get WAY different performance with different values and different ORDER BYs. IDs and unixtimes both have a positive association, since they both are increasing in order as more data gets inserted, so it seems like it would be okay to safely omit ordering on unixtime. My table has around 25 million records and performance is extremely vital.
This query is fairly slow (~0.5 seconds): Edit: after using USE INDEX(unixtime)
, I was able to increase performance quite a bit (<0.01 seconds!).
SELECT
id
FROM
devicelog
USE INDEX(unixtime) /* edit: looking at the EXPLAIN, I can use this index and it sped things up a bit */
WHERE
device_id = 26
AND field_id = 64
AND unixtime >= 1397166634707 /* a fairly recent time */
/* with no ORDER BY clause, this query is surprisingly slow */
LIMIT 1
EXPLAIN:
1, SIMPLE, devicelog, index_merge, device_id,field_id,field_id_2,unixtime, field_id,device_id, 8,8, , 6667, Using intersect(field_id,device_id); Using where
This query is extremely fast (<0.01 seconds):
SELECT
id
FROM
devicelog
WHERE
device_id = 26
AND field_id = 64
AND unixtime >= 1397166634707 /* a fairly recent time */
ORDER BY unixtime ASC /* <- using unixtime to order */
LIMIT 1
EXPLAIN:
1, SIMPLE, devicelog, range, device_id,field_id,field_id_2,unixtime, unixtime, 9, , 897776, Using index condition; Using where
How would omitting an ORDER BY decrease performance? It seems logical to think that it would increase speed.
Yet, if I change the unixtime to something far back, to "1", it will completely slow down when I use the ORDER BY unixtime. I believe the unixtime index is ordered ascendingly, so this doesn't make much sense either.
This query performs in an opposite manner as the queries above.
Extremely fast (<0.01 seconds):
SELECT
id
FROM
devicelog
WHERE
device_id = 26
AND field_id = 64
AND unixtime >= 1 /* a long time ago */
LIMIT 1
EXPLAIN:
1, SIMPLE, devicelog, index_merge, device_id,field_id,field_id_2,unixtime, field_id,device_id, 8,8, , 6742, Using intersect(field_id,device_id); Using where
This query is the exact same as the fast one, except it's using an older time:
EXTREMELY slow (~7 seconds):
SELECT
id
FROM
devicelog
WHERE
device_id = 26
AND field_id = 64
AND unixtime >= 1 /* a long time ago */
ORDER BY unixtime ASC /* <- using unixtime to order */
LIMIT 1
EXPLAIN:
1, SIMPLE, devicelog, index, device_id,field_id,field_id_2,unixtime, unixtime, 9, , 3504, Using where
Does anyone have any insight on the vast performance differences?
It's hard to make clear suggestions about performance without knowing stuff like the number of rows in your table, and the exact structure of the table.
You might try a compound covering index on
(unixtime, device_id, file_id, id)
. (Look up covering index if you don't know that term).This will allow the
unixtime
part of your query to be satisfied with BTREE lookup, then the rest of your query can be satisfied with an index scan.If you specify
ORDER BY unixtime ASC LIMIT 1
you're telling the query engine to stop scanning that index (which is ordered byunixtime
as soon as it gets a single hit.I don't know why it sometimes keeps going on the scan for seven seconds when you omit the ORDER BY. It's possible it has to hunt for the matching
device_id
andfile_id
values.I think it's documented behavior of LIMIT optimization, see http://dev.mysql.com/doc/refman/5.5/en/limit-optimization.html
Because you're trying to get the ID next to a certain date, I would think ordering the result very vital, because else you can get an arbitrary value. Else you've got to use MIN(id) with your conditions to get the desired id value.