I have the following self-join query:
SELECT A.id
FROM mytbl AS A
LEFT JOIN mytbl AS B
ON (A.lft BETWEEN B.lft AND B.rgt)
The query is quite slow, and after looking at the execution plan the cause appears to be a full table scan in the JOIN. The table has only 500 rows, and suspecting this to be the issue I increased it to 100,000 rows in order to see if it made a difference to the optimizer's selection. It did not, with 100k rows it was still doing a full table scan.
My next step was to try and force indexes with the following query, but the same situation arises, a full table scan:
SELECT A.id
FROM categories_nested_set AS A
LEFT JOIN categories_nested_set AS B
FORCE INDEX (idx_lft, idx_rgt)
ON (A.lft BETWEEN B.lft AND B.rgt)
All columns (id, lft, rgt) are integers, all are indexed.
Why is MySql doing a full table scan here?
How can I change my query to use indexes instead of a full table scan?
CREATE TABLE mytbl ( lft int(11) NOT NULL DEFAULT '0',
rgt int(11) DEFAULT NULL,
id int(11) DEFAULT NULL,
category varchar(128) DEFAULT NULL,
PRIMARY KEY (lft),
UNIQUE KEY id (id),
UNIQUE KEY rgt (rgt),
KEY idx_lft (lft),
KEY idx_rgt (rgt) ) ENGINE=InnoDB DEFAULT CHARSET=utf8
Thanks
You have lot's of indexes, some of them are redundant. Let's begin by clearing up some of them. Too many indexes slows down inserts and updates.
Since you already have a primary key defined on lft, there is no need what so ever for another index on lft. Similarly with a unique index on rgt there is not need for the second index listed below.
Now let's look at your query.
This is very unlikely to be a query that will be encountered in the wild. With 500 rows, this query may produce even 5000 rows? Do you really need the entire key created in one go? The reason that this query is slow is because mysql can only optimize range comparisions for constants. It's more likely that your actually query will look something like this:
Where you create the node for a particular id. This will use indexes and will be really fast. What's the point of optimizing for a query that you will not use much if at all?
The following SO question is critical to the solution, as there is very little information on the combination of adjacency lists and indices:
MySQL & nested set: slow JOIN (not using index)
It appears that adding a basic comparison condition triggers the use of an index, like so:
And no more table scans.
EDIT: Comparison of EXPLAIN function between fixed and unfixed version of the query: