How to avoid a filesort on a mysql composite index

2019-08-04 07:02发布

问题:

I have a 1 GB mysql table with three colums (german bigrams):

create table sortedindex (source varchar(60),target varchar(60),score float)
engine=myisam character set utf8 collate utf8_bin;

I also created a composite index:

create index sortedstd_ix on sortedindex (source(60), target(60), score);

additionally I compressed the table and made it read only and sorted the index using:

myisamchk --keys-used=0 -rq sortedindex
myisampack sortedindex
myisamchk -rq sortedindex --sort_buffer=3G --sort-index --sort-records=1

now I ask queries with the following structure:

  • fix a source
  • specify a prefix for the target
  • retrieve top k rows by the score

like the following:

select * from sortedindex where source like "ein" and target like "interess%" order by score desc limit 5;

mysql explain tells me that is still uses a filesort!

mysql> explain select * from sortedindex where source like "ein" and target like "interess%" order by score desc limit 5;
+----+-------------+-------------+-------+---------------+--------------+---------+------+------+------------------------------------------+
| id | select_type | table       | type  | possible_keys | key          | key_len | ref  | rows | Extra                                    |
+----+-------------+-------------+-------+---------------+--------------+---------+------+------+------------------------------------------+
|  1 | SIMPLE      | sortedindex | range | sortedstd_ix  | sortedstd_ix | 366     | NULL |   17 | Using where; Using index; Using filesort |
+----+-------------+-------------+-------+---------------+--------------+---------+------+------+------------------------------------------+
1 row in set (0.00 sec)`

I understand that if I change the query to:

explain select * from sortedindex where source like "ein" and target like "interess%" order by source, target, score desc limit 5;

there will be no file sort but WRONG there is a filesort involved.

mysql> explain select * from sortedindex where source like "ein" and target like "interess%" order by source, target, score desc limit 5;
+----+-------------+-------------+-------+---------------+--------------+---------+------+------+------------------------------------------+
| id | select_type | table       | type  | possible_keys | key          | key_len | ref  | rows | Extra                                    |
+----+-------------+-------------+-------+---------------+--------------+---------+------+------+------------------------------------------+
|  1 | SIMPLE      | sortedindex | range | sortedstd_ix  | sortedstd_ix | 366     | NULL |   17 | Using where; Using index; Using filesort |
+----+-------------+-------------+-------+---------------+--------------+---------+------+------+------------------------------------------+
1 row in set (0.00 sec)

from this discussion i realize the desc keyword is the problem. so we check without:

mysql> explain select * from sortedindex where source like "ein" and target like "interess%" order by source, target, score limit 5;
+----+-------------+-------------+-------+---------------+--------------+---------+------+------+--------------------------+
| id | select_type | table       | type  | possible_keys | key          | key_len | ref  | rows | Extra                    |
+----+-------------+-------------+-------+---------------+--------------+---------+------+------+--------------------------+
|  1 | SIMPLE      | sortedindex | range | sortedstd_ix  | sortedstd_ix | 366     | NULL |   17 | Using where; Using index |
+----+-------------+-------------+-------+---------------+--------------+---------+------+------+--------------------------+
1 row in set (0.00 sec)

perfect that worked.

BUT I want a descending sorting on the score and not the target. Creating the index in this way

create index sortedstd_ix on sortedindex (source(60), score desc, target(60));

is not an option since the target filter will yield for a file sort then or if not the result list of elements that need to be traversed can be really long if the prefix is long and source is a common word.

I somehow have the feeling there is no obvious solution to this?

回答1:

You are right. There is no obvious solution to this. Sorting is needed because you are asking for multiple values of target (like "interess%"). Hence, the index will not give you rows that are sorted on score.



回答2:

Try this ::

select * from sortedindex 
where source like "ein" and target like "interess%" 
order by score desc, source, target limit 5;