I have a 1 GB mysql table with three colums (german bigrams):
create table sortedindex (source varchar(60),target varchar(60),score float)
engine=myisam character set utf8 collate utf8_bin;
I also created a composite index:
create index sortedstd_ix on sortedindex (source(60), target(60), score);
additionally I compressed the table and made it read only and sorted the index using:
myisamchk --keys-used=0 -rq sortedindex
myisampack sortedindex
myisamchk -rq sortedindex --sort_buffer=3G --sort-index --sort-records=1
now I ask queries with the following structure:
- fix a source
- specify a prefix for the target
- retrieve top k rows by the score
like the following:
select * from sortedindex where source like "ein" and target like "interess%" order by score desc limit 5;
mysql explain tells me that is still uses a filesort!
mysql> explain select * from sortedindex where source like "ein" and target like "interess%" order by score desc limit 5;
+----+-------------+-------------+-------+---------------+--------------+---------+------+------+------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------------+-------+---------------+--------------+---------+------+------+------------------------------------------+
| 1 | SIMPLE | sortedindex | range | sortedstd_ix | sortedstd_ix | 366 | NULL | 17 | Using where; Using index; Using filesort |
+----+-------------+-------------+-------+---------------+--------------+---------+------+------+------------------------------------------+
1 row in set (0.00 sec)`
I understand that if I change the query to:
explain select * from sortedindex where source like "ein" and target like "interess%" order by source, target, score desc limit 5;
there will be no file sort but WRONG there is a filesort involved.
mysql> explain select * from sortedindex where source like "ein" and target like "interess%" order by source, target, score desc limit 5;
+----+-------------+-------------+-------+---------------+--------------+---------+------+------+------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------------+-------+---------------+--------------+---------+------+------+------------------------------------------+
| 1 | SIMPLE | sortedindex | range | sortedstd_ix | sortedstd_ix | 366 | NULL | 17 | Using where; Using index; Using filesort |
+----+-------------+-------------+-------+---------------+--------------+---------+------+------+------------------------------------------+
1 row in set (0.00 sec)
from this discussion i realize the desc keyword is the problem. so we check without:
mysql> explain select * from sortedindex where source like "ein" and target like "interess%" order by source, target, score limit 5;
+----+-------------+-------------+-------+---------------+--------------+---------+------+------+--------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------------+-------+---------------+--------------+---------+------+------+--------------------------+
| 1 | SIMPLE | sortedindex | range | sortedstd_ix | sortedstd_ix | 366 | NULL | 17 | Using where; Using index |
+----+-------------+-------------+-------+---------------+--------------+---------+------+------+--------------------------+
1 row in set (0.00 sec)
perfect that worked.
BUT I want a descending sorting on the score and not the target. Creating the index in this way
create index sortedstd_ix on sortedindex (source(60), score desc, target(60));
is not an option since the target filter will yield for a file sort then or if not the result list of elements that need to be traversed can be really long if the prefix is long and source is a common word.
I somehow have the feeling there is no obvious solution to this?