I have a 1 GB MySQL table with three columns (German bigrams):
create table sortedindex (source varchar(60),target varchar(60),score float)
engine=myisam character set utf8 collate utf8_bin;
I also created a composite index:
create index sortedstd_ix on sortedindex (source(60), target(60), score);
In addition, I compressed the table, made it read-only, and sorted the index and records using:
myisamchk --keys-used=0 -rq sortedindex
myisampack sortedindex
myisamchk -rq sortedindex --sort_buffer=3G --sort-index --sort-records=1
Now I run queries with the following structure:
- fix the source
- specify a prefix for the target
- retrieve the top k rows by score
For example:
select * from sortedindex where source like "ein" and target like "interess%" order by score desc limit 5;
MySQL's EXPLAIN tells me that a filesort is still used!
mysql> explain select * from sortedindex where source like "ein" and target like "interess%" order by score desc limit 5;
+----+-------------+-------------+-------+---------------+--------------+---------+------+------+------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------------+-------+---------------+--------------+---------+------+------+------------------------------------------+
| 1 | SIMPLE | sortedindex | range | sortedstd_ix | sortedstd_ix | 366 | NULL | 17 | Using where; Using index; Using filesort |
+----+-------------+-------------+-------+---------------+--------------+---------+------+------+------------------------------------------+
1 row in set (0.00 sec)
My understanding is that if I change the query to:
explain select * from sortedindex where source like "ein" and target like "interess%" order by source, target, score desc limit 5;
there should be no filesort, yet a filesort is still (wrongly) involved:
mysql> explain select * from sortedindex where source like "ein" and target like "interess%" order by source, target, score desc limit 5;
+----+-------------+-------------+-------+---------------+--------------+---------+------+------+------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------------+-------+---------------+--------------+---------+------+------+------------------------------------------+
| 1 | SIMPLE | sortedindex | range | sortedstd_ix | sortedstd_ix | 366 | NULL | 17 | Using where; Using index; Using filesort |
+----+-------------+-------------+-------+---------------+--------------+---------+------+------+------------------------------------------+
1 row in set (0.00 sec)
From this discussion I realized that the desc keyword is the problem. So let's check without it:
mysql> explain select * from sortedindex where source like "ein" and target like "interess%" order by source, target, score limit 5;
+----+-------------+-------------+-------+---------------+--------------+---------+------+------+--------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------------+-------+---------------+--------------+---------+------+------+--------------------------+
| 1 | SIMPLE | sortedindex | range | sortedstd_ix | sortedstd_ix | 366 | NULL | 17 | Using where; Using index |
+----+-------------+-------------+-------+---------------+--------------+---------+------+------+--------------------------+
1 row in set (0.00 sec)
This works perfectly: no filesort.
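If the mixed sort directions (source and target ascending, score descending) are what forces the filesort, then a variant in which all three columns are sorted in the same direction should be worth checking as well. The statement below is only a sketch and I have not run it on this table. It rests on my assumption that MySQL can read sortedstd_ix backwards when every ORDER BY column is descending. It also does not return the ordering I actually need; it only helps to isolate the cause.

```sql
-- Untested variant: every ORDER BY column uses the same direction (all DESC),
-- so the requested order is exactly the reverse of the index order.
-- Assumption: the optimizer can satisfy this with a backward scan of sortedstd_ix.
explain select * from sortedindex
where source like "ein" and target like "interess%"
order by source desc, target desc, score desc
limit 5;
```

If Extra shows no "Using filesort" here, that would confirm that the problem is the mix of ascending and descending columns rather than desc on its own.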
But I want the descending order on score, not on target. Creating the index this way
create index sortedstd_ix on sortedindex (source(60), score desc, target(60));
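is not an option: the filter on target would then either cause a filesort or, if it does not, the list of entries that has to be traversed could become very long when the prefix is long and the source is a frequent word.
Somehow I have the feeling that there is no obvious solution?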