I'm having trouble with a MySQL query that uses the wrong (inefficient) index.

The table:

mysql> describe ADDRESS_BOOK;
+---------------+--------------+------+-----+---------+----------------+
| Field         | Type         | Null | Key | Default | Extra          |
+---------------+--------------+------+-----+---------+----------------+
| ADD_BOOK_ID   | bigint(20)   | NO   | PRI | NULL    | auto_increment |
| COMPANY_ID    | bigint(20)   | NO   | MUL | NULL    |                |
| ADDRESS_NAME  | varchar(150) | NO   | MUL | NULL    |                |
| CLEAN_NAME    | varchar(150) | NO   | MUL | NULL    |                |
| ADDRESS_KEY_1 | varchar(150) | NO   | MUL | NULL    |                |
| ADDRESS_KEY_2 | varchar(150) | NO   | MUL | NULL    |                |
+---------------+--------------+------+-----+---------+----------------+

CLEAN_NAME is a "cleaned" version of the plain ADDRESS_NAME, with everything except [a-zA-Z] removed. ADDRESS_KEY_1 and ADDRESS_KEY_2 are the two longest words in ADDRESS_NAME, likewise with everything except [a-zA-Z] removed.
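
To make the cleaning rule concrete, here is a minimal sketch of how such a value could be derived in SQL. It assumes MySQL 8.0 or later (where REGEXP_REPLACE is available); the sample string is made up, and the original post does not say how the columns are actually populated.

```sql
-- Hypothetical sample value; strips every character that is not a-z or A-Z.
-- Assumes MySQL 8.0+ (REGEXP_REPLACE does not exist in older versions).
SELECT REGEXP_REPLACE('Trade & Co. 24', '[^a-zA-Z]', '') AS CLEAN_NAME;
-- yields 'TradeCo'
```

Picking the two longest words for ADDRESS_KEY_1 and ADDRESS_KEY_2 is awkward in plain SQL and is more likely done in application code.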

These are my indexes (I have been playing around with them, trying to find the best combination):

mysql> SHOW INDEX FROM ADDRESS_BOOK;
+--------------+------------+-------------------+--------------+---------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table        | Non_unique | Key_name          | Seq_in_index | Column_name   | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+--------------+------------+-------------------+--------------+---------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| ADDRESS_BOOK |          0 | PRIMARY           |            1 | ADD_BOOK_ID   | A         |       37847 |     NULL | NULL   |      | BTREE      |         |               |
| ADDRESS_BOOK |          1 | FK_ADDRESS_BOOK_2 |            1 | COMPANY_ID    | A         |          36 |     NULL | NULL   |      | BTREE      |         |               |
| ADDRESS_BOOK |          1 | IDX_ADDRESS_NAME  |            1 | ADDRESS_NAME  | A         |       37847 |     NULL | NULL   |      | BTREE      |         |               |
| ADDRESS_BOOK |          1 | FX_ADDRESS_KEYS   |            1 | CLEAN_NAME    | A         |       37847 |     NULL | NULL   |      | BTREE      |         |               |
| ADDRESS_BOOK |          1 | FX_ADDRESS_KEYS   |            2 | ADDRESS_KEY_1 | A         |       37847 |     NULL | NULL   |      | BTREE      |         |               |
| ADDRESS_BOOK |          1 | FX_ADDRESS_KEYS   |            3 | ADDRESS_KEY_2 | A         |       37847 |     NULL | NULL   |      | BTREE      |         |               |
| ADDRESS_BOOK |          1 | FX_ADDRESS_KEYS   |            4 | COMPANY_ID    | A         |       37847 |     NULL | NULL   |      | BTREE      |         |               |
| ADDRESS_BOOK |          1 | FK_ADDRESS_2      |            1 | ADDRESS_KEY_2 | A         |       18923 |     NULL | NULL   |      | BTREE      |         |               |
| ADDRESS_BOOK |          1 | FK_CLEAN          |            1 | CLEAN_NAME    | A         |       37847 |     NULL | NULL   |      | BTREE      |         |               |
| ADDRESS_BOOK |          1 | FK_ADDRESS_1      |            1 | ADDRESS_KEY_1 | A         |       37847 |     NULL | NULL   |      | BTREE      |         |               |
+--------------+------------+-------------------+--------------+---------------+-----------+-------------+----------+--------+------+------------+---------+---------------+

My query is currently:

select * from ADDRESS_BOOK addressboo0_ 
where (addressboo0_.CLEAN_NAME like concat('trad', '%') 
or addressboo0_.ADDRESS_KEY_1 like concat('trad', '%') 
or addressboo0_.ADDRESS_KEY_2 like concat('trad', '%')) 
and addressboo0_.COMPANY_ID=1 
order by addressboo0_.CLEAN_NAME asc 
limit 200

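The system has users from different companies, so the query should only return address book entries belonging to the user's own company.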

The EXPLAIN output for this is:

+----+-------------+--------------+------+----------------------------------------------------------------------+-------------------+---------+-------+------+-----------------------------+
| id | select_type | table        | type | possible_keys                                                        | key               | key_len | ref   | rows | Extra                       |
+----+-------------+--------------+------+----------------------------------------------------------------------+-------------------+---------+-------+------+-----------------------------+
|  1 | SIMPLE      | addressboo0_ | ref  | FK_ADDRESS_BOOK_2,FX_ADDRESS_KEYS,FK_ADDRESS_2,FK_CLEAN,FK_ADDRESS_1 | FK_ADDRESS_BOOK_2 | 8       | const | 4108 | Using where; Using filesort |
+----+-------------+--------------+------+----------------------------------------------------------------------+-------------------+---------+-------+------+-----------------------------+

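I know MySQL cannot use a multi-column index for conditions joined by OR, but as you can see it is using the index on COMPANY_ID (FK_ADDRESS_BOOK_2) rather than any of the indexes on the string columns!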

If I take the company condition out of the query, it uses the other indexes:

+----+-------------+--------------+-------------+----------------------------------------------------+------------------------------------+-------------+------+------+-----------------------------------------------------------------------------------+
| id | select_type | table        | type        | possible_keys                                      | key                                | key_len     | ref  | rows | Extra                                                                             |
+----+-------------+--------------+-------------+----------------------------------------------------+------------------------------------+-------------+------+------+-----------------------------------------------------------------------------------+
|  1 | SIMPLE      | addressboo0_ | index_merge | FX_ADDRESS_KEYS,FK_ADDRESS_2,FK_CLEAN,FK_ADDRESS_1 | FK_CLEAN,FK_ADDRESS_1,FK_ADDRESS_2 | 452,452,452 | NULL | 1089 | Using sort_union(FK_CLEAN,FK_ADDRESS_1,FK_ADDRESS_2); Using where; Using filesort |
+----+-------------+--------------+-------------+----------------------------------------------------+------------------------------------+-------------+------+------+-----------------------------------------------------------------------------------+

If I run the same query (including the company condition) for a different company, it suddenly uses the multi-column index:

+----+-------------+--------------+-------+----------------------------------------------------------------------+-----------------+---------+------+------+-------------+
| id | select_type | table        | type  | possible_keys                                                        | key             | key_len | ref  | rows | Extra       |
+----+-------------+--------------+-------+----------------------------------------------------------------------+-----------------+---------+------+------+-------------+
|  1 | SIMPLE      | addressboo0_ | index | FK_ADDRESS_BOOK_2,FX_ADDRESS_KEYS,FK_ADDRESS_2,FK_CLEAN,FK_ADDRESS_1 | FX_ADDRESS_KEYS | 1364    | NULL |  492 | Using where |
+----+-------------+--------------+-------+----------------------------------------------------------------------+-----------------+---------+------+------+-------------+

For company 1 the query returns 266 results, and for company 16 it returns 437. Company 1 has 4109 entries in total, while company 16 has 7745.

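So I'm rather confused. Why does MySQL use the multi-column index FX_ADDRESS_KEYS for one company, but the rather inefficient FK_ADDRESS_BOOK_2 for the other (basically walking through every row of that company)?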

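How can I improve the query or the indexes? If I remove the OR conditions on ADDRESS_KEY_1 and ADDRESS_KEY_2, it uses the FX_ADDRESS_KEYS index, but then I can't search for strings inside the name. And if I use something like '%trade%', no index can be used at all.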

1 Answer

If you want a nice EXPLAIN plan for this query, try this:

CREATE INDEX FX_ADDRESS_KEYS_XX  ON ADDRESS_BOOK( 
         COMPANY_ID, 
         CLEAN_NAME, 
         ADDRESS_KEY_1, 
         ADDRESS_KEY_2 );
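
Whether the optimizer actually picks the new index depends on its statistics, so it is worth checking. A minimal sketch, assuming the index above has been created: re-run EXPLAIN on the original query, and if another index is still chosen, compare against a variant with an index hint. The alias `ab` and the literal 'trad%' (equivalent to concat('trad', '%')) are simplifications introduced here.

```sql
-- Refresh index statistics so the optimizer sees the new index.
ANALYZE TABLE ADDRESS_BOOK;

-- Plan chosen by the optimizer on its own.
EXPLAIN
SELECT *
FROM ADDRESS_BOOK ab
WHERE (ab.CLEAN_NAME LIKE 'trad%'
    OR ab.ADDRESS_KEY_1 LIKE 'trad%'
    OR ab.ADDRESS_KEY_2 LIKE 'trad%')
  AND ab.COMPANY_ID = 1
ORDER BY ab.CLEAN_NAME ASC
LIMIT 200;

-- Same query with the new index forced, for comparison only.
EXPLAIN
SELECT *
FROM ADDRESS_BOOK ab FORCE INDEX (FX_ADDRESS_KEYS_XX)
WHERE (ab.CLEAN_NAME LIKE 'trad%'
    OR ab.ADDRESS_KEY_1 LIKE 'trad%'
    OR ab.ADDRESS_KEY_2 LIKE 'trad%')
  AND ab.COMPANY_ID = 1
ORDER BY ab.CLEAN_NAME ASC
LIMIT 200;
```

Because COMPANY_ID is the leading column, the index can be narrowed to a single company, and within that company the entries are already ordered by CLEAN_NAME, which may let MySQL skip the filesort.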

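This index should improve the query, but it comes at a cost.
It contains a copy of almost the entire table (everything except two columns: ADD_BOOK_ID bigint(20) and ADDRESS_NAME varchar(150)), so it will take up a lot of disk space.
It will also certainly slow down inserts and updates, because the index data has to be updated as well.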
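
The disk-space cost can be measured rather than guessed. A rough sketch: the first query works for any storage engine and reports data size and combined index size; the second assumes InnoDB with persistent statistics enabled (MySQL 5.6 or later) and breaks the size down per index.

```sql
-- Total data size vs. total size of all indexes on the table, in bytes.
SELECT DATA_LENGTH, INDEX_LENGTH
FROM information_schema.TABLES
WHERE TABLE_SCHEMA = DATABASE()
  AND TABLE_NAME = 'ADDRESS_BOOK';

-- Per-index size (assumption: InnoDB with persistent statistics).
-- stat_value for stat_name = 'size' is a page count, hence the multiplication.
SELECT index_name,
       stat_value * @@innodb_page_size AS size_bytes
FROM mysql.innodb_index_stats
WHERE database_name = DATABASE()
  AND table_name = 'ADDRESS_BOOK'
  AND stat_name = 'size';
```

Running these before and after creating FX_ADDRESS_KEYS_XX shows how much space the new index adds.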

Answered 2013-09-25T22:03:12.070