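Sorry for such a specific and possibly clichéd question, but this is causing me major problems.

Every day I have to run several hundred thousand SELECT statements that look like these two (this is one example, but they are nearly all the same, just with a different word1):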

SELECT pibn,COUNT(*) AS aaa FROM research_storage1
USE INDEX (word21pibn)
WHERE word1=270299 AND word2=0
GROUP BY pibn
ORDER BY aaa DESC
LIMIT 1000;

SELECT pibn,page FROM research_storage1
USE INDEX (word12num)
WHERE word1=270299 AND word2=0
ORDER BY num DESC
LIMIT 1000;

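The first statement is fast and takes only a fraction of a second. The second takes about 2 seconds, which is far too long given how many of them I have to run.

The indexes are: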

word21pibn: word2, word1, pibn
word12num: word1, word2, num

Results of EXPLAIN (both EXTENDED and PARTITIONS):

mysql> explain extended SELECT pibn,COUNT(*) AS aaa FROM research_storage1 USE INDEX (word21pibn) WHERE word1=270299 AND word2=0 GROUP BY pibn ORDER BY aaa DESC LIMIT 1000;
+----+-------------+-------------------+------+---------------+------------+---------+-------------+------+----------+-----------------------------------------------------------+
| id | select_type | table             | type | possible_keys | key        | key_len | ref         | rows | filtered | Extra                                                     |
+----+-------------+-------------------+------+---------------+------------+---------+-------------+------+----------+-----------------------------------------------------------+
|  1 | SIMPLE      | research_storage1 | ref  | word21pibn    | word21pibn | 6       | const,const | 1549 |   100.00 | Using where; Using index; Using temporary; Using filesort |
+----+-------------+-------------------+------+---------------+------------+---------+-------------+------+----------+-----------------------------------------------------------+
1 row in set, 1 warning (0.00 sec)

mysql> explain partitions SELECT pibn,COUNT(*) AS aaa FROM research_storage1 USE INDEX (word21pibn) WHERE word1=270299 AND word2=0 GROUP BY pibn ORDER BY aaa DESC LIMIT 1000;
+----+-------------+-------------------+------------+------+---------------+------------+---------+-------------+------+-----------------------------------------------------------+
| id | select_type | table             | partitions | type | possible_keys | key        | key_len | ref         | rows | Extra                                                     |
+----+-------------+-------------------+------------+------+---------------+------------+---------+-------------+------+-----------------------------------------------------------+
|  1 | SIMPLE      | research_storage1 | p99        | ref  | word21pibn    | word21pibn | 6       | const,const | 1549 | Using where; Using index; Using temporary; Using filesort |
+----+-------------+-------------------+------------+------+---------------+------------+---------+-------------+------+-----------------------------------------------------------+
1 row in set (0.00 sec)

mysql> explain extended SELECT pibn,page FROM research_storage1 USE INDEX (word12num) WHERE word1=270299 AND word2=0 ORDER BY num DESC LIMIT 1000;
+----+-------------+-------------------+------+---------------+-----------+---------+-------------+------+----------+-------------+
| id | select_type | table             | type | possible_keys | key       | key_len | ref         | rows | filtered | Extra       |
+----+-------------+-------------------+------+---------------+-----------+---------+-------------+------+----------+-------------+
|  1 | SIMPLE      | research_storage1 | ref  | word12num     | word12num | 6       | const,const |  818 |   100.00 | Using where |
+----+-------------+-------------------+------+---------------+-----------+---------+-------------+------+----------+-------------+
1 row in set, 1 warning (0.00 sec)

mysql> explain partitions SELECT pibn,page FROM research_storage1 USE INDEX (word12num) WHERE word1=270299 AND word2=0 ORDER BY num DESC LIMIT 1000;
+----+-------------+-------------------+------------+------+---------------+-----------+---------+-------------+------+-------------+
| id | select_type | table             | partitions | type | possible_keys | key       | key_len | ref         | rows | Extra       |
+----+-------------+-------------------+------------+------+---------------+-----------+---------+-------------+------+-------------+
|  1 | SIMPLE      | research_storage1 | p99        | ref  | word12num     | word12num | 6       | const,const |  818 | Using where |
+----+-------------+-------------------+------------+------+---------------+-----------+---------+-------------+------+-------------+
1 row in set (0.00 sec)

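The only difference I can see is that the second statement does not show Using index in the Extra column of the EXPLAIN output. That makes no sense to me, because the index was designed for that statement, so I don't understand why it isn't being used.

Any ideas?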

1 Answer

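Try adding the pibn and page columns to the word12num compound index. Then all the information your query needs will be in the index, just as in your first query.

Edit: I missed the pibn column you're selecting; sorry about that.

If your compound index ends up containing (word1, word2, num, pibn, page), then everything in your second query can come from the index.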
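A minimal sketch of how that index might be added is below. The index name word12num_cover is made up for illustration, and the statement assumes pibn and page are plain numeric or short columns as the question suggests; on a large table the build may take a long time.

```sql
-- Hypothetical covering index (the name word12num_cover is an assumption).
-- Column order: WHERE columns, then the ORDER BY column, then the columns
-- that are only read by SELECT.
ALTER TABLE research_storage1
  ADD INDEX word12num_cover (word1, word2, num, pibn, page);
```

Once the new index proves itself, the original word12num index is a prefix of it and could probably be dropped, but that is worth checking against your other queries first.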

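If you look at the Extra column of the EXPLAIN for your first query, one of the blurbs in there is Using index, as @sebas pointed out. That actually means "using index only": the server can satisfy your query just by consulting the index, without having to consult the table. That's why it is so fast: the server doesn't have to bang the disk heads around, randomly accessing the table to fetch the extra columns. Using index is not present in the EXPLAIN for your second query. (The key column shows that word12num is still used to locate the rows; it just doesn't contain every column the query reads.)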

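The columns mentioned in WHERE come first. Then we have the column in ORDER BY. Finally, we have the columns you are simply reading in SELECT. Why use this particular order for the columns in the index? The server finds its way to the first index entry matching the WHERE conditions, and can then read the index sequentially to satisfy the query.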
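As a sketch of how that ordering lines up with the second query, assume an index with the hypothetical name word12num_cover on (word1, word2, num, pibn, page) already exists. Running EXPLAIN on the query is one way to check whether the server treats it as covering:

```sql
-- word1, word2 : equality predicates in WHERE, so they lead the index
-- num          : ORDER BY column, read from the matching range in reverse
-- pibn, page   : only selected, so they trail the index
-- word12num_cover is an assumed index name, not one from the question.
EXPLAIN
SELECT pibn, page FROM research_storage1
USE INDEX (word12num_cover)
WHERE word1 = 270299 AND word2 = 0
ORDER BY num DESC
LIMIT 1000;
```

If the index covers the query, the Extra column would be expected to include Using index; that is worth confirming on your own server rather than taking on faith.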

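Building and maintaining compound indexes on large tables is indeed expensive. You are looking at a fundamental tradeoff in DBMS design: do you want to spend time building the table, or looking things up in it? Only you know whether it is better to incur the cost when building the table or when looking things up in it.

Answered 2013-05-29T14:56:40.420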