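I have some performance problems with a large table in MySQL: the table has 38 million rows and is 3 GB in size. I want to select rows by filtering on 2 columns. I have tried several indexes (one index per column, and one 2-column index), but my query is still slow. As shown below, fetching 1644 rows takes more than 4 seconds: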

SELECT `twstats_twwordstrend`.`id`, `twstats_twwordstrend`.`created`, `twstats_twwordstrend`.`freq`, `twstats_twwordstrend`.`word_id` FROM `twstats_twwordstrend` WHERE (`twstats_twwordstrend`.`word_id` = 1001 AND `twstats_twwordstrend`.`created` > '2011-11-07 14:01:34' );
...
...
...
1644 rows in set (4.66 sec)

EXPLAIN SELECT `twstats_twwordstrend`.`id`, `twstats_twwordstrend`.`created`, `twstats_twwordstrend`.`freq`, `twstats_twwordstrend`.`word_id` FROM `twstats_twwordstrend` WHERE (`twstats_twwordstrend`.`word_id` = 1001 AND `twstats_twwordstrend`.`created` > '2011-11-07 14:01:34' );
+----+-------------+----------------------+-------+-----------------------------------------------------+-----------------------+---------+------+------+-------------+
| id | select_type | table                | type  | possible_keys                                       | key                   | key_len | ref  | rows | Extra       |
+----+-------------+----------------------+-------+-----------------------------------------------------+-----------------------+---------+------+------+-------------+
|  1 | SIMPLE      | twstats_twwordstrend | range | twstats_twwordstrend_4b95d890,word_id_created_index | word_id_created_index | 12      | NULL | 1643 | Using where |
+----+-------------+----------------------+-------+-----------------------------------------------------+-----------------------+---------+------+------+-------------+
1 row in set (0.00 sec)

mysql> describe twstats_twwordstrend;
+---------+----------+------+-----+---------+----------------+
| Field   | Type     | Null | Key | Default | Extra          |
+---------+----------+------+-----+---------+----------------+
| id      | int(11)  | NO   | PRI | NULL    | auto_increment |
| created | datetime | NO   |     | NULL    |                |
| freq    | double   | NO   |     | NULL    |                |
| word_id | int(11)  | NO   | MUL | NULL    |                |
+---------+----------+------+-----+---------+----------------+
4 rows in set (0.00 sec)

mysql> show index from twstats_twwordstrend;
+----------------------+------------+-------------------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table                | Non_unique | Key_name                      | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+----------------------+------------+-------------------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| twstats_twwordstrend |          0 | PRIMARY                       |            1 | id          | A         |    38676897 |     NULL | NULL   |      | BTREE      |         |               |
| twstats_twwordstrend |          1 | twstats_twwordstrend_4b95d890 |            1 | word_id     | A         |      655540 |     NULL | NULL   |      | BTREE      |         |               |
| twstats_twwordstrend |          1 | word_id_created_index         |            1 | word_id     | A         |      257845 |     NULL | NULL   |      | BTREE      |         |               |
| twstats_twwordstrend |          1 | word_id_created_index         |            2 | created     | A         |    38676897 |     NULL | NULL   |      | BTREE      |         |               |
+----------------------+------------+-------------------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
4 rows in set (0.03 sec)

I also found that fetching just one row from the middle of the table is slow:

mysql> SELECT `twstats_twwordstrend`.`id`, `twstats_twwordstrend`.`created`, `twstats_twwordstrend`.`freq`, `twstats_twwordstrend`.`word_id` FROM `twstats_twwordstrend` limit 10000000,1;
+----------+---------------------+--------------------+---------+
| id       | created             | freq               | word_id |
+----------+---------------------+--------------------+---------+
| 10000001 | 2011-09-09 15:59:18 | 0.0013398539559188 |   41295 |
+----------+---------------------+--------------------+---------+
1 row in set (1.73 sec)

...while it is not slow at the beginning of the table:

mysql> SELECT `twstats_twwordstrend`.`id`, `twstats_twwordstrend`.`created`, `twstats_twwordstrend`.`freq`, `twstats_twwordstrend`.`word_id` FROM `twstats_twwordstrend` limit 1,1;
+----+---------------------+---------------------+---------+
| id | created             | freq                | word_id |
+----+---------------------+---------------------+---------+
|  2 | 2011-06-16 10:59:06 | 0.00237777777777778 |       2 |
+----+---------------------+---------------------+---------+
1 row in set (0.00 sec)

The table uses the InnoDB engine. How can I speed up queries on large tables?

2 Answers

The main thing you can do is add indexes.

Whenever you use a column in a WHERE clause, make sure it has an index. There isn't one on your created column.

The multi-column index that includes created is not effectively an index on created, because created is not the first column in that index.

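When using multi-column indexes, you should almost always put the column with the higher cardinality first. So, having the indexes as (created, word_id) and (word_id) would give you a significant boost.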
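
As a rough sketch of what this suggestion would look like in SQL: the index name `created_word_id_index` is made up for illustration, and the single-column index on word_id already appears in the question's SHOW INDEX output (as twstats_twwordstrend_4b95d890), so only the composite index is created here. Building an index on a table of this size can take a while, and whether the optimizer actually prefers it is something to confirm with EXPLAIN rather than assume.

```sql
-- Hypothetical index name; column order (created, word_id) as suggested above.
ALTER TABLE twstats_twwordstrend
  ADD INDEX created_word_id_index (created, word_id);

-- Check which index the optimizer picks for the original query.
EXPLAIN SELECT id, created, freq, word_id
FROM twstats_twwordstrend
WHERE word_id = 1001
  AND created > '2011-11-07 14:01:34';
```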

Answered 2012-05-07T15:37:50.957

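A query with LIMIT 10000000,1 is always going to be slow, because it has to fetch more than 10 million rows and then discard all but the last one. If your application needs this kind of query on a regular basis, consider a redesign.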

Tables do not have a "beginning" and an "end"; they are not inherently ordered.
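
To make these two points concrete, here is a minimal sketch. It assumes that ordering by the primary key is what the application wants and that the application can remember the last id it has seen; the value 10000000 is only an example. The two queries return the same row only if the ids have no gaps.

```sql
-- Explicit ordering: without ORDER BY, which row lands at a given
-- offset is not guaranteed. This still reads and discards 10,000,000 rows.
SELECT id, created, freq, word_id
FROM twstats_twwordstrend
ORDER BY id
LIMIT 10000000, 1;

-- Seek alternative: jump straight to the position through the primary key.
SELECT id, created, freq, word_id
FROM twstats_twwordstrend
WHERE id > 10000000   -- assumption: last id already seen by the application
ORDER BY id
LIMIT 1;
```

The second form lets InnoDB look up the starting point in the primary key index instead of walking past every earlier row.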

It looks to me like you need an index on (word_id, created).

You should definitely performance-test this on a non-production server with production-grade hardware.

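Incidentally, a 3 GB database is not very big these days; it will fit in RAM on all but the smallest servers. (You are running a 64-bit OS, right, and have tuned innodb_buffer_pool appropriately? Or your sysadmin has?)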
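
One way to check this, sketched under the assumption that you are connected to the database that holds the table: compare the configured buffer pool size with the approximate size of the table's data and indexes. The sizes reported by information_schema are estimates for InnoDB.

```sql
-- Current InnoDB buffer pool size, in bytes.
SHOW VARIABLES LIKE 'innodb_buffer_pool_size';

-- Approximate size of the table's data plus indexes, in MB.
SELECT table_name,
       ROUND((data_length + index_length) / 1024 / 1024) AS total_mb
FROM information_schema.tables
WHERE table_schema = DATABASE()
  AND table_name = 'twstats_twwordstrend';
```

If the pool turns out to be much smaller than the table, the innodb_buffer_pool_size setting in the [mysqld] section of the server's configuration file is the one to raise; the right value depends on how much RAM the machine has and what else runs on it.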

Answered 2012-05-07T15:55:48.520