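I'm using an Amazon RDS database with some very large tables. Yesterday the server started hitting 100% CPU utilization, and the slow query log filled up with queries that had never been slow before.

I checked the running queries, and the EXPLAIN command gave me this result: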

+----+-------------+-------------------------------+--------+----------------------------------------------------------------------------------------------+---------------------------------------+---------+-----------------------------------------------------------------+------+----------------------------------------------+
| id | select_type | table                         | type   | possible_keys                                                                                | key                                   | key_len | ref                                                             | rows | Extra                                        |
+----+-------------+-------------------------------+--------+----------------------------------------------------------------------------------------------+---------------------------------------+---------+-----------------------------------------------------------------+------+----------------------------------------------+
|  1 | SIMPLE      | businesses                    | const  | PRIMARY                                                                                      | PRIMARY                               | 4       | const                                                           |    1 | Using index; Using temporary; Using filesort |
|  1 | SIMPLE      | activities_businesses         | ref    | PRIMARY,index_activities_users_on_business_id,index_tweets_users_on_tweet_id_and_business_id | index_activities_users_on_business_id | 9       | const                                                           | 2252 | Using index condition; Using where           |
|  1 | SIMPLE      | activities_b_taggings_975e9c4 | ref    | taggings_idx                                                                                 | taggings_idx                          | 782     | const,myapp_production.activities_businesses.id,const           |    1 | Using index condition; Using where           |
|  1 | SIMPLE      | activities                    | eq_ref | PRIMARY,index_activities_on_created_at                                                       | PRIMARY                               | 8       | myapp_production.activities_businesses.activity_id              |    1 | Using where                                  |
+----+-------------+-------------------------------+--------+----------------------------------------------------------------------------------------------+---------------------------------------+---------+-----------------------------------------------------------------+------+----------------------------------------------+

Checking the process list at the same time, I got this:

+----+-----------------+-------------------------------------+----------------------------+---------+------+--------------+------------------------------------------------------------------------------------------------------+
| Id | User            | Host                                | db                         | Command | Time | State        | Info                                                                                                 |
+----+-----------------+-------------------------------------+----------------------------+---------+------+--------------+------------------------------------------------------------------------------------------------------+
|  1 | my_app          | my_ip:57152                         | my_app_production          | Sleep   |    0 |              | NULL                                                                                                 |
|  2 | my_app          | my_ip:57153                         | my_app_production          | Sleep   |    2 |              | NULL                                                                                                 |
|  3 | rdsadmin        | localhost:49441                     | NULL                       | Sleep   |    9 |              | NULL                                                                                                 |
|  6 | my_app          | my_other_ip:47802                   | my_app_production          | Sleep   |  242 |              | NULL                                                                                                 |
|  7 | my_app          | my_other_ip:47807                   | my_app_production          | Query   |  231 | Sending data | SELECT my_fields...                                                                                  |
|  8 | my_app          | my_other_ip:47809                   | my_app_production          | Query   |  231 | Sending data | SELECT my_fields...                                                                                  |
|  9 | my_app          | my_other_ip:47810                   | my_app_production          | Query   |  231 | Sending data | SELECT my_fields...                                                                                  |
| 10 | my_app          | my_other_ip:47811                   | my_app_production          | Query   |  231 | Sending data | SELECT my_fields...                                                                                  |
| 11 | my_app          | my_other_ip:47813                   | my_app_production          | Query   |  231 | Sending data | SELECT my_fields...                                                                                  |
...

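So, judging by these numbers, there seems to be no reason for the query to be slow: the most expensive step in the execution plan goes through about 2k rows, which is not much.

Edit 1

Another piece of information that may be useful is the slow query log: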

SET timestamp=1401457485;
SELECT my_query...
# User@Host: myapp[myapp] @ ip-10-195-55-233.ec2.internal [IP] Id: 435
# Query_time: 95.830497 Lock_time: 0.000178 Rows_sent: 0 Rows_examined: 1129387

Edit 2

After profiling the query, I got the result below. The query itself returns about 250 rows with two columns each.

+----------------------+----------+
| state                | duration |
+----------------------+----------+
| Sending data         |      272 |
| removing tmp table   |        0 |
| optimizing           |        0 |
| Creating sort index  |        0 |
| init                 |        0 |
| cleaning up          |        0 |
| executing            |        0 |
| checking permissions |        0 |
| freeing items        |        0 |
| Creating tmp table   |        0 |
| query end            |        0 |
| statistics           |        0 |
| end                  |        0 |
| System lock          |        0 |
| Opening tables       |        0 |
| logging slow query   |        0 |
| Sorting result       |        0 |
| starting             |        0 |
| closing tables       |        0 |
| preparing            |        0 |
+----------------------+----------+
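
For reference, a per-state summary like the one above can be produced with MySQL's session profiler. The following is only a sketch of one way to obtain it, not necessarily how the table above was generated. It assumes the profiler is available on the server (it is deprecated in newer MySQL versions), and the query id `1` is a placeholder to be replaced with the id reported by `SHOW PROFILES`.

```sql
-- Enable the profiler for the current session only.
SET profiling = 1;

-- Run the slow statement here, then list the profiled statements
-- to find the Query_ID assigned to it.
SHOW PROFILES;

-- Sum the time spent in each execution state.
-- Assumption: 1 is a placeholder for the Query_ID found above.
SELECT state,
       ROUND(SUM(duration)) AS duration
FROM   information_schema.PROFILING
WHERE  query_id = 1
GROUP  BY state
ORDER  BY duration DESC;
```

A single dominant "Sending data" state, as in the table above, usually means the time goes into reading and filtering rows during the join rather than into sorting or building the temporary table.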

Edit 3

Adding the query, as requested:

SELECT activities.share_count, 
       activities.created_at 
FROM   `activities_businesses` 
       INNER JOIN `businesses` 
               ON `businesses`.`id` = `activities_businesses`.`business_id` 
       INNER JOIN `activities` 
               ON `activities`.`id` = `activities_businesses`.`activity_id` 
       JOIN taggings activities_b_taggings_975e9c4 
         ON activities_b_taggings_975e9c4.taggable_id = activities_businesses.id 
            AND activities_b_taggings_975e9c4.taggable_type = 
                'ActivitiesBusiness' 
            AND activities_b_taggings_975e9c4.tag_id = 104 
            AND activities_b_taggings_975e9c4.created_at >= 
                '2014-04-30 13:36:44' 
WHERE  ( businesses.id = 1 ) 
       AND ( activities.created_at > '2014-04-30 13:36:44' ) 
       AND ( activities.created_at < '2014-05-30 12:27:03' ) 
ORDER  BY activities.created_at; 

Edit 4

The index may not be applied because the joined columns have different types: taggings.taggable_id is bigint(20), while activities_businesses.id is int(11).

mysql> SHOW COLUMNS FROM activities_businesses;
+-------------+------------+------+-----+---------+----------------+
| Field       | Type       | Null | Key | Default | Extra          |
+-------------+------------+------+-----+---------+----------------+
| id          | int(11)    | NO   | PRI | NULL    | auto_increment |
| activity_id | bigint(20) | YES  | MUL | NULL    |                |
| business_id | bigint(20) | YES  | MUL | NULL    |                |
+-------------+------------+------+-----+---------+----------------+
3 rows in set (0.01 sec)

mysql> SHOW COLUMNS FROM taggings;
+---------------+--------------+------+-----+---------+----------------+
| Field         | Type         | Null | Key | Default | Extra          |
+---------------+--------------+------+-----+---------+----------------+
| id            | int(11)      | NO   | PRI | NULL    | auto_increment |
| tag_id        | int(11)      | YES  | MUL | NULL    |                |
| taggable_id   | bigint(20)   | YES  |     | NULL    |                |
| taggable_type | varchar(255) | YES  |     | NULL    |                |
| tagger_id     | int(11)      | YES  |     | NULL    |                |
| tagger_type   | varchar(255) | YES  |     | NULL    |                |
| context       | varchar(128) | YES  |     | NULL    |                |
| created_at    | datetime     | YES  |     | NULL    |                |
+---------------+--------------+------+-----+---------+----------------+

So the query examines far more rows than the EXPLAIN output suggests, probably because some index is not being applied.

Can you help me?
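
A compact way to compare the two sides of the join is to read the column types from `information_schema.COLUMNS` and to list the columns behind `taggings_idx`. This is a sketch that assumes the session's default database is the application schema.

```sql
-- Show the declared types of the two columns used in the join condition.
SELECT table_name,
       column_name,
       column_type
FROM   information_schema.COLUMNS
WHERE  table_schema = DATABASE()
       AND ( ( table_name = 'activities_businesses' AND column_name = 'id' )
             OR ( table_name = 'taggings' AND column_name = 'taggable_id' ) );

-- Show which columns make up taggings_idx and in what order.
SHOW INDEX FROM taggings;
```

If the two `column_type` values differ, the join compares values of different types, which is the situation suspected here.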

2 Answers

As I suspected in Edit 4, MySQL gave misleading information through the DESCRIBE (EXPLAIN) command.

The execution plan the command produced looked like this:

+----+-------------+-------------------------------+--------+----------------------------------------------------------------------------------------------+---------------------------------------+---------+-----------------------------------------------------------------+------+----------------------------------------------+
| id | select_type | table                         | type   | possible_keys                                                                                | key                                   | key_len | ref                                                             | rows | Extra                                        |
+----+-------------+-------------------------------+--------+----------------------------------------------------------------------------------------------+---------------------------------------+---------+-----------------------------------------------------------------+------+----------------------------------------------+
|  1 | SIMPLE      | businesses                    | const  | PRIMARY                                                                                      | PRIMARY                               | 4       | const                                                           |    1 | Using index; Using temporary; Using filesort |
|  1 | SIMPLE      | activities_businesses         | ref    | PRIMARY,index_activities_users_on_business_id,index_tweets_users_on_tweet_id_and_business_id | index_activities_users_on_business_id | 9       | const                                                           | 2252 | Using index condition; Using where           |
|  1 | SIMPLE      | activities_b_taggings_975e9c4 | ref    | taggings_idx                                                                                 | taggings_idx                          | 782     | const,myapp_production.activities_businesses.id,const           |    1 | Using index condition; Using where           |
|  1 | SIMPLE      | activities                    | eq_ref | PRIMARY,index_activities_on_created_at                                                       | PRIMARY                               | 8       | myapp_production.activities_businesses.activity_id              |    1 | Using where                                  |
+----+-------------+-------------------------------+--------+----------------------------------------------------------------------------------------------+---------------------------------------+---------+-----------------------------------------------------------------+------+----------------------------------------------+

Clearly that is not what was actually happening, because the slow query log shows far more rows being examined:

# Query_time: 95.830497 Lock_time: 0.000178 Rows_sent: 0 Rows_examined: 1129387
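
The same gap between the estimate and the real work can be observed in a live session through the handler counters. This is a sketch under the assumption that the account is allowed to run `FLUSH STATUS` (it needs the RELOAD privilege).

```sql
-- Reset the session status counters to zero.
FLUSH STATUS;

-- Run the slow SELECT from the question here.

-- Handler_read_* counts the rows the storage engine actually read;
-- compare the totals with the "rows" column of the EXPLAIN output.
SHOW SESSION STATUS LIKE 'Handler_read%';
```

Counters in the hundreds of thousands against an estimate of roughly 2k rows point to the same mismatch as `Rows_examined` in the slow query log.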

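With this information, I checked the joins and found that the joined columns do not use the same type: activities_businesses.id is int(11), while taggings.taggable_id is bigint(20).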

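mysql> SHOW COLUMNS FROM activities_businesses;
+-------------+------------+------+-----+---------+----------------+
| Field       | Type       | Null | Key | Default | Extra          |
+-------------+------------+------+-----+---------+----------------+
| id          | int(11)    | NO   | PRI | NULL    | auto_increment |
| activity_id | bigint(20) | YES  | MUL | NULL    |                |
| business_id | bigint(20) | YES  | MUL | NULL    |                |
+-------------+------------+------+-----+---------+----------------+
3 rows in set (0.01 sec)

mysql> SHOW COLUMNS FROM taggings;
+---------------+--------------+------+-----+---------+----------------+
| Field         | Type         | Null | Key | Default | Extra          |
+---------------+--------------+------+-----+---------+----------------+
| id            | int(11)      | NO   | PRI | NULL    | auto_increment |
| tag_id        | int(11)      | YES  | MUL | NULL    |                |
| taggable_id   | bigint(20)   | YES  |     | NULL    |                |
| taggable_type | varchar(255) | YES  |     | NULL    |                |
| tagger_id     | int(11)      | YES  |     | NULL    |                |
| tagger_type   | varchar(255) | YES  |     | NULL    |                |
| context       | varchar(128) | YES  |     | NULL    |                |
| created_at    | datetime     | YES  |     | NULL    |                |
+---------------+--------------+------+-----+---------+----------------+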

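After a little research on the topic, it seems that MySQL does not use indexes on joined columns of different types, or at least not properly, even when the EXPLAIN command says it does.

Joining columns of different types?

JOIN performance with columns of different numeric types

In the end, after changing the two columns to the same type, the query started running in under 50 ms, which is good enough for my scenario.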
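
The type change could look roughly like the sketch below. The answer does not say which of the two columns was altered, so widening `activities_businesses.id` to match `taggings.taggable_id` is an assumption made for illustration. On a large table this statement rebuilds the table and can block writes while it runs, so it is worth trying on a copy first.

```sql
-- Assumption: align the primary key with taggings.taggable_id (bigint(20)).
-- The column keeps its NOT NULL and AUTO_INCREMENT attributes.
ALTER TABLE activities_businesses
  MODIFY COLUMN id BIGINT(20) NOT NULL AUTO_INCREMENT;

-- Confirm that both sides of the join now report the same type.
SHOW COLUMNS FROM activities_businesses LIKE 'id';
SHOW COLUMNS FROM taggings LIKE 'taggable_id';
```

Widening the key rather than narrowing `taggable_id` keeps `taggings` able to reference other tables whose ids may already be bigint.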

Answered 2014-05-31T12:57:07.730

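I suspect the ORDER BY clause is the culprit. Try a few things. First, add an index on the created_at column. If that works, great, read no further! Otherwise (or if it works and you want an even faster query), run this query and see whether it still takes a long time to execute: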

SELECT activities.share_count, 
       activities.created_at 
FROM   `activities`  
       -- Join activities_businesses first so the businesses join can reference it
       JOIN `activities_businesses`
               ON `activities`.`id` = `activities_businesses`.`activity_id`
       JOIN `businesses`
               ON `businesses`.`id` = `activities_businesses`.`business_id`
       JOIN taggings activities_b_taggings_975e9c4 
         ON activities_b_taggings_975e9c4.taggable_id = activities_businesses.id 
            AND activities_b_taggings_975e9c4.taggable_type = 
                'ActivitiesBusiness' 
            AND activities_b_taggings_975e9c4.tag_id = 104 
            AND activities_b_taggings_975e9c4.created_at >= 
                '2014-04-30 13:36:44' 
WHERE  ( businesses.id = 1 ) 
       AND ( activities.created_at > '2014-04-30 13:36:44' ) 
       AND ( activities.created_at < '2014-05-30 12:27:03' ) 
ORDER  BY activities.created_at; 

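If that solves it, great!

Otherwise (or if the previous suggestion worked and you want an even faster query), follow this two-step process. 1. Run the same query without the ORDER BY clause: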

SELECT activities.share_count, 
       activities.created_at 
FROM   `activities`  
       -- Join activities_businesses first so the businesses join can reference it
       JOIN `activities_businesses`
               ON `activities`.`id` = `activities_businesses`.`activity_id`
       JOIN `businesses`
               ON `businesses`.`id` = `activities_businesses`.`business_id`
       JOIN taggings activities_b_taggings_975e9c4 
         ON activities_b_taggings_975e9c4.taggable_id = activities_businesses.id 
            AND activities_b_taggings_975e9c4.taggable_type = 
                'ActivitiesBusiness' 
            AND activities_b_taggings_975e9c4.tag_id = 104 
            AND activities_b_taggings_975e9c4.created_at >= 
                '2014-04-30 13:36:44' 
WHERE  ( businesses.id = 1 ) 
       AND ( activities.created_at > '2014-04-30 13:36:44' ) 
       AND ( activities.created_at < '2014-05-30 12:27:03' );

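If it runs super fast, great. 2. Dump the results into a temporary table and then sort that. This breaks your expensive operation into smaller operations that are easier for the server to execute, i.e.: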

CREATE TEMPORARY TABLE temp_activities (INDEX(created_at))
SELECT activities.share_count, 
           activities.created_at 
    FROM   `activities`  
           -- Join activities_businesses first so the businesses join can reference it
           JOIN `activities_businesses`
                   ON `activities`.`id` = `activities_businesses`.`activity_id`
           JOIN `businesses`
                   ON `businesses`.`id` = `activities_businesses`.`business_id`
           JOIN taggings activities_b_taggings_975e9c4 
             ON activities_b_taggings_975e9c4.taggable_id = activities_businesses.id 
                AND activities_b_taggings_975e9c4.taggable_type = 
                    'ActivitiesBusiness' 
                AND activities_b_taggings_975e9c4.tag_id = 104 
                AND activities_b_taggings_975e9c4.created_at >= 
                    '2014-04-30 13:36:44' 
    WHERE  ( businesses.id = 1 ) 
           AND ( activities.created_at > '2014-04-30 13:36:44' ) 
           AND ( activities.created_at < '2014-05-30 12:27:03' );
SELECT * FROM temp_activities ORDER BY created_at;

Answered 2014-05-31T00:58:58.277