
I need help optimizing this query.

  SELECT messages.*
   FROM messages
   INNER JOIN subscription ON subscription.entity_id = messages.entity_id
   WHERE subscription.user_id = 1
   ORDER BY messages.timestamp DESC 
   LIMIT 50

Without the LIMIT, this query returns 200K rows and takes about 1.3 to 2 seconds to run. The problem appears to be the ORDER BY clause: without it, the query takes 0.0005 seconds.

Indexes:
    ( subscription.user_id, subscription.entity_id )
    ( subscription.entity_id )
    ( messages.timestamp )
    ( messages.entity_id, messages.timestamp )

I was able to improve performance by changing the query to this:

SELECT messages.* FROM messages
INNER JOIN subscription ON subscription.entity_id = messages.entity_id 
INNER JOIN ( 
   SELECT message_id FROM messages ORDER BY timestamp DESC
) as temp on temp.message_id = messages.message_id
WHERE subscription.user_id = 1 LIMIT 50

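This runs in 0.12 seconds. A very good improvement, but I'd like to know whether it can be better. It seems that if I could somehow filter the second inner join, things would be faster.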

Thanks.

Schema:

   messages 
      message_id, entity_id, message, timestamp

   subscription
      user_id, entity_id

UPDATE

Raymond Nijland's answer solved my original problem, but another one has come up:

 SELECT messages.*
   FROM messages
   STRAIGHT_JOIN subscription ON subscription.entity_id = messages.entity_id
   WHERE subscription.user_id = 1
   ORDER BY messages.timestamp DESC 
   LIMIT 50

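The straight join is inefficient in two cases:

  1. There is no user_id entry in the subscription table

  2. There are few relevant entries in the messages table

Any suggestions on how to fix this? If not from a query perspective, then from the application side?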

UPDATE

EXPLAIN info

LIMIT 50

| id | select_type | table             | type   | possible_keys                           | key           | key_len | ref                                    | rows | Extra       |
|  1 | SIMPLE      | messages          | index  | idx_timestamp                           | idx_timestamp | 4       | NULL                                   |   50 |             |
|  1 | SIMPLE      | subscription      | eq_ref | PRIMARY,entity_id,user_id               | PRIMARY       | 16      | const, messages.entity_id              |    1 | Using index |

No LIMIT

| id | select_type | table             | type   | possible_keys                           | key           | key_len | ref                                    |   rows   | Extra         |
|  1 | SIMPLE      | messages          | ALL    | entity_id_2,entity_id                   | NULL          | NULL    | NULL                                   |   255069 | Using filesort |
|  1 | SIMPLE      | subscription      | eq_ref | PRIMARY,entity_id,user_id               | PRIMARY       | 16      | const, messages.entity_id              |        1 | Using index   |

CREATE TABLE statements:

About 5,000 rows

subscription | CREATE TABLE `subscription` (
  `user_id`   bigint(20) unsigned NOT NULL,
  `entity_id` bigint(20) unsigned NOT NULL,
  PRIMARY KEY (`user_id`,`entity_id`),
  KEY `entity_id` (`entity_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8

About 255,000 rows

messages | CREATE TABLE `messages` (
  `message_id` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
  `entity_id` bigint(20) unsigned NOT NULL,
  `message` varchar(255) NOT NULL DEFAULT '',
  `timestamp` int(10) unsigned NOT NULL,
  PRIMARY KEY (`message_id`),
  KEY `entity_id` (`entity_id`,`timestamp`),
  KEY `idx_timestamp` (`timestamp`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 

1 Answer


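Remove the index on messages.entity_id alone. It is redundant, because the composite index (entity_id, timestamp) already starts with entity_id. Then try a straight join: I think the MySQL optimizer is accessing your tables in the wrong order. MySQL needs to access the messages table first so that it can use the index on messages (entity_id, timestamp) and avoid "Using temporary; Using filesort" (in that case MySQL may need to create a disk-based MyISAM temporary table and sort it with quicksort, which costs disk I/O reads and writes).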

 SELECT STRAIGHT_JOIN messages.*
   FROM messages
   INNER JOIN subscription ON subscription.entity_id = messages.entity_id
   WHERE subscription.user_id = 1
   ORDER BY messages.timestamp DESC 
   LIMIT 50

Or

 SELECT messages.*
   FROM messages
   STRAIGHT_JOIN subscription ON subscription.entity_id = messages.entity_id
   WHERE subscription.user_id = 1
   ORDER BY messages.timestamp DESC 
   LIMIT 50
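
For the index cleanup suggested at the top of this answer, a minimal sketch. The index name entity_id_2 is an assumption taken from the possible_keys column of the EXPLAIN output in the question; check the SHOW INDEX output first and drop whichever index covers entity_id alone.

```sql
-- List the indexes on messages to find the one that covers entity_id alone.
SHOW INDEX FROM messages;

-- Assumption: entity_id_2 is the single-column index; substitute the real name.
-- The composite index (entity_id, timestamp) still serves lookups by entity_id.
ALTER TABLE messages DROP INDEX entity_id_2;
```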

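I ran into this problem too and fixed it like this: http://sqlfiddle.com/#!2/b34870/1, although there it was with country/city tables.

Edit, in response to Jason M's comment on STRAIGHT_JOIN

The straight join is inefficient in two cases: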

there is no user_id entry in the subscription table

Indeed, with INNER JOIN the MySQL optimizer would report "Impossible WHERE noticed after reading const tables" and never execute the query. A STRAIGHT_JOIN doesn't trigger that shortcut, so a (possibly full) index scan has to be done to look for the user_id value, which can slow down query execution. An easy fix would be to use the STRAIGHT_JOIN only with user_ids that exist.
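
One way to apply that fix from the application side is to check for subscriptions before running the forced-order query. This is only a sketch: it assumes the schema from the question and that the application runs the two statements in sequence, and it has not been timed against the real data.

```sql
-- Step 1: cheap lookup on the subscription primary key (user_id, entity_id).
-- Returns 1 if the user has at least one subscription, otherwise 0.
SELECT EXISTS (
  SELECT 1
  FROM subscription
  WHERE user_id = 1
) AS has_subscriptions;

-- Step 2: run this only when has_subscriptions = 1;
-- otherwise the application can return an empty result right away.
SELECT messages.*
FROM messages
STRAIGHT_JOIN subscription ON subscription.entity_id = messages.entity_id
WHERE subscription.user_id = 1
ORDER BY messages.timestamp DESC
LIMIT 50;
```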

there are few relevant entries in the messages table

Possibly the same problem here: MySQL thinks it should do a (possibly full) index scan to find results. But I need to see the EXPLAIN output to know for sure.
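
To produce that output, prefix the query with EXPLAIN and run it for a user that has only a few matching messages. In this sketch, user_id = 42 is a placeholder for such a user, not a value from the question.

```sql
-- Shows the chosen join order, index, and estimated row counts
-- without returning the result rows.
EXPLAIN
SELECT messages.*
FROM messages
STRAIGHT_JOIN subscription ON subscription.entity_id = messages.entity_id
WHERE subscription.user_id = 42  -- placeholder user with few relevant messages
ORDER BY messages.timestamp DESC
LIMIT 50;
```

The type, key, and rows columns for the messages row are the ones to compare against the EXPLAIN output already posted in the question.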

You may also want to try this query first

SELECT 
 *
FROM (

 SELECT
  entity_id

 FROM
  subscription

 WHERE
  subscription.user_id = 1 
)
 subscriptions

INNER JOIN 
 messages

ON
 subscriptions.entity_id = messages.entity_id

ORDER BY
 messages.timestamp DESC

LIMIT 50  
answered 2013-10-05T17:42:55.260