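I have a large InnoDB table that currently holds about 20 million rows, with roughly 20,000 new rows inserted every day. The rows are messages from different topics.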

CREATE TABLE IF NOT EXISTS `Messages` (
  `ID` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
  `TopicID` bigint(20) unsigned NOT NULL,
  `DATESTAMP` int(11) DEFAULT NULL,
  `TIMESTAMP` int(10) unsigned NOT NULL,
  `Message` mediumtext NOT NULL,
  `Checksum` varchar(50) DEFAULT NULL,
  `Nickname` varchar(80) NOT NULL,
  PRIMARY KEY (`ID`),
  UNIQUE KEY `TopicID` (`TopicID`,`Checksum`),
  KEY `DATESTAMP` (`DATESTAMP`),
  KEY `Nickname` (`Nickname`),
  KEY `TIMESTAMP` (`TIMESTAMP`)
) ENGINE=InnoDB  DEFAULT CHARSET=utf8 AUTO_INCREMENT=25195126 ;

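Note: Checksum stores an MD5 checksum that prevents the same message from being inserted twice into the same topic. (Nickname + timestamp + topicid + the last 20 characters of the message)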
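For illustration, such a checksum could be computed in MySQL roughly as follows. The field order, the absence of separators between the fields, and all literal values are assumptions; the note above only lists which fields are combined.

```sql
-- Sketch only: field order and missing separators are assumptions,
-- and every literal below is a made-up example value.
SELECT MD5(CONCAT(
    'SomeNickname',                                      -- Nickname
    1371655000,                                          -- TIMESTAMP (unix time)
    42,                                                  -- TopicID
    RIGHT('The full text of the message goes here', 20)  -- last 20 characters of Message
)) AS Checksum;
```

MD5() returns a 32-character hexadecimal string, which is what the UNIQUE KEY on (TopicID, Checksum) then compares.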

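The site I'm building has a news feed where users can choose to see the latest messages from different nicknames on different forums. The query looks like this: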

SELECT
Messages.ID AS MessageID,
Messages.Message,
Messages.TIMESTAMP,
Messages.Nickname,
Topics.ID AS TopicID,
Topics.Title AS TopicTitle,
Forums.Title AS ForumTitle

FROM Messages   

JOIN FollowedNicknames ON FollowedNicknames.UserID = 'MYUSERID'
JOIN Forums ON Forums.ID = FollowedNicknames.ForumID
JOIN Subforums ON Subforums.ForumID = Forums.ID
JOIN Topics ON Topics.SubforumID = Subforums.ID

WHERE 

Messages.Nickname = FollowedNicknames.Nickname AND 
Messages.TopicID = Topics.ID AND Messages.DATESTAMP = '2013619'
ORDER BY Messages.TIMESTAMP DESC

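TIMESTAMP holds a Unix timestamp, while DATESTAMP is just the date derived from that Unix timestamp, so that rows can be found faster with the '=' operator instead of a range scan on the Unix timestamp.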

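The problem is that this query takes about 13 seconds (or more) unbuffered. That is of course unacceptable for the intended use. Adding DATESTAMP seemed to speed things up, but not by much.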

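At this point I really don't know what to do. I have read about composite primary keys, but I'm still unsure whether they would bring any benefit and how to implement one correctly in this particular case.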

I know that using BIGINT is probably overkill, but does it really have that much of an impact?

EXPLAIN:

+----+-------------+-----------------------+--------+---------------------------------------+------------+---------+-----------------------------------------------+------+----------------------------------------------+
| id | select_type | table                 | type   | possible_keys                         | key        | key_len | ref                                           | rows | Extra                                        |
+----+-------------+-----------------------+--------+---------------------------------------+------------+---------+-----------------------------------------------+------+----------------------------------------------+
|  1 | SIMPLE      | FollowedNicknames     | ALL    | UserID,ForumID,Nickname               | NULL       | NULL    | NULL                                          |    8 | Using where; Using temporary; Using filesort |
|  1 | SIMPLE      | Forums                | eq_ref | PRIMARY                               | PRIMARY    | 8       | database.FollowedNicknames.ForumiID           |    1 | NULL                                         |
|  1 | SIMPLE      | Messages              | ref    | TopicID,DATETIME,Nickname             | Nickname   | 242     | database.FollowedNicknames.Nickname           |   15 | Using where                                  |
|  1 | SIMPLE      | Topics                | eq_ref | PRIMARY,SubforumID                    | PRIMARY    | 8       | database.Messages.TopicID                     |    1 | NULL                                         |
|  1 | SIMPLE      | Subforums             | eq_ref | PRIMARY,ForumID                       | PRIMARY    | 8       | database.Topics.SubforumID                    |    1 | Using where                                  |
+----+-------------+-----------------------+--------+---------------------------------------+------------+---------+-----------------------------------------------+------+----------------------------------------------+

1 Answer

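You shouldn't JOIN on a VARCHAR column (Nickname); you should join these tables on a user ID instead. Joining on the string is certainly slowing the query down and is probably the biggest problem. The query is also easier to understand if you write all the join conditions explicitly in the JOIN clauses instead of at the end in the WHERE clause, like this: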

SELECT
    Messages.ID AS MessageID,
    Messages.Message,
    Messages.TIMESTAMP,
    Messages.Nickname,
    Topics.ID AS TopicID,
    Topics.Title AS TopicTitle,
    Forums.Title AS ForumTitle
FROM Messages   
    JOIN FollowedNicknames ON Messages.Nickname = FollowedNicknames.Nickname
        AND FollowedNicknames.UserID = 'MYUSERID'
    JOIN Forums ON Forums.ID = FollowedNicknames.ForumID
    JOIN Subforums ON Subforums.ForumID = Forums.ID
    JOIN Topics ON Messages.TopicID = Topics.ID
        AND Topics.SubforumID = Subforums.ID
WHERE Messages.DATESTAMP = '2013619'
ORDER BY Messages.TIMESTAMP DESC
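
A minimal sketch of what joining on an ID instead of the nickname string could look like. It assumes hypothetical integer `NicknameID` columns on both `Messages` and `FollowedNicknames` that identify the same poster; neither column exists in the schema from the question, and filling them for existing rows is left out here.

```sql
-- Hypothetical columns: NicknameID is NOT part of the original schema.
ALTER TABLE `Messages`
  ADD COLUMN `NicknameID` int unsigned NULL,
  ADD KEY `NicknameID` (`NicknameID`);

ALTER TABLE `FollowedNicknames`
  ADD COLUMN `NicknameID` int unsigned NULL,
  ADD KEY `NicknameID` (`NicknameID`);

-- Same query as above, but the join key is now a 4-byte integer
-- instead of a varchar(80) string.
SELECT
    Messages.ID AS MessageID,
    Messages.Message,
    Messages.TIMESTAMP,
    Messages.Nickname,
    Topics.ID AS TopicID,
    Topics.Title AS TopicTitle,
    Forums.Title AS ForumTitle
FROM Messages
    JOIN FollowedNicknames ON Messages.NicknameID = FollowedNicknames.NicknameID
        AND FollowedNicknames.UserID = 'MYUSERID'
    JOIN Forums ON Forums.ID = FollowedNicknames.ForumID
    JOIN Subforums ON Subforums.ForumID = Forums.ID
    JOIN Topics ON Messages.TopicID = Topics.ID
        AND Topics.SubforumID = Subforums.ID
WHERE Messages.DATESTAMP = '2013619'
ORDER BY Messages.TIMESTAMP DESC;
```

Whether this helps in practice should be checked by running EXPLAIN on the new query and comparing the `key` and `key_len` columns with the plan shown in the question.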

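Instead of INT as the data type for the DATESTAMP column, I would use DATE. The Checksum column should probably use latin1_general_ci as its collation. I would use INT for the ID columns as long as their values stay below 2,000,000,000, since INT UNSIGNED can store values up to about 4,000,000,000 (4,294,967,295 to be exact). InnoDB is affected by the primary key more than MyISAM is, because every InnoDB secondary index entry also stores the primary key value, so a smaller key could make a noticeable difference.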
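
The column changes suggested above could be sketched like this. The column name `MessageDate` is hypothetical, and the statements are untested against the real data: on a table of this size they may rebuild the whole table, so they are best tried on a copy first.

```sql
-- Sketch only; `MessageDate` is a hypothetical new column name.
-- If other tables (e.g. Topics.ID) are joined against these IDs,
-- their column types would need the same change to stay comparable.
ALTER TABLE `Messages`
  ADD COLUMN `MessageDate` date DEFAULT NULL,
  ADD KEY `MessageDate` (`MessageDate`),
  -- MD5 hex digests are plain ASCII, so a single-byte collation is enough.
  MODIFY `Checksum` varchar(50)
      CHARACTER SET latin1 COLLATE latin1_general_ci DEFAULT NULL,
  -- INT UNSIGNED is 4 bytes per value instead of 8 for BIGINT.
  MODIFY `ID` int unsigned NOT NULL AUTO_INCREMENT,
  MODIFY `TopicID` int unsigned NOT NULL;

-- Backfill the DATE column from the unix timestamp.
-- FROM_UNIXTIME() converts using the session time zone.
UPDATE `Messages`
SET `MessageDate` = DATE(FROM_UNIXTIME(`TIMESTAMP`));
```

With such a column in place, the filter in the query would become `WHERE Messages.MessageDate = '2013-06-19'`.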

answered 2013-06-19T15:38:20.927