I have three tables.

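One table holds about 75,000 rows of submissions.
One table holds submission ratings and has fewer than 10 rows.
One table holds the submission => competition mapping and, for my test data, also has about 75,000 rows.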

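What I want to do is get the top 50 submissions in one round of a competition. "Top" means the highest average rating, with ties broken by the highest number of votes.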

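Here is the query I am using. The problem is that it takes more than 45 seconds to complete! I profiled the query (results at the bottom), and the bottleneck is copying the data to a tmp table and then sorting it. How can I speed this up?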

 SELECT `submission_submissions`.* 
   FROM `submission_submissions`
   JOIN `competition_submissions` 
     ON `competition_submissions`.`submission_id` = `submission_submissions`.`id`
LEFT JOIN `submission_ratings` 
     ON `submission_submissions`.`id` = `submission_ratings`.`submission_id`
  WHERE `top_round` =  1 
    AND `competition_id` =  '2'
    AND `submission_submissions`.`date_deleted` IS NULL
GROUP BY submission_submissions.id
ORDER BY AVG(submission_ratings.`stars`) DESC, 
         COUNT(submission_ratings.`id`) DESC
  LIMIT 50

submission_submissions

CREATE TABLE `submission_submissions` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `account_id` int(11) NOT NULL,
  `title` varchar(255) NOT NULL,
  `description` varchar(255) DEFAULT NULL,
  `genre` int(11) NOT NULL,
  `goals` text,
  `submission` text NOT NULL,
  `date_created` datetime DEFAULT NULL,
  `date_modified` datetime DEFAULT NULL,
  `date_deleted` datetime DEFAULT NULL,
  `cover_image` varchar(255) DEFAULT NULL,
  PRIMARY KEY (`id`),
  KEY `genre` (`genre`),
  KEY `account_id` (`account_id`),
  KEY `date_created` (`date_created`)
) ENGINE=InnoDB AUTO_INCREMENT=115037 DEFAULT CHARSET=latin1;

submission_ratings

CREATE TABLE `submission_ratings` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `account_id` int(11) NOT NULL,
  `submission_id` int(11) NOT NULL,
  `stars` tinyint(1) NOT NULL,
  `date_created` datetime DEFAULT NULL,
  PRIMARY KEY (`id`),
  KEY `submission_id` (`submission_id`),
  KEY `account_id` (`account_id`),
  KEY `stars` (`stars`)
) ENGINE=InnoDB AUTO_INCREMENT=7 DEFAULT CHARSET=latin1;

competition_submissions

CREATE TABLE `competition_submissions` (
  `competition_id` int(11) NOT NULL,
  `submission_id` int(11) NOT NULL,
  `top_round` int(11) DEFAULT '1',
  PRIMARY KEY (`submission_id`),
  KEY `competition_id` (`competition_id`),
  KEY `top_round` (`top_round`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;

SHOW PROFILE results (ordered by duration)

state                 duration (summed) in sec percentage
Copying to tmp table  33.15621                 68.46924
Sorting result        11.83148                 24.43260
removing tmp table     3.06054                  6.32017
Sending data           0.37560                  0.77563
... insignificant amounts removed ...
Total                  48.42497               100.00000
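
The statements used to produce the summary above are not shown. As a rough sketch, a per-state breakdown like this can be obtained from MySQL's session profiler; the Query_ID value of 1 and the column aliases below are assumptions, and the profiler itself is deprecated in recent MySQL versions.

```sql
-- Sketch only: enable the (deprecated) session profiler.
SET profiling = 1;

-- ... run the slow query here ...

-- List recent statements together with their Query_ID values.
SHOW PROFILES;

-- Sum the time spent in each state for one statement.
-- Query_ID 1 is an assumed value; take the real one from SHOW PROFILES.
SELECT STATE,
       SUM(DURATION) AS duration_sec,
       ROUND(100 * SUM(DURATION) /
             (SELECT SUM(DURATION)
                FROM INFORMATION_SCHEMA.PROFILING
               WHERE QUERY_ID = 1), 5) AS percentage
  FROM INFORMATION_SCHEMA.PROFILING
 WHERE QUERY_ID = 1
 GROUP BY STATE
 ORDER BY duration_sec DESC;
```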

EXPLAIN

id  select_type  table                    type         possible_keys                     key                       key_len  ref                                              rows   Extra                                                                                                 
1   SIMPLE       competition_submissions  index_merge  PRIMARY,competition_id,top_round  competition_id,top_round  4,5                                                       18596  Using intersect(competition_id,top_round); Using where; Using index; Using temporary; Using filesort  
1   SIMPLE       submission_submissions   eq_ref       PRIMARY                           PRIMARY                   4        inkstakes.competition_submissions.submission_id  1      Using where                                                                                           
1   SIMPLE       submission_ratings       ALL          submission_id                                                                                                         5      Using where; Using join buffer (flat, BNL join)                                                       

2 Answers

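Assuming that you are not actually interested in unrated submissions, and that a given submission has only one competition_submissions entry for a given competition and top_round, I suggest: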

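SELECT s.* 
FROM (-- Aggregate the small ratings table first and keep only the 50 best.
      -- Note: this LIMIT is applied before the competition filter below,
      -- so fewer than 50 rows can come back if other competitions have ratings.
      SELECT `submission_id`, 
             AVG(`stars`) AvgStars, 
             COUNT(`id`) CountId
      FROM `submission_ratings` 
      GROUP BY `submission_id`
      ORDER BY AVG(`stars`) DESC, COUNT(`id`) DESC
      LIMIT 50) r
-- Fetch the full rows only for the submissions that survived the LIMIT.
JOIN `submission_submissions` s
  ON r.`submission_id` = s.`id` AND
     s.`date_deleted` IS NULL
JOIN `competition_submissions` c
  ON c.`submission_id` = s.`id` AND 
     c.`top_round` =  1 AND
     c.`competition_id` = '2'
ORDER BY r.AvgStars DESC, 
         r.CountId DESC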

(If a submission has more than one competition_submissions entry for a given competition and top_round, you can add the GROUP BY clause back to the main query.)

If you do want to see unrated submissions, you can UNION the results of this query with a LEFT JOIN ... WHERE NULL query.
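
The LEFT JOIN ... WHERE NULL part might look roughly like the following. This is a sketch under assumptions rather than a tested query: it reuses the tables and filters from the question, while the alias sr and the LIMIT are illustrative choices. It returns submissions in the round that have no rows at all in submission_ratings.

```sql
-- Sketch only: unrated submissions in the round (anti-join pattern).
SELECT s.*
  FROM `submission_submissions` s
  JOIN `competition_submissions` c
    ON c.`submission_id` = s.`id`
   AND c.`top_round` = 1
   AND c.`competition_id` = 2
  LEFT JOIN `submission_ratings` sr   -- sr is an assumed alias
    ON sr.`submission_id` = s.`id`
 WHERE s.`date_deleted` IS NULL
   AND sr.`id` IS NULL                -- keep only rows with no matching rating
 LIMIT 50;                            -- unrated rows all tie, so any 50 will do
```

To combine the two result sets, each query can be wrapped in parentheses and joined with UNION ALL. Without an outer ORDER BY the order of the combined rows is not guaranteed, so it may be simpler to run the two queries separately and append the unrated rows after the rated ones in the application.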

answered 2013-08-11T12:42:43.713

There is a simple trick that works for MySQL and helps avoid copying and sorting a huge temporary table in queries like this one (those with LIMIT X).

Just avoid SELECT *. It copies all columns into the temporary table, the whole huge table is then sorted, and in the end the query takes only 50 records from it (50 / 70000 = 0.07%).

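Select only the columns that are really needed to perform the sort and the limit, then join the missing columns by id for just the 50 selected records.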

select ss.*
from submission_submissions ss
join (
            SELECT `submission_submissions`.id, 
                    AVG(submission_ratings.`stars`) stars,
                    COUNT(submission_ratings.`id`) cnt
               FROM `submission_submissions`
               JOIN `competition_submissions` 
                 ON `competition_submissions`.`submission_id` = `submission_submissions`.`id`
            LEFT JOIN `submission_ratings` 
                 ON `submission_submissions`.`id` = `submission_ratings`.`submission_id`
              WHERE `top_round` =  1 
                AND `competition_id` =  '2'
                AND `submission_submissions`.`date_deleted` IS NULL
            GROUP BY submission_submissions.id
            ORDER BY AVG(submission_ratings.`stars`) DESC, 
                     COUNT(submission_ratings.`id`) DESC
              LIMIT 50
) xx
ON ss.id = xx.id
ORDER BY xx.stars DESC, 
         xx.cnt DESC; 
answered 2013-08-11T14:53:56.447