mysql - MySQL table performance optimization

Question

I have a table with the following structure:

CREATE TABLE rel_score (
  user_id bigint(20) NOT NULL DEFAULT '0',
  score_date date NOT NULL,
  rel_score decimal(4,2) DEFAULT NULL,
  doc_count int(8) NOT NULL,
  total_doc_count int(8) NOT NULL,
  PRIMARY KEY (user_id,score_date),
  KEY SCORE_DT_IDX (score_date)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 PACK_KEYS=1

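This table will store the rel_score value of every user in the application for every day from January 1, 2000 to the present. I estimate the total number of records will exceed 700 million. I populated the table with 6 months of data (about 30 million rows), and the query response time is about 8 minutes. Here is my query: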

select 
  user_id, max(rel_score) as max_rel_score
from
  rel_score
where score_date between '2012-01-01' and '2012-06-30'
group by user_id
order by max_rel_score desc;

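I tried to optimize the query with the following techniques:

Partitioning on the score_date column
Adding an index on the score_date column

The query response time improved only marginally, to just under 8 minutes.

How can I improve the response time? Is the table design appropriate?

Also, I cannot move old data to an archive, because users are allowed to query the entire date range.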

score 0 · Accepted Answer

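If you partition the table at the same granularity as score_date, the query response time will not decrease.

Try creating another attribute that holds only the year of the date, cast to INTEGER, partition the table on that attribute (you will get 13 partitions), then rerun the query and see.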
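As a rough sketch of the year-based partitioning described above: instead of adding a separate integer column (which MySQL would also require to be part of the primary key), this variant partitions directly on YEAR(score_date), which is allowed because score_date is already in the primary key. The partition names are hypothetical, and whether this actually helps depends on how much data the date range lets MySQL prune.

```sql
-- Sketch only: 13 yearly partitions covering 2000 through 2012.
-- Partition names (p2000 ... p2012) are illustrative assumptions.
ALTER TABLE rel_score
  PARTITION BY RANGE (YEAR(score_date)) (
    PARTITION p2000 VALUES LESS THAN (2001),
    PARTITION p2001 VALUES LESS THAN (2002),
    PARTITION p2002 VALUES LESS THAN (2003),
    PARTITION p2003 VALUES LESS THAN (2004),
    PARTITION p2004 VALUES LESS THAN (2005),
    PARTITION p2005 VALUES LESS THAN (2006),
    PARTITION p2006 VALUES LESS THAN (2007),
    PARTITION p2007 VALUES LESS THAN (2008),
    PARTITION p2008 VALUES LESS THAN (2009),
    PARTITION p2009 VALUES LESS THAN (2010),
    PARTITION p2010 VALUES LESS THAN (2011),
    PARTITION p2011 VALUES LESS THAN (2012),
    -- The last partition catches 2012 and anything later.
    PARTITION p2012 VALUES LESS THAN MAXVALUE
  );

-- Check which partitions the query touches (the "partitions" column).
-- Older MySQL versions need EXPLAIN PARTITIONS instead of EXPLAIN.
EXPLAIN
SELECT user_id, MAX(rel_score) AS max_rel_score
FROM rel_score
WHERE score_date BETWEEN '2012-01-01' AND '2012-06-30'
GROUP BY user_id
ORDER BY max_rel_score DESC;
```

If pruning works, the plan should list only the partition holding 2012 rather than all 13.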

score 0 · Accepted Answer

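Your primary index should do a good job of covering the table. If you didn't have it, I would suggest an index on rel_score(user_id, score_date, rel_score). For your query, this is a "covering" index, meaning the index contains all the columns the query uses, so the engine never has to access the data pages (only the index).

The following version might also make good use of this index (although I prefer your version of the query):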
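For illustration, the covering index mentioned above could be created and checked roughly like this. The index name is a made-up placeholder, and the optimizer is not guaranteed to choose the index, so it is worth confirming with EXPLAIN.

```sql
-- Hypothetical index name; the column order follows the suggestion above.
CREATE INDEX idx_user_date_score
  ON rel_score (user_id, score_date, rel_score);

-- "Using index" in the Extra column means the query is answered
-- from the index alone, without reading the table rows.
EXPLAIN
SELECT user_id, MAX(rel_score) AS max_rel_score
FROM rel_score
WHERE score_date BETWEEN '2012-01-01' AND '2012-06-30'
GROUP BY user_id
ORDER BY max_rel_score DESC;
```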

select u.user_id,
       (select max(rel_score)
        from rel_score r2
        where r2.user_id = u.user_id and 
              r2.score_date between '2012-01-01' and '2012-06-30'
      ) as rel_score
from (select distinct user_id
      from rel_score
      where score_date between '2012-01-01' and '2012-06-30'
     ) u
order by rel_score desc;

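The idea behind this query is to replace the aggregation with a simple index lookup. Aggregation in MySQL is a slow operation; it works better in other databases, so a trick like this should not be necessary there.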
