
Using volkszaehler.org, I need to retrieve data from a table with more than a million rows. This is what the ORM created:

CREATE TABLE `data` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `channel_id` int(11) DEFAULT NULL,
  `timestamp` bigint(20) NOT NULL,
  `value` double NOT NULL,
  PRIMARY KEY (`id`),
  UNIQUE KEY `ts_uniq` (`channel_id`,`timestamp`),
  KEY `IDX_ADF3F36372F5A1AA` (`channel_id`)
)

Selecting grouped data is now slow, especially on a low-powered platform such as the Raspberry Pi:

SELECT MAX(timestamp) AS timestamp, SUM(value) AS value, COUNT(timestamp) AS count 
FROM data WHERE channel_id = 4 AND timestamp >= 1356994800000 AND timestamp <= 1375009341000 
GROUP BY YEAR(FROM_UNIXTIME(timestamp/1000)), DAYOFYEAR(FROM_UNIXTIME(timestamp/1000));

EXPLAIN:

SIMPLE  data    ref ts_uniq,IDX_ADF3F36372F5A1AA    ts_uniq 5   const   2066    Using where; Using temporary; Using filesort

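The query has to go through 50k records. It takes 1.5 s on a Core i5 and already 6 s on the RasPi.

Is there anything that would improve performance, other than reducing the amount of data?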


1 Answer


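Increase the amount of data rather than reducing it; that is what you need. You have two functions in the GROUP BY clause. If you compute YEAR(FROM_UNIXTIME(timestamp/1000)) and DAYOFYEAR(FROM_UNIXTIME(timestamp/1000)) in advance and store the values in additional columns from a trigger, your SELECT statement will be much faster.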
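A minimal sketch of that idea, assuming two new columns that I have called ts_year and ts_doy (hypothetical names, not part of the original schema). Note that FROM_UNIXTIME uses the session time zone, so the stored values depend on the time zone of the connection that inserts the row.

```sql
-- ts_year and ts_doy are hypothetical helper columns holding the
-- precomputed results of the two GROUP BY expressions.
ALTER TABLE `data`
  ADD COLUMN `ts_year` SMALLINT UNSIGNED NULL,
  ADD COLUMN `ts_doy`  SMALLINT UNSIGNED NULL;

-- Fill the helper columns for every new row.
-- MySQL before 5.7.2 allows only one BEFORE INSERT trigger per table,
-- so merge this into any trigger that already exists.
CREATE TRIGGER `data_ymd_bi` BEFORE INSERT ON `data`
FOR EACH ROW
  SET NEW.`ts_year` = YEAR(FROM_UNIXTIME(NEW.`timestamp` / 1000)),
      NEW.`ts_doy`  = DAYOFYEAR(FROM_UNIXTIME(NEW.`timestamp` / 1000));

-- One-off backfill for rows that already exist.
UPDATE `data`
SET `ts_year` = YEAR(FROM_UNIXTIME(`timestamp` / 1000)),
    `ts_doy`  = DAYOFYEAR(FROM_UNIXTIME(`timestamp` / 1000));

-- The original query, now grouping on plain columns.
SELECT MAX(`timestamp`) AS `timestamp`, SUM(`value`) AS `value`, COUNT(`timestamp`) AS `count`
FROM `data`
WHERE `channel_id` = 4
  AND `timestamp` >= 1356994800000
  AND `timestamp` <= 1375009341000
GROUP BY `ts_year`, `ts_doy`;
```

If existing timestamps can ever be updated, a matching BEFORE UPDATE trigger would be needed as well.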

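Besides that, you can simply truncate timestamp to the day by integer-dividing it by 1000*3600*24 = 86400000 and group by that single expression; I see no point in grouping by year and day of year separately when you can group by date alone. Use DIV (or FLOOR) rather than /, because / in MySQL returns a fractional result and would put almost every row in its own group. The resulting buckets are UTC days, whereas FROM_UNIXTIME uses the session time zone, so day boundaries may shift relative to the original query:

SELECT 
 MAX(timestamp) AS timestamp, 
 SUM(value) AS value, 
 COUNT(timestamp) AS count 
FROM data WHERE 
 channel_id = 4 AND 
 timestamp >= 1356994800000 AND 
 timestamp <= 1375009341000 
GROUP BY timestamp DIV 86400000;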

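Personally, after that I would add a date column, index it, and update it from a trigger, so that I could remove all arithmetic expressions from GROUP BY. In that case an index will be used.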
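Roughly, that could look like the sketch below. The column day_no (days since the Unix epoch, in UTC) and the index name channel_day are hypothetical names chosen for illustration. Whether the optimizer actually drops the temporary table and filesort is worth confirming with EXPLAIN on your own data.

```sql
-- day_no is a hypothetical column: whole days since 1970-01-01 UTC.
-- The composite index lets rows for one channel be read in day order.
ALTER TABLE `data`
  ADD COLUMN `day_no` INT NULL,
  ADD KEY `channel_day` (`channel_id`, `day_no`);

-- Keep day_no in sync for new rows.
-- MySQL before 5.7.2 allows only one BEFORE INSERT trigger per table,
-- so merge this into any trigger that already exists.
CREATE TRIGGER `data_day_bi` BEFORE INSERT ON `data`
FOR EACH ROW
  SET NEW.`day_no` = NEW.`timestamp` DIV 86400000;

-- One-off backfill for rows that already exist.
UPDATE `data` SET `day_no` = `timestamp` DIV 86400000;

-- GROUP BY now refers to an indexed column with no arithmetic.
SELECT MAX(`timestamp`) AS `timestamp`, SUM(`value`) AS `value`, COUNT(`timestamp`) AS `count`
FROM `data`
WHERE `channel_id` = 4
  AND `timestamp` >= 1356994800000
  AND `timestamp` <= 1375009341000
GROUP BY `day_no`;
```

The WHERE clause still filters on timestamp so that the partial days at both ends of the range are handled exactly as in the original query.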

Answered 2013-07-28T11:36:18.577