This query takes more than 2 seconds to run (for 10k rows). Can it be optimized?

SELECT a.id, MIN(ABS(timestamp_a - timestamp_b))
FROM a 
  INNER JOIN b ON ( timestamp_a  between (timestamp_b - 5 * 60) 
              AND (timestamp_b + 5 * 60) )
GROUP BY a.id

Sample result (id, timestamp_a, timestamp_b, diff):

1   1349878538  1349878539  1
2   1349878679  1349878539  2
3   1349878724  1349878539  1
5   1349878836  1349878539  1
6   1349878890  1349878641  1

Table a

CREATE TABLE `a` (
`id`  int(11) NOT NULL AUTO_INCREMENT ,
`timestamp_a`  bigint(20) NULL DEFAULT NULL ,
PRIMARY KEY (`id`),
INDEX `a` (`timestamp_a`) USING BTREE 
)

Table b

CREATE TABLE `b` (
`id`  int(11) NOT NULL AUTO_INCREMENT ,
`timestamp_b`  bigint(20) NULL DEFAULT NULL ,
PRIMARY KEY (`id`),
INDEX `b` (`timestamp_b`) USING BTREE 
)

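The two tables are not related to each other. I am searching table 'a' for records whose timestamp falls within the range around a timestamp in table 'b'.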

Edit: a simple solution (runs very fast):

SELECT id, MIN(ABS(timestamp_a - timestamp_b))
FROM (SELECT id, timestamp, (timestamp - 5 * 60) timestamp_a, (timestamp + 5 * 60) timestamp_b) a
INNER JOIN b ON ( timestamp between timestamp_a AND timestamp_b )
GROUP BY id

1 Answer

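Adopting Michael's convention for the modified timestamp columns, this query will produce the intended result of the original query, with the performance of the "faster" query above: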

SELECT a.id, MIN(ABS(a.timestamp_a - tmp_b.timestamp_b))
FROM (SELECT id, timestamp_b, (timestamp_b - 5 * 60) timestamp_b_minus, (timestamp_b + 5 * 60) timestamp_b_plus FROM b) tmp_b
INNER JOIN a ON ( a.timestamp_a between tmp_b.timestamp_b_minus AND tmp_b.timestamp_b_plus )
GROUP BY a.id
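
To check that the derived-table form returns the same rows as the original join, here is a minimal sketch that runs both against an in-memory SQLite database through Python's sqlite3 module. SQLite stands in for MySQL only to keep the example self-contained, and the sample rows, the index names idx_a and idx_b, and the added ORDER BY are assumptions for illustration, not the asker's data or schema details.

```python
import sqlite3

# In-memory SQLite database standing in for the MySQL tables (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (id INTEGER PRIMARY KEY, timestamp_a INTEGER);
    CREATE TABLE b (id INTEGER PRIMARY KEY, timestamp_b INTEGER);
    CREATE INDEX idx_a ON a (timestamp_a);
    CREATE INDEX idx_b ON b (timestamp_b);
""")

# Made-up sample rows, not the data from the question.
conn.executemany("INSERT INTO a (timestamp_a) VALUES (?)",
                 [(1000,), (1250,), (1700,), (5000,)])
conn.executemany("INSERT INTO b (timestamp_b) VALUES (?)",
                 [(1010,), (1400,), (2000,)])

# Original form: the +/- 5 minute bounds are computed inside the ON clause.
ORIGINAL = """
    SELECT a.id, MIN(ABS(a.timestamp_a - b.timestamp_b))
    FROM a
      INNER JOIN b ON a.timestamp_a BETWEEN b.timestamp_b - 5 * 60
                                        AND b.timestamp_b + 5 * 60
    GROUP BY a.id
    ORDER BY a.id
"""

# Rewritten form: the bounds are precomputed in the derived table tmp_b.
REWRITTEN = """
    SELECT a.id, MIN(ABS(a.timestamp_a - tmp_b.timestamp_b))
    FROM (SELECT timestamp_b,
                 timestamp_b - 5 * 60 AS timestamp_b_minus,
                 timestamp_b + 5 * 60 AS timestamp_b_plus
          FROM b) tmp_b
      INNER JOIN a ON a.timestamp_a BETWEEN tmp_b.timestamp_b_minus
                                        AND tmp_b.timestamp_b_plus
    GROUP BY a.id
    ORDER BY a.id
"""

original_rows = conn.execute(ORIGINAL).fetchall()
rewritten_rows = conn.execute(REWRITTEN).fetchall()
print(original_rows)
print(rewritten_rows)
```

This only checks that the two forms agree on the result. It says nothing about timing on MySQL, which is better judged by running EXPLAIN on the real tables.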

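The original query runs into performance limits because the formula used in the ON clause forces the RDBMS to perform a full table scan of b for every row of a.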

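Although the "faster" query still requires a full table scan of b to build the "temporary" table tmp_b, it can then use the index on a.timestamp_a to pull the appropriate rows from a, based on the condition a.timestamp_a BETWEEN tmp_b.timestamp_b_minus AND tmp_b.timestamp_b_plus.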
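
The difference between the two access patterns can be sketched in plain Python. This is a rough model only, not a description of how MySQL actually executes either query: a sorted list searched with bisect stands in for the BTREE index on a.timestamp_a, and the function names and sample rows are made up for illustration.

```python
from bisect import bisect_left, bisect_right

WINDOW = 5 * 60  # the +/- 5 minute window from the queries, in seconds


def min_diff_nested_loop(a_rows, b_times):
    """Model of a full scan: every row of a is compared with every row of b."""
    best = {}
    for a_id, ts_a in a_rows:
        for ts_b in b_times:
            if ts_b - WINDOW <= ts_a <= ts_b + WINDOW:
                diff = abs(ts_a - ts_b)
                if a_id not in best or diff < best[a_id]:
                    best[a_id] = diff
    return best


def min_diff_range_lookup(a_rows, b_times):
    """Model of an index range lookup: scan b once, binary-search a per row."""
    # Sorted (timestamp, id) pairs play the role of the index on timestamp_a.
    index = sorted((ts_a, a_id) for a_id, ts_a in a_rows)
    keys = [ts_a for ts_a, _ in index]
    best = {}
    for ts_b in b_times:
        # Slice bounds for keys inside [ts_b - WINDOW, ts_b + WINDOW], inclusive.
        lo = bisect_left(keys, ts_b - WINDOW)
        hi = bisect_right(keys, ts_b + WINDOW)
        for ts_a, a_id in index[lo:hi]:
            diff = abs(ts_a - ts_b)
            if a_id not in best or diff < best[a_id]:
                best[a_id] = diff
    return best


# Made-up sample rows: (id, timestamp_a) pairs and timestamp_b values.
a_rows = [(1, 1000), (2, 1250), (3, 1700), (4, 5000)]
b_times = [1010, 1400, 2000]

nested = min_diff_nested_loop(a_rows, b_times)
ranged = min_diff_range_lookup(a_rows, b_times)
print(nested)
print(ranged)
```

Both functions return the same mapping from id to minimum difference. The second one touches only the rows of a that fall inside each window, which is the saving an index range lookup is meant to provide.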

Answered 2013-01-04T16:24:40.257