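The expensive part is that the correlated subqueries have to compute the time difference for every single row of each temperature_* table, just to find the one row with the closest time for one column of one row in the main query.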
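It would be dramatically faster if you could use an index to pick just one row after and one row before the current time, and compute the time difference only for those two candidates. All you need to make that fast is an index on the column time in your tables.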
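A minimal sketch of those indexes, assuming none of the three tables has an index on time yet. The index names are made up for illustration; skip any table where time is already the primary key or otherwise indexed.

```sql
-- Hypothetical index names; only the indexed column (time) matters.
CREATE INDEX idx_temperature_1_time ON temperature_1 (time);
CREATE INDEX idx_temperature_2_time ON temperature_2 (time);
CREATE INDEX idx_temperature_3_time ON temperature_3 (time);
```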
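I am ignoring the column zone, since its role remains unclear in the question and it would only add noise to the core problem. It should be easy to add to the query.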
Without an additional view, this query does everything at once:
SELECT time
      ,COALESCE(temp1
         ,CASE WHEN timediff(time, time1a) > timediff(time1b, time) THEN
             (SELECT t.temperature
              FROM   temperature_1 t
              WHERE  t.time = y.time1b)
          ELSE
             (SELECT t.temperature
              FROM   temperature_1 t
              WHERE  t.time = y.time1a)
          END) AS temp1
      ,COALESCE(temp2
         ,CASE WHEN timediff(time, time2a) > timediff(time2b, time) THEN
             (SELECT t.temperature
              FROM   temperature_2 t
              WHERE  t.time = y.time2b)
          ELSE
             (SELECT t.temperature
              FROM   temperature_2 t
              WHERE  t.time = y.time2a)
          END) AS temp2
      ,COALESCE(temp3
         ,CASE WHEN timediff(time, time3a) > timediff(time3b, time) THEN
             (SELECT t.temperature
              FROM   temperature_3 t
              WHERE  t.time = y.time3b)
          ELSE
             (SELECT t.temperature
              FROM   temperature_3 t
              WHERE  t.time = y.time3a)
          END) AS temp3
FROM  (
   SELECT time
         ,max(t1) AS temp1
         ,max(t2) AS temp2
         ,max(t3) AS temp3
         ,CASE WHEN max(t1) IS NULL THEN
            (SELECT t.time FROM temperature_1 t
             WHERE  t.time < x.time
             ORDER  BY t.time DESC LIMIT 1) ELSE NULL END AS time1a
         ,CASE WHEN max(t1) IS NULL THEN
            (SELECT t.time FROM temperature_1 t
             WHERE  t.time > x.time
             ORDER  BY t.time LIMIT 1) ELSE NULL END AS time1b
         ,CASE WHEN max(t2) IS NULL THEN
            (SELECT t.time FROM temperature_2 t
             WHERE  t.time < x.time
             ORDER  BY t.time DESC LIMIT 1) ELSE NULL END AS time2a
         ,CASE WHEN max(t2) IS NULL THEN
            (SELECT t.time FROM temperature_2 t
             WHERE  t.time > x.time
             ORDER  BY t.time LIMIT 1) ELSE NULL END AS time2b
         ,CASE WHEN max(t3) IS NULL THEN
            (SELECT t.time FROM temperature_3 t
             WHERE  t.time < x.time
             ORDER  BY t.time DESC LIMIT 1) ELSE NULL END AS time3a
         ,CASE WHEN max(t3) IS NULL THEN
            (SELECT t.time FROM temperature_3 t
             WHERE  t.time > x.time
             ORDER  BY t.time LIMIT 1) ELSE NULL END AS time3b
   FROM  (
      SELECT time, temperature AS t1, NULL AS t2, NULL AS t3 FROM temperature_1
      UNION ALL
      SELECT time, NULL AS t1, temperature AS t2, NULL AS t3 FROM temperature_2
      UNION ALL
      SELECT time, NULL AS t1, NULL AS t2, temperature AS t3 FROM temperature_3
      ) AS x
   GROUP  BY time
   ) y
ORDER  BY time;
-> sqlfiddle
Explanation
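Subquery x replaces your view temptimes and carries the temperatures along into the result. If all three tables are in sync and have temperatures for the same points in time, the rest is not even needed and the query is very fast.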
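To see where the gaps are before any nearest-row lookup happens, the inner part can be run by itself. This is only the union and the aggregation lifted out of the full query, nothing new:

```sql
-- One row per distinct time across all three tables.
-- A NULL in temp1, temp2 or temp3 marks a gap that still needs
-- the closest row from that table.
SELECT time
      ,max(t1) AS temp1
      ,max(t2) AS temp2
      ,max(t3) AS temp3
FROM  (
   SELECT time, temperature AS t1, NULL AS t2, NULL AS t3 FROM temperature_1
   UNION ALL
   SELECT time, NULL AS t1, temperature AS t2, NULL AS t3 FROM temperature_2
   UNION ALL
   SELECT time, NULL AS t1, NULL AS t2, temperature AS t3 FROM temperature_3
   ) AS x
GROUP  BY time
ORDER  BY time;
```

If this returns no NULL values at all, the tables are in sync and none of the lookups would ever fire.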
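For every point in time where one of the three tables has no row, the temperature is fetched as instructed: take the "closest" one from that table.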
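Subquery y aggregates the rows from x and, for each table that is missing a temperature at the current time, fetches the previous time (time1a) and the next time (time1b).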
These lookups should be fast using the index.
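One way to check that claim is to run a single lookup through EXPLAIN. The timestamp literal below is a made-up sample value, and the sketch assumes time is a DATETIME or TIMESTAMP column:

```sql
-- "Previous time" lookup for one table, as used for time1a.
-- '2012-09-28 12:00:00' is a placeholder, not data from the question.
EXPLAIN
SELECT t.time
FROM   temperature_1 t
WHERE  t.time < '2012-09-28 12:00:00'
ORDER  BY t.time DESC
LIMIT  1;
```

With an index on time in place, I would expect the plan to show that index being used instead of a full table scan, but the exact output depends on your MySQL version and data.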
The final query fetches the temperature from the row with the closest time for each temperature that is actually missing.
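The core idea, reduced to one table and one point in time, looks roughly like this. The user variable @ts and its value are assumptions for illustration, and a DATETIME or TIMESTAMP column is assumed:

```sql
-- Made-up sample timestamp.
SET @ts = '2012-09-28 12:00:00';

-- At most two candidate rows: the closest one at or before @ts
-- and the closest one after it. Only these two are compared.
SELECT c.time, c.temperature
FROM  (
   (SELECT time, temperature
    FROM   temperature_1
    WHERE  time <= @ts
    ORDER  BY time DESC
    LIMIT  1)
   UNION ALL
   (SELECT time, temperature
    FROM   temperature_1
    WHERE  time > @ts
    ORDER  BY time
    LIMIT  1)
   ) AS c
ORDER  BY ABS(TIMESTAMPDIFF(SECOND, c.time, @ts))
LIMIT  1;
```

The full query does the same thing per row and per table, just spread over two subquery levels.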
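This query could be simpler if MySQL allowed referencing columns from more than one level above the current subquery. But it does not. It works just fine in PostgreSQL: -> sqlfiddle
It would also be simpler if a correlated subquery could return more than one column, but I don't know how to do that in MySQL.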
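It would be much simpler with CTEs and window functions, but MySQL before version 8.0 does not support these modern SQL features (unlike other relevant RDBMS).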
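For MySQL 8.0 or later, which does accept CTEs and window functions, a rough sketch for temp1 alone could look like this. It is untested against your data, assumes time is unique within temperature_1, and the CTE names x and y simply mirror the subqueries above. temp2 and temp3 would follow the same pattern.

```sql
-- Sketch for MySQL 8.0+ only; handles temp1, not temp2 or temp3.
WITH x AS (
   SELECT time, max(t1) AS temp1
   FROM  (
      SELECT time, temperature AS t1 FROM temperature_1
      UNION ALL
      SELECT time, NULL FROM temperature_2
      UNION ALL
      SELECT time, NULL FROM temperature_3
      ) AS u
   GROUP  BY time
), y AS (
   SELECT time, temp1
          -- latest time up to this row that has a temp1 value
         ,max(CASE WHEN temp1 IS NOT NULL THEN time END)
             OVER (ORDER BY time
                   ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS time1a
          -- earliest time from this row on that has a temp1 value
         ,min(CASE WHEN temp1 IS NOT NULL THEN time END)
             OVER (ORDER BY time
                   ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING) AS time1b
   FROM   x
)
SELECT y.time
      ,COALESCE(y.temp1, t.temperature) AS temp1
FROM   y
LEFT   JOIN temperature_1 t
       ON t.time = CASE WHEN y.time1a IS NULL
                          OR timediff(y.time, y.time1a) > timediff(y.time1b, y.time)
                        THEN y.time1b
                        ELSE y.time1a END
ORDER  BY y.time;
```

The window frames replace the per-row correlated lookups, so the tables are read once instead of being probed for every gap.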