I have a table that records the start and end times of events of interest:

CREATE TABLE event_log (start_time DATETIME, end_time DATETIME);
INSERT INTO event_log VALUES
  ('2013-06-03 09:00:00', '2013-06-03 09:00:05'),
  ('2013-06-03 09:00:03', '2013-06-03 09:00:07'),
  ('2013-06-03 09:00:10', '2013-06-03 09:00:12');

+---------------------+---------------------+
| start_time          | end_time            |
+---------------------+---------------------+
| 2013-06-03 09:00:00 | 2013-06-03 09:00:05 |
| 2013-06-03 09:00:03 | 2013-06-03 09:00:07 |
| 2013-06-03 09:00:10 | 2013-06-03 09:00:12 |
+---------------------+---------------------+

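I'm looking for a way to build a "time series" table in which one column is a time index and the other is the number of events in progress at that time. I can do this with a subquery and a generator: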

SET @first_time := (SELECT MIN(start_time) FROM event_log);
SET @last_time := (SELECT MAX(end_time) FROM event_log);

CREATE OR REPLACE VIEW generator_16
AS SELECT 0 n UNION ALL SELECT 1  UNION ALL SELECT 2  UNION ALL 
   SELECT 3   UNION ALL SELECT 4  UNION ALL SELECT 5  UNION ALL
   SELECT 6   UNION ALL SELECT 7  UNION ALL SELECT 8  UNION ALL
   SELECT 9   UNION ALL SELECT 10 UNION ALL SELECT 11 UNION ALL
   SELECT 12  UNION ALL SELECT 13 UNION ALL SELECT 14 UNION ALL 
   SELECT 15;

CREATE TABLE time_series (t DATETIME, event_count INT(11))
SELECT @first_time + INTERVAL n SECOND t, NULL AS event_count
  FROM generator_16
  WHERE @first_time + INTERVAL n SECOND <= @last_time;

UPDATE time_series 
  SET event_count= (SELECT COUNT(*) FROM event_log 
  WHERE start_time<=t AND end_time>=t);

+---------------------+-------------+
| t                   | event_count |
+---------------------+-------------+
| 2013-06-03 09:00:00 |           1 |
| 2013-06-03 09:00:01 |           1 |
| 2013-06-03 09:00:02 |           1 |
| 2013-06-03 09:00:03 |           2 |
| 2013-06-03 09:00:04 |           2 |
| 2013-06-03 09:00:05 |           2 |
| 2013-06-03 09:00:06 |           1 |
| 2013-06-03 09:00:07 |           1 |
| 2013-06-03 09:00:08 |           0 |
| 2013-06-03 09:00:09 |           0 |
| 2013-06-03 09:00:10 |           1 |
| 2013-06-03 09:00:11 |           1 |
| 2013-06-03 09:00:12 |           1 |
+---------------------+-------------+

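Is there a more efficient way to do this? This approach runs one subquery per time index. Is there, for example, an approach that needs only one subquery per event_log record? My real problem has 500k time index entries and 1k events, and it takes a bit longer than I'd like (about 90 seconds).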

The "generator" snippet is from http://use-the-index-luke.com/blog/2011-07-30/mysql-row-generator. Obviously, a larger problem needs one of the larger generators, such as the 64k version or the 1M version.

1 Answer

The count changes only at a start_time or an end_time. So if you run

select distinct start_time As time_point from event_log 
UNION 
select distinct   end_time As time_point from event_log

...you get every "point" at which a snapshot is needed.

If you create that in a temporary table (say, TEMP_POINTS) and join it back to event_log, you should be able to count the events at each "point".

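-- CREATE TABLE ... SELECT matches columns by name, so the count is aliased
-- to event_count and the key column is declared as time_point.
CREATE TABLE NON_ZERO_POINTS (time_point DATETIME, event_count INT(11))
    select time_point, count(*) AS event_count
    from TEMP_POINTS 
    join event_log on time_point between start_time and end_time
    group by time_point;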

It may be worth creating an index on NON_ZERO_POINTS.

You can then use NON_ZERO_POINTS in the update:

UPDATE time_series 
SET event_count= (SELECT NON_ZERO_POINTS.event_count FROM NON_ZERO_POINTS
WHERE NON_ZERO_POINTS.time_point = time_series.t);

Also, do you need to update time_series at all? If not, you can use it directly in a query:

select time_series.t, coalesce(NON_ZERO_POINTS.event_count, 0) AS event_count
from time_series 
left join NON_ZERO_POINTS
on time_series.t = NON_ZERO_POINTS.time_point;
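
As a rough illustration of why only the start and end times matter, the same counts can be produced in one pass over the events plus one pass over the time index. The sketch below is in Python rather than MySQL and is not the exact method described above: it is a difference-array (sweep-line) variant, and the function name event_counts, the step parameter, and the assumption that every timestamp falls on a whole step are all hypothetical choices made for the example. It adds 1 at each start_time and subtracts 1 one step after each end_time, so both endpoints are inclusive, as in the question's start_time<=t AND end_time>=t.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def event_counts(events, step=timedelta(seconds=1)):
    """Return [(t, count), ...] for every step from the earliest start to the latest end.

    `events` is a list of (start_time, end_time) pairs; both ends are inclusive.
    Assumption: every timestamp is aligned to `step` (whole seconds here).
    """
    if not events:
        return []

    # +1 where an event starts, -1 one step after it ends (inclusive end).
    delta = defaultdict(int)
    for start, end in events:
        delta[start] += 1
        delta[end + step] -= 1

    t = min(start for start, _ in events)
    last = max(end for _, end in events)

    running = 0
    series = []
    while t <= last:
        running += delta.get(t, 0)  # the count only changes at recorded points
        series.append((t, running))
        t += step
    return series


# Sample rows from the question's event_log table.
base = datetime(2013, 6, 3, 9, 0, 0)
events = [
    (base, base + timedelta(seconds=5)),
    (base + timedelta(seconds=3), base + timedelta(seconds=7)),
    (base + timedelta(seconds=10), base + timedelta(seconds=12)),
]
series = event_counts(events)
```

For the three sample rows this should reproduce the event_count column shown in the question, and the work grows with the number of events plus the number of time steps rather than their product. Carrying a running total also covers the seconds between change points, for which a lookup table keyed only by start and end times has no rows.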
answered 2013-06-03T15:31:43.937