它可以在单个 SQL 语句中完成。这是证据。
设置
CREATE TEMP TABLE eventtable
(
name CHAR(3) NOT NULL,
time DATETIME HOUR TO MINUTE NOT NULL,
state CHAR(8) NOT NULL
);
INSERT INTO eventtable(name, time, state) VALUES('a', '10:00', 'login');
INSERT INTO eventtable(name, time, state) VALUES('b', '10:05', 'login');
INSERT INTO eventtable(name, time, state) VALUES('a', '10:06', 'chatting');
INSERT INTO eventtable(name, time, state) VALUES('a', '10:08', 'Idle');
INSERT INTO eventtable(name, time, state) VALUES('b', '10:11', 'chatting');
INSERT INTO eventtable(name, time, state) VALUES('a', '10:10', 'Logout');
INSERT INTO eventtable(name, time, state) VALUES('b', '10:12', 'Logout');
INSERT INTO eventtable(name, time, state) VALUES('c', '10:01', 'login');
INSERT INTO eventtable(name, time, state) VALUES('c', '10:02', 'chatting');
INSERT INTO eventtable(name, time, state) VALUES('c', '10:03', 'Idle');
INSERT INTO eventtable(name, time, state) VALUES('c', '10:04', 'Logout');
INSERT INTO eventtable(name, time, state) VALUES('c', '10:05', 'Idle');
INSERT INTO eventtable(name, time, state) VALUES('c', '10:06', 'Logout');
INSERT INTO eventtable(name, time, state) VALUES('c', '10:07', 'Idle');
INSERT INTO eventtable(name, time, state) VALUES('c', '10:08', 'Logout');
INSERT INTO eventtable(name, time, state) VALUES('c', '10:09', 'login');
INSERT INTO eventtable(name, time, state) VALUES('c', '10:11', 'chatting');
INSERT INTO eventtable(name, time, state) VALUES('c', '10:12', 'Idle');
INSERT INTO eventtable(name, time, state) VALUES('c', '10:13', 'chatting');
INSERT INTO eventtable(name, time, state) VALUES('c', '10:14', 'Idle');
INSERT INTO eventtable(name, time, state) VALUES('c', '10:15', 'Logout');
正确查询
注意条件。结果表必须排除“登录”和第一个其他事件之间的时间段;此外,它必须排除“注销”和下一个事件(可能是“登录”)之间的时间段。名称列上的表之间的自连接和时间列上的非对称连接(使用' <
')确保事件按时间顺序排列。NOT EXISTS 子选择确保只考虑相邻事件。在子查询中使用 BETWEEN AND 是一个错误,因为它包含了它的端点,并且将r1.time
和r2.time
排除在范围之外是至关重要的;我花了几分钟才发现那个错误(查询运行但没有返回任何行,但为什么?)!
SELECT r1.name, r1.state, r2.TIME - r1.TIME AS duration
FROM eventtable r1, eventtable r2
WHERE r1.name = r2.name
AND r1.time < r2.time
AND r1.state != 'login'
AND r1.state != 'Logout'
AND r1.time BETWEEN DATETIME(10:00) HOUR TO MINUTE
AND DATETIME(10:15) HOUR TO MINUTE
AND r2.time BETWEEN DATETIME(10:00) HOUR TO MINUTE
AND DATETIME(10:15) HOUR TO MINUTE
AND NOT EXISTS (SELECT 1 FROM eventtable r3
WHERE r3.time > r1.time AND r3.time < r2.time
AND r3.name = r1.name
AND r3.name = r2.name);
这产生了答案:
name state duration
a chatting 0:02
a Idle 0:02
b chatting 0:01
c chatting 0:01
c Idle 0:01
c Idle 0:01
c Idle 0:01
c chatting 0:01
c Idle 0:01
c chatting 0:01
c Idle 0:01
'duration' 值是一个 INTERVAL HOUR TO MINUTE;如果您想在几分钟内获得一个值,则必须使用强制转换来转换它(使用 4 表示精度,以允许长达 1440 分钟或 1 天的间隔;对于较长的时间范围,数据不明确):
(r2.time - r1.time)::INTERVAL MINUTE(4) TO MINUTE
或者:
CAST (r2.time - r1.time AS INTERVAL MINUTE(4) TO MINUTE)
IBM Informix Dynamic Server (IDS) 有非常冗长的时间常数表示法。在标准 SQL 中,您可以使用 TIME 作为类型,使用 TIME '10:00:00' 作为值,但在严格的标准 SQL 中,秒是必需的。IDS 确实提供了人们想要的确切类型 - 例如 DATETIME HOUR TO MINUTE。您还可以在标准 SQL 中编写 INTERVAL MINUTE(4);'TO MINUTE' 应该是可选的。
查询不正确
在我对 Ray Hidayat 的回答的评论中,我指出 EXISTS 子查询对于确保正在考虑的事件是连续的 - 没有中间事件是必要的。这是在输出中添加了开始和结束时间的相同查询,并且缺少 EXISTS 子句(并且 'duration' 重命名为 'lapse'):
SELECT r1.name, r1.state, r2.TIME - r1.TIME AS lapse,
r1.time AS start, r2.time AS end
FROM eventtable r1, eventtable r2
WHERE r1.name = r2.name
AND r1.time < r2.time
AND r1.state != 'login'
AND r1.state != 'Logout'
AND r1.time BETWEEN DATETIME(10:00) HOUR TO MINUTE
AND DATETIME(10:15) HOUR TO MINUTE
AND r2.time BETWEEN DATETIME(10:00) HOUR TO MINUTE
AND DATETIME(10:15) HOUR TO MINUTE;
这产生了答案:
name state lapse start end
a chatting 0:04 10:06 10:10
a chatting 0:02 10:06 10:08
a Idle 0:02 10:08 10:10
b chatting 0:01 10:11 10:12
c chatting 0:13 10:02 10:15
c chatting 0:12 10:02 10:14
c chatting 0:11 10:02 10:13
c chatting 0:10 10:02 10:12
c chatting 0:09 10:02 10:11
c chatting 0:07 10:02 10:09
c chatting 0:06 10:02 10:08
c chatting 0:05 10:02 10:07
c chatting 0:04 10:02 10:06
c chatting 0:03 10:02 10:05
c chatting 0:02 10:02 10:04
c chatting 0:01 10:02 10:03
c Idle 0:12 10:03 10:15
c Idle 0:11 10:03 10:14
c Idle 0:10 10:03 10:13
c Idle 0:09 10:03 10:12
c Idle 0:08 10:03 10:11
c Idle 0:06 10:03 10:09
c Idle 0:05 10:03 10:08
c Idle 0:04 10:03 10:07
c Idle 0:03 10:03 10:06
c Idle 0:02 10:03 10:05
c Idle 0:01 10:03 10:04
c Idle 0:10 10:05 10:15
c Idle 0:09 10:05 10:14
c Idle 0:08 10:05 10:13
c Idle 0:07 10:05 10:12
c Idle 0:06 10:05 10:11
c Idle 0:04 10:05 10:09
c Idle 0:03 10:05 10:08
c Idle 0:02 10:05 10:07
c Idle 0:01 10:05 10:06
c Idle 0:08 10:07 10:15
c Idle 0:07 10:07 10:14
c Idle 0:06 10:07 10:13
c Idle 0:05 10:07 10:12
c Idle 0:04 10:07 10:11
c Idle 0:02 10:07 10:09
c Idle 0:01 10:07 10:08
c chatting 0:04 10:11 10:15
c chatting 0:03 10:11 10:14
c chatting 0:02 10:11 10:13
c chatting 0:01 10:11 10:12
c Idle 0:03 10:12 10:15
c Idle 0:02 10:12 10:14
c Idle 0:01 10:12 10:13
c chatting 0:02 10:13 10:15
c chatting 0:01 10:13 10:14
c Idle 0:01 10:14 10:15
这显示了用户“c”的每个符合条件的起始行如何与每个符合条件的结束行匹配,从而产生许多虚假数据行。NOT EXISTS 子查询是处理基于时间的查询时的常见主题。您可以在 Snodgrass 的“在 SQL 中开发面向时间的应用程序”(在 URL 上在线提供 PDF)以及在 Date、Darwen 和 Lorentzos的“时间数据和关系模型”中找到有关这些操作的信息。