
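My problem is simple: I have a table containing a series of statuses and timestamps (for the curious, the statuses represent alarm levels), and I want to query it to get the duration between two statuses.

It looks simple, but here is the tricky part: I can't create lookup tables or procedures, and the query should be as fast as possible, because this table is a little monster with more than 1 billion records (no kidding!)...

The schema is very simple:

[pk] Time    Value

(There is actually a second pk, but it is of no use here.)

Here is a real-world example: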

Timestamp            Status
2013-1-1 00:00:00    1
2013-1-1 00:00:05    2
2013-1-1 00:00:10    2
2013-1-1 00:00:15    2
2013-1-1 00:00:20    0
2013-1-1 00:00:25    1
2013-1-1 00:00:30    2
2013-1-1 00:00:35    2
2013-1-1 00:00:40    0

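Considering only level 2 alarms, the output should report the start of each level 2 alarm and its end (when the status reaches 0), like this:

StartTime            EndTime              Interval
2013-1-1 00:00:05    2013-1-1 00:00:20    15
2013-1-1 00:00:30    2013-1-1 00:00:40    10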

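I have been trying all sorts of inner joins, but every one of them leads to a spectacular Cartesian explosion. Can you help me figure out a way to do this?

Thanks!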

4 Answers

This has to be one of the harder questions I've seen today. Thanks! I'm assuming you can use CTEs? If so, try something like this:

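;WITH Filtered
AS
(
    -- Number every row in timestamp order so neighbouring rows can be joined on RN.
    SELECT ROW_NUMBER() OVER (ORDER BY dateField) RN, dateField, Status
    FROM Test
)
SELECT F1.RN, F3.MinRN,
    F1.dateField StartDate,
    F2.dateField Enddate
FROM Filtered F1, Filtered F2, (
    -- F1a: a status 2 row whose predecessor (F2a) is not status 2, i.e. the start of a run.
    -- MinRN: the first later row (F3a) that is not status 2, i.e. the end of that run.
    SELECT F1a.RN, MIN(F3a.RN) as MinRN
    FROM Filtered F1a
        JOIN Filtered F2a ON F1a.RN = F2a.RN+1 AND F1a.Status = 2 AND F2a.Status <> 2
        JOIN Filtered F3a ON F1a.RN < F3a.RN AND F3a.Status <> 2
    GROUP BY F1a.RN ) F3
WHERE F1.RN = F3.RN AND F2.RN = F3.MinRN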

And here is the Fiddle. I didn't add the interval, but I think you can handle that part from here.
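For readers who want the interval column as well, the following is a minimal sketch rather than the answerer's own code. It assumes SQL Server 2008 or later, recreates the question's sample rows in a `Test` table with the `dateField` and `Status` columns used above, and introduces the alias `IntervalSeconds` purely for illustration.

```sql
-- Assumed setup: the question's sample rows, using the names from the query above.
CREATE TABLE Test (dateField datetime PRIMARY KEY, Status int NOT NULL);

INSERT INTO Test (dateField, Status) VALUES
    ('2013-01-01T00:00:00', 1),
    ('2013-01-01T00:00:05', 2),
    ('2013-01-01T00:00:10', 2),
    ('2013-01-01T00:00:15', 2),
    ('2013-01-01T00:00:20', 0),
    ('2013-01-01T00:00:25', 1),
    ('2013-01-01T00:00:30', 2),
    ('2013-01-01T00:00:35', 2),
    ('2013-01-01T00:00:40', 0);

;WITH Filtered AS
(
    SELECT ROW_NUMBER() OVER (ORDER BY dateField) AS RN, dateField, Status
    FROM Test
)
SELECT F1.dateField AS StartDate,
       F2.dateField AS EndDate,
       -- IntervalSeconds is an illustrative alias, not part of the original answer.
       DATEDIFF(second, F1.dateField, F2.dateField) AS IntervalSeconds
FROM (
        -- Start of each status 2 run, paired with the first later row that is not status 2.
        SELECT F1a.RN, MIN(F3a.RN) AS MinRN
        FROM Filtered F1a
        JOIN Filtered F2a ON F1a.RN = F2a.RN + 1 AND F1a.Status = 2 AND F2a.Status <> 2
        JOIN Filtered F3a ON F1a.RN < F3a.RN AND F3a.Status <> 2
        GROUP BY F1a.RN
     ) F3
JOIN Filtered F1 ON F1.RN = F3.RN
JOIN Filtered F2 ON F2.RN = F3.MinRN
ORDER BY F1.dateField;
```

On the sample rows this should return the two intervals from the question (15 and 10 seconds). Note that the run ends at the first status other than 2, which coincides with status 0 in the sample data but may not in general.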

Good luck.

answered 2013-01-24T20:51:14.527

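I finally found a version I'm satisfied with. I remembered an answer to another question (I can't recall which one) pointing out that the difference between two (increasing) sequences is always a constant. Here that means `row - grp` stays the same for every row in a consecutive run of one status.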

WITH Ordered (occurredAt, status, row, grp) 
             as (SELECT occurredAt, status, 
                        -- position among all rows
                        ROW_NUMBER() OVER (ORDER BY occurredat), 
                        -- position among rows with the same status
                        ROW_NUMBER() OVER (PARTITION BY status 
                                           ORDER BY occurredat)
                 FROM Alert)

SELECT Event.startDate, Ending.occurredAt as endDate,
       DATEDIFF(second, Event.startDate, Ending.occurredAt) as interval

-- one row per consecutive run of status 2: its first timestamp and last row number
FROM (SELECT MIN(occurredAt) as startDate, MAX(row) as ending
      FROM Ordered
      WHERE status = 2
      GROUP BY row - grp) Event

-- the row right after the run is its end; NULL when the run is the last thing in the table
LEFT JOIN (SELECT occurredAt, row
           FROM Ordered
           WHERE status != 2) Ending
        ON Event.ending + 1 = Ending.row

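(Working SQL Fiddle example, with some extra data rows for checking the work.)

Unfortunately, this doesn't correctly handle level 2 statuses that are the final rows (the behavior is unspecified), although it does list them.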
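The constant-difference idea can be checked directly on the question's sample rows. The following is a minimal sketch, assuming SQL Server 2008 or later and the `Alert` table from the query above; the aliases `rowNum` and `island` are illustrative names, not part of the original answer.

```sql
-- Assumed setup: the question's sample rows in the Alert table used above.
CREATE TABLE Alert (occurredAt datetime PRIMARY KEY, status int NOT NULL);

INSERT INTO Alert (occurredAt, status) VALUES
    ('2013-01-01T00:00:00', 1),
    ('2013-01-01T00:00:05', 2),
    ('2013-01-01T00:00:10', 2),
    ('2013-01-01T00:00:15', 2),
    ('2013-01-01T00:00:20', 0),
    ('2013-01-01T00:00:25', 1),
    ('2013-01-01T00:00:30', 2),
    ('2013-01-01T00:00:35', 2),
    ('2013-01-01T00:00:40', 0);

-- Show both sequences and their difference for every row.
SELECT occurredAt,
       status,
       ROW_NUMBER() OVER (ORDER BY occurredAt) AS rowNum,
       ROW_NUMBER() OVER (PARTITION BY status ORDER BY occurredAt) AS grp,
       ROW_NUMBER() OVER (ORDER BY occurredAt)
         - ROW_NUMBER() OVER (PARTITION BY status ORDER BY occurredAt) AS island
FROM Alert
ORDER BY occurredAt;
```

On the sample data, the status 2 rows from 00:00:05 through 00:00:15 all get `island` = 1, and the two at 00:00:30 and 00:00:35 get `island` = 3, so grouping by that difference separates the two alarms. The difference is only unique within a single status (the 0 row at 00:00:20 and the 1 row at 00:00:25 both get 4), which is presumably why the query filters on `status = 2` before grouping.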

answered 2013-01-24T21:42:49.837

Just to offer an alternative. I tried to run some performance tests but didn't finish them.

SELECT
  -- earliest status 2 row sharing this end, i.e. the start of the run
  MIN([main].[Start]) AS [Start],
  [main].[End],
  DATEDIFF(s, MIN([main].[Start]), [main].[End]) AS [Seconds]
FROM
(
  SELECT
    [sub].[Start],
    -- first later row with a different status, i.e. the end for this status 2 row
    MIN([sub].[End]) AS [End]
  FROM
  (
    -- pair every status 2 row with every later row that is not status 2
    SELECT
      [start].[Timestamp] AS [Start],
      [start].[Status] AS [StartingStatus],
      [end].[Timestamp] AS [End],
      [end].[Status] AS [EndingStatus]
    FROM [Alerts] [start],  [Alerts] [end]
    WHERE [start].[Status] = 2 
      AND [start].[Timestamp] < [end].[Timestamp]
      AND [start].[Status] <> [end].[Status]
  ) AS [sub]
  GROUP BY
    [sub].[Start],
    [sub].[StartingStatus]
) AS [main]
GROUP BY
  [main].[End]

Here is a Fiddle.

answered 2013-01-24T22:29:42.043

I do something similar by using an id that is the table's identity column.

    create table test(id int primary key identity(1,1),timstamp datetime,val int)

    insert into test(timstamp,val) Values('1/1/2013 00:00:00',1)
    insert into test(timstamp,val) Values('1/1/2013 00:00:05',2)
    insert into test(timstamp,val) Values('1/1/2013 00:00:25',1)
    insert into test(timstamp,val) Values('1/1/2013 00:00:30',2)
    insert into test(timstamp,val) Values('1/1/2013 00:00:35',1)

    -- pair each row with the next one by id; this relies on the identity values
    -- being contiguous and following timestamp order
    select t1.timstamp,t1.val,DATEDIFF(s,t1.timstamp,t2.timstamp) 
    from test t1 left join test t2 on t1.id=t2.id-1

    drop table test

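I would also store the timestamp as the number of seconds since 1980 or 2000. You probably don't want to do the reverse conversion all the time, though, so it depends on how often you use the actual timestamp.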
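As a rough illustration of that conversion, here is a minimal sketch assuming SQL Server 2008 or later and an epoch of 2000-01-01; the variable names `@epoch`, `@ts`, and `@secs` are made up for the example.

```sql
-- Hypothetical epoch and sample timestamp, chosen only for illustration.
DECLARE @epoch datetime = '2000-01-01T00:00:00';
DECLARE @ts    datetime = '2013-01-01T00:00:05';

-- Forward conversion: whole seconds elapsed since the epoch.
DECLARE @secs int = DATEDIFF(second, @epoch, @ts);

-- Reverse conversion: add the seconds back onto the epoch to recover the timestamp.
SELECT @secs AS SecondsSinceEpoch,
       DATEADD(second, @secs, @epoch) AS RoundTrip;
```

Because `DATEDIFF` returns an `int`, a difference in seconds overflows after roughly 68 years from the chosen epoch, which is worth keeping in mind when picking 1980 versus 2000.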

answered 2013-01-24T21:34:06.023