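I want to aggregate columns over a window that is ordered by a timestamp.

Here is an example:

The table has the columns col1, col2, ..., col_ts (col_ts is a timestamp column).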

SELECT
SUM(col1) OVER (ORDER BY col_ts ROWS BETWEEN 2 PRECEDING AND 2 FOLLOWING) SUM1,
SUM(col2) OVER (ORDER BY col_ts ROWS BETWEEN 2 PRECEDING AND 2 FOLLOWING) SUM2
FROM ...

Now I want to sum over the 2 PRECEDING and 2 FOLLOWING rows only where the difference between the timestamps is <= 5 minutes.

For example, take these timestamp values:

14.09.15 15:44:00
14.09.15 15:50:00
14.09.15 15:51:00
14.09.15 15:52:00
14.09.15 15:53:00

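For the row with the timestamp value "14.09.15 15:51:00", I want to sum the values from 15:50 to 15:53 only, because the difference between 15:50 and 15:44 is greater than 5 minutes.

Is there a way to write such a condition in the OVER clause?

Or does anyone have a good, performant solution for this?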

2 Answers

OK, I see the problem here. Thanks, Florin. What about some preprocessing? I found a solution, but I'm not sure whether there is a faster one:

select col_ts, 
       n, 
       SUM(n) OVER (ORDER BY col_ts ROWS BETWEEN LEFT_VALUE PRECEDING AND RIGHT_VALUE FOLLOWING) MY_SUM,
       SUM(n) OVER (ORDER BY col_ts RANGE BETWEEN interval '5' second PRECEDING AND interval '5' second FOLLOWING) OLD_SUM
from (
       select col_ts,
              n,
              CASE
              WHEN (LEAD(col_ts,1) OVER (ORDER BY col_ts ) - col_ts) <= INTERVAL '5' second 
              THEN 
                   CASE
                   WHEN (LEAD(col_ts,2) OVER (ORDER BY col_ts ) - LEAD(col_ts,1) OVER (ORDER BY col_ts )) <= INTERVAL '5' second 
                   THEN 2 
                   ELSE 1
                   END
             ELSE 0
             END AS RIGHT_VALUE,
             CASE 
             WHEN (col_ts - LAG(col_ts,1) OVER (ORDER BY col_ts ) ) <= INTERVAL '5' second 
             THEN 
                  CASE 
                  WHEN (LAG(col_ts,1) OVER (ORDER BY col_ts ) - LAG(col_ts,2) OVER (ORDER BY col_ts )) <= INTERVAL '5' second 
                  THEN 2 
                  ELSE 1
                  END
            ELSE 0
            END AS LEFT_VALUE
      from fg_test
  );

Result:

COL_TS                           N   MY_SUM      OLD_SUM
---------------------------  -----  -------  -----------
15.09.15 09:34:24,069000000      1        6            6
15.09.15 09:34:28,000000000      2       10           15
15.09.15 09:34:29,000000000      3       15           15
15.09.15 09:34:30,000000000      4       14           14
15.09.15 09:34:31,000000000      5       12           14
15.09.15 09:34:37,000000000      6        6            6
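
A possible alternative to computing LEFT_VALUE and RIGHT_VALUE per row, as a sketch only: mark every row whose gap to the previous row exceeds the threshold, turn those marks into a running group number, and then apply the fixed 2 PRECEDING / 2 FOLLOWING window inside each group. This assumes col_ts is a TIMESTAMP (so that subtracting two values yields an interval) and reuses the fg_test table and the 5-second threshold from the query above. The names new_grp and grp are made up for this sketch, and it has not been benchmarked against the query above.

```sql
-- Sketch (Oracle): group rows into runs with gaps of at most 5 seconds,
-- then sum over 2 preceding / 2 following rows within each run only.
select col_ts,
       n,
       SUM(n) OVER (PARTITION BY grp
                    ORDER BY col_ts
                    ROWS BETWEEN 2 PRECEDING AND 2 FOLLOWING) MY_SUM
from (
       select col_ts,
              n,
              -- running count of "gap too large" marks = group number (hypothetical name)
              SUM(new_grp) OVER (ORDER BY col_ts) grp
       from (
              select col_ts,
                     n,
                     -- 1 when the gap to the previous row is larger than 5 seconds;
                     -- the first row has no previous row (NULL gap) and gets 0
                     CASE
                     WHEN (col_ts - LAG(col_ts,1) OVER (ORDER BY col_ts)) > INTERVAL '5' second
                     THEN 1
                     ELSE 0
                     END new_grp
              from fg_test
            )
     );
```

On the six sample rows, the first five fall into one group and the last row into its own, so this should return the same values as MY_SUM above. For the 5-minute rule from the question, the interval would become INTERVAL '5' minute.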

What do you think?

answered 2015-09-15T08:32:33.643

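I think this is too much for plain SQL. You can limit the number of elements in the window, or you can limit the values in some way (see below), but not both at once.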

drop table fg_test;
create table fg_test(col_ts timestamp, n number);

-- n/1440/60 of a day is n seconds, so the rows lie about 0, 4, 5, 6, 7 and 13 seconds after "now"
insert into fg_test values (systimestamp, 1);
insert into fg_test values (systimestamp+4/1440/60, 2);
insert into fg_test values (systimestamp+5/1440/60, 3);
insert into fg_test values (systimestamp+6/1440/60, 4);
insert into fg_test values (systimestamp+7/1440/60, 5);
insert into fg_test values (systimestamp+13/1440/60, 6);

select col_ts, n, 
  SUM(n) OVER (ORDER BY col_ts ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING) SUM1,
  SUM(n) OVER (ORDER BY col_ts RANGE BETWEEN current row AND interval '5' second FOLLOWING) SUMNEW
from fg_test;

Result:

COL_TS                                   N       SUM1     SUMNEW
------------------------------- ---------- ---------- ----------
14-SEP-15 06.16.28.825395000 PM          1          3          3 
14-SEP-15 06.16.33.000000000 PM          2          6         14 
14-SEP-15 06.16.34.000000000 PM          3          9         12 
14-SEP-15 06.16.35.000000000 PM          4         12          9 
14-SEP-15 06.16.36.000000000 PM          5         15          5 
14-SEP-15 06.16.42.000000000 PM          6         11          6 

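(Sorry for not reproducing the exact example from your question.)

Another approach is to write some PL/SQL (open a cursor and do the processing there).

answered 2015-09-14T15:19:51.850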