I am using Postgres 9.2.

I have the following problem:

Time | Value | Device   --  Sum should be
1      v1      1              v1 
2      v2      2              v1 + v2 
3      v3      3              v1 + v2 + v3 
4      v4      2              v1 + v4 + v3
5      v5      2              v1 + v5 + v3
6      v6      1              v6 + v5 + v3
7      v7      3              v6 + v5 + v3

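Essentially, the sum needs to cover the most recent value of each of N devices as of each point in time. In the example above, there are 3 devices.

I have tried several approaches using window functions, without success. I have written a stored procedure that does what I need, but it is slow. The slowness may be due to my inexperience with plpgsql.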

CREATE OR REPLACE FUNCTION timeseries.combine_series(id int[], startTime timestamp, endTime timestamp) 
RETURNS setof RECORD AS $$
DECLARE
    retval   double precision = 0;
    row_data timeseries.total_active_energy%ROWTYPE;
    maxCount integer = 0;
    sz       integer = 0;
    lastVal  double precision[];
    v_rec    RECORD;
BEGIN
    SELECT INTO sz array_length($1,1);

    FOR row_data IN
        SELECT * FROM timeseries.total_active_energy
        WHERE time >= startTime AND time < endTime AND device_id = ANY($1)
        ORDER BY time
    LOOP
        retval = row_data.active_power;
        FOR i IN 1..sz LOOP
            IF $1[i] = row_data.device_id THEN
                lastVal[i] = row_data.active_power;
            ELSE
                retval = retval + COALESCE(lastVal[i],0);
            END IF;
        END LOOP;

        SELECT row_data.time, retval INTO v_rec;

        RETURN NEXT v_rec;
    END LOOP;

    RETURN;
END;
$$ LANGUAGE plpgsql;

Call:

select * from timeseries.combine_series('{552,553,554}'::int[], '2013-05-01'::timestamp, '2013-05-02'::timestamp) 
    AS (t timestamp with time zone, val double precision);

Sample data

CREATE TEMP TABLE t (ts int, active_power real, device_id int, should_be int);

INSERT INTO t VALUES
 (1,2,554,2)
,(2,3,553,5)
,(3,9,553,11)
,(4,7,553,9)
,(5,6,552,15)
,(6,8,554,21)
,(7,5,553,19)
,(8,7,553,21)
,(9,6,552,21)
,(10,7,552,22)
;
2 Answers

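I am building on my answer to your previous question, where you presented a simpler case. Read there for an explanation of the window-function aspect of the solution: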
Sum across partitions with window functions

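This question presents a "counter-cross-tabulated" data set. To get where you want, you can run a cross tabulation first, which reduces the case to the simpler form of the previous question.
PostgreSQL has the additional module tablefunc, which provides very fast functions for this. Run this command once per database to install it: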

CREATE EXTENSION tablefunc;

Then all you need is this (including redundant columns in the result for debugging):

SELECT ts, active_power, device_id, should_be
       , COALESCE(max(a) OVER (PARTITION BY grp_a), 0)
       + COALESCE(max(b) OVER (PARTITION BY grp_b), 0)
       + COALESCE(max(c) OVER (PARTITION BY grp_c), 0) AS special_sum
FROM  (
   SELECT *
         ,count(a) OVER w AS grp_a
         ,count(b) OVER w AS grp_b
         ,count(c) OVER w AS grp_c
   FROM   crosstab(
            'SELECT ts, active_power, device_id, should_be
                   ,device_id, active_power
             FROM   t
             ORDER  BY 1,2'

            ,'VALUES (552), (553), (554)'
         ) AS t (ts int, active_power int, device_id int, should_be int
                ,a int, b int, c int)
   WINDOW w AS (ORDER BY ts)
   ) sub
ORDER  BY ts;

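Returns the desired result.
There is quite a bit of dynamite packed into this query, and it should perform well.
Note that this solution builds on a small, given list of devices: (552, 553, 554) in your example.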
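
The carry-forward step in the query above is easier to follow in isolation. The following is a minimal sketch, assuming a single pivoted column a that is NULL wherever the device did not report; the CTE name pivoted and its inline values are illustrative and are not the asker's data. Since count(a) only counts non-NULL rows, the running count stays constant until the next reading arrives, so it labels each run of rows that share the same latest reading. max(a) within that group then returns that reading.

```sql
-- Hypothetical pivoted input: column a holds a reading or NULL.
WITH pivoted (ts, a) AS (
   VALUES (1, NULL::int), (2, 6), (3, NULL), (4, NULL), (5, 7)
)
SELECT ts, a, grp_a
     , COALESCE(max(a) OVER (PARTITION BY grp_a), 0) AS last_a
FROM  (
   SELECT ts, a
        , count(a) OVER (ORDER BY ts) AS grp_a  -- increments only on non-NULL a
   FROM   pivoted
   ) sub
ORDER  BY ts;
```

Here rows 3 and 4 land in the same group as row 2 and inherit its reading of 6, while row 1 has no earlier reading and falls back to 0 through COALESCE.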

Basics for crosstab():
PostgreSQL Crosstab Query
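
As a quick illustration of the two-parameter form of crosstab() used above, here is a minimal sketch with made-up inline rows (row name, category, value) and without the extra columns. It assumes the tablefunc extension is already installed in the current database; the alias ct and the output column names a, b, c are arbitrary.

```sql
-- Requires: CREATE EXTENSION tablefunc;
-- First argument: source rows as (row_name, category, value), ordered by row_name.
-- Second argument: the full list of categories, one output column per category.
SELECT *
FROM   crosstab(
         $$VALUES (1, 554, 2), (2, 553, 3), (3, 552, 6)$$
       , $$VALUES (552), (553), (554)$$
       ) AS ct (ts int, a int, b int, c int);
```

Each source row fills only the column of its own category. The remaining columns of that row stay NULL, which is what the window functions in the main query then fill in.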

About the extra columns:
Pivot on Multiple Columns using Tablefunc

Advanced crosstab-foo:
Dynamic alternative to pivot with CASE and GROUP BY

answered 2013-06-20T16:28:06.777

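If you know the value of "N", the following approach works. It computes the latest time seen so far for each device, then joins back to the original records and sums them with an aggregation: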

select tae.time, tae.value, tae.device,
       SUM(dev.value) as sumvalue
from (select t.*,
             -- running maximum: latest time seen so far for each device
             MAX(case when device = 1 then time end) over (order by time) as dev1time,
             MAX(case when device = 2 then time end) over (order by time) as dev2time,
             MAX(case when device = 3 then time end) over (order by time) as dev3time
      from timeseries.total_active_energy t
     ) tae left outer join
     timeseries.total_active_energy dev
     -- note: joining on time alone assumes time values are unique across devices
     on dev.time in (dev1time, dev2time, dev3time)
group by tae.time, tae.value, tae.device;
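
To make the intermediate step concrete, the following is a small sketch of just the running-maximum part, using made-up rows. The CTE name readings and the columns ts, val, device are illustrative stand-ins, not the real table.

```sql
-- Hypothetical readings: (ts, val, device); names are illustrative only.
WITH readings (ts, val, device) AS (
   VALUES (1, 10, 1), (2, 20, 2), (3, 30, 3), (4, 40, 2)
)
SELECT ts, val, device
     , max(CASE WHEN device = 1 THEN ts END) OVER (ORDER BY ts) AS dev1time
     , max(CASE WHEN device = 2 THEN ts END) OVER (ORDER BY ts) AS dev2time
     , max(CASE WHEN device = 3 THEN ts END) OVER (ORDER BY ts) AS dev3time
FROM   readings
ORDER  BY ts;
```

For the row at ts = 4, the three columns hold 1, 4 and 3. The join in the query above would therefore pick up the readings taken at those times and sum them as 10 + 40 + 30.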
answered 2013-06-20T17:21:12.533