sql - Group by floating date range

Question

I am using PostgreSQL 9.2.
I have a table containing the times at which some devices went out of service.

+----------+----------+---------------------+
| event_id |  device  |         time        |
+----------+----------+---------------------+
|        1 | Switch4  | 2013-09-01 00:01:00 |
|        2 | Switch1  | 2013-09-01 00:02:30 |
|        3 | Switch10 | 2013-09-01 00:02:40 |
|        4 | Switch51 | 2013-09-01 03:05:00 |
|        5 | Switch49 | 2013-09-02 13:00:00 |
|        6 | Switch28 | 2013-09-02 13:01:00 |
|        7 | Switch9  | 2013-09-02 13:02:00 |
+----------+----------+---------------------+

I would like to group the rows by a time difference of +/- 3 minutes, like this:

+----------+----------+---------------------+--------+
| event_id |  device  |         time        |  group |
+----------+----------+---------------------+--------+
|        1 | Switch4  | 2013-09-01 00:01:00 |      1 |
|        2 | Switch1  | 2013-09-01 00:02:30 |      1 |
|        3 | Switch10 | 2013-09-01 00:02:40 |      1 |
|        4 | Switch51 | 2013-09-01 03:05:00 |      2 |
|        5 | Switch49 | 2013-09-02 13:00:00 |      3 |
|        6 | Switch28 | 2013-09-02 13:01:00 |      3 |
|        7 | Switch9  | 2013-09-02 13:02:00 |      3 |
+----------+----------+---------------------+--------+

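I tried to do this with a window function, but in the clause

[ RANGE | ROWS ] BETWEEN frame_start AND frame_end, where frame_start and frame_end can be one of UNBOUNDED PRECEDING, value PRECEDING, CURRENT ROW, value FOLLOWING, UNBOUNDED FOLLOWING,

value must be an integer expression not containing any variables, aggregate functions, or window functions.

Given that, I cannot specify a time interval, and I now doubt that a window function can solve my problem. Could you help me?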

score 5 · Accepted Answer

SQL Fiddle

select
    event_id, device, ts,
    floor(extract(epoch from ts) / 180) as group
from t
order by ts

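It is possible to make the group number a sequence starting at 1 with a window function, but that has a non-trivial cost and I don't know whether it is necessary. Here it is: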

select
    event_id, device, ts,
    dense_rank() over(order by "group") as group
from (
    select
        event_id, device, ts,
        floor(extract(epoch from ts) / 180) as group
    from t
) s
order by ts

time is a reserved word. Pick another name for the column.
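
To see what floor(extract(epoch from ts) / 180) computes, the arithmetic can be reproduced outside the database. The Python sketch below assumes the timestamps are interpreted as UTC, and the helper name bucket is hypothetical. It illustrates the expression only and does not replace the query.

```python
from datetime import datetime, timezone

BUCKET_SECONDS = 180  # 3 minutes, the divisor used in the query


def bucket(ts: str) -> int:
    """Fixed-width bucket index: floor(epoch seconds / 180)."""
    dt = datetime.strptime(ts, "%Y-%m-%d %H:%M:%S").replace(tzinfo=timezone.utc)
    return int(dt.timestamp()) // BUCKET_SECONDS


# The seven timestamps from the question.
sample = [
    "2013-09-01 00:01:00",
    "2013-09-01 00:02:30",
    "2013-09-01 00:02:40",
    "2013-09-01 03:05:00",
    "2013-09-02 13:00:00",
    "2013-09-02 13:01:00",
    "2013-09-02 13:02:00",
]
buckets = [bucket(ts) for ts in sample]

# Rows 1-3 share a bucket, row 4 is alone, rows 5-7 share a bucket.
print(buckets)

# Bucket edges are fixed, so two events 2 seconds apart can still be split.
print(bucket("2013-09-01 00:02:59"), bucket("2013-09-01 00:03:01"))
```

On the sample data this matches the grouping asked for. Bucket edges sit at fixed multiples of 180 seconds, though, so events that straddle an edge (00:02:59 and 00:03:01 here) land in different buckets even though they are only seconds apart.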

score 1 · Accepted Answer

This is just a slight improvement on @Clodoaldo's basically good answer.

To get sequential group numbers:

SELECT event_id, device, ts
      ,dense_rank() OVER (ORDER BY trunc(extract(epoch from ts) / 180)) AS grp
FROM   tbl
ORDER  BY ts

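Using ts instead of the (partly) reserved word time is good advice. So don't use the reserved word group either; use grp instead.
Sequential numbers can be had without a subquery.
Using trunc() instead of floor(). Either is fine, but trunc() is slightly faster.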
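
As a rough illustration of what dense_rank() does to the raw bucket numbers, here is a small Python sketch. The dense_rank helper is a hypothetical stand-in for the window function, and the bucket values are illustrative, counted in 180-second steps from 2013-09-01 00:00:00 rather than from the epoch.

```python
import math


def dense_rank(values):
    """Return the 1-based dense rank of each value, in input order."""
    rank_of = {v: r for r, v in enumerate(sorted(set(values)), start=1)}
    return [rank_of[v] for v in values]


# Illustrative bucket numbers for the seven sample rows.
raw_buckets = [0, 0, 0, 61, 740, 740, 740]
grp = dense_rank(raw_buckets)
print(grp)

# trunc() and floor() agree for non-negative inputs and differ for negative ones.
print(math.trunc(2.7), math.floor(2.7))
print(math.trunc(-2.7), math.floor(-2.7))
```

The ranks come out as consecutive integers no matter how far apart the raw bucket numbers are. The last two lines show that trunc() and floor() only disagree on negative values, which for epoch arithmetic means timestamps before 1970.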

score 1 · Accepted Answer

SQLFiddle

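with u as (
    select
        *,
        -- true when a row starts a new group: it is the first row,
        -- or it comes more than 3 minutes after the previous row
        extract(epoch from ts - lag(ts) over (order by ts)) / 60 > 3
            or lag(ts) over (order by ts) is null as test
    from t
)
-- the running sum of the "new group" flags is the group number
select *, sum(test::int) over (order by ts) as grp
from u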
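
The lag-and-running-sum idea in this query can be sketched in Python as follows. The function name gap_groups is hypothetical, the input is assumed to be sorted ascending, and the threshold is set to the 3 minutes from the question.

```python
from datetime import datetime, timedelta
from itertools import accumulate


def gap_groups(timestamps, max_gap=timedelta(minutes=3)):
    """Number groups 1, 2, 3, ... starting a new group after every gap > max_gap.

    `timestamps` must already be sorted in ascending order.
    """
    starts = [
        i == 0 or (timestamps[i] - timestamps[i - 1]) > max_gap
        for i in range(len(timestamps))
    ]
    # Running sum of the "starts a new group" flags.
    return list(accumulate(int(s) for s in starts))


fmt = "%Y-%m-%d %H:%M:%S"
sample = [
    datetime.strptime(s, fmt)
    for s in (
        "2013-09-01 00:01:00",
        "2013-09-01 00:02:30",
        "2013-09-01 00:02:40",
        "2013-09-01 03:05:00",
        "2013-09-02 13:00:00",
        "2013-09-02 13:01:00",
        "2013-09-02 13:02:00",
    )
]
groups = gap_groups(sample)
print(groups)

# Rows chain together: each step is 2 minutes, so all four share one group
# even though the first and last are 6 minutes apart.
base = datetime(2013, 9, 1)
chained = gap_groups([base + timedelta(minutes=m) for m in (0, 2, 4, 6)])
print(chained)
```

Unlike fixed buckets, this grouping depends only on the distance between neighbouring rows. One consequence is that a long run of closely spaced events ends up in a single group, however long the run lasts.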

score 0 · Accepted Answer

Create a function:

CREATE OR REPLACE FUNCTION public.date_round (
  base_date timestamp,
  round_interval interval
)
RETURNS TIMESTAMP WITHOUT TIME ZONE AS
$body$
DECLARE
   res TIMESTAMP;
BEGIN   
    res := TIMESTAMP 'epoch' + (EXTRACT(epoch FROM $1)::INTEGER + EXTRACT(epoch FROM $2)::INTEGER / 2)
                / EXTRACT(epoch FROM $2)::INTEGER * EXTRACT(epoch FROM $2)::INTEGER * INTERVAL '1 second';            
    IF (base_date > res ) THEN
        res := res + $2;
    END IF;
    RETURN res;
END;
$body$
LANGUAGE 'plpgsql'
STABLE
CALLED ON NULL INPUT
SECURITY INVOKER
COST 100;

Then group by the function's result, for example to count the rows in each group:

SELECT t.grp, count(*) AS events
FROM (SELECT p.oper_date, date_round(p.oper_date, '5 minutes') AS grp FROM test p) t
GROUP BY t.grp

It's that easy :)
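
To check what date_round returns, its integer arithmetic can be mirrored in Python. This is a sketch under two assumptions: timestamps are whole, non-negative epoch seconds, and the interval is given in seconds. The name date_round_seconds is hypothetical.

```python
def date_round_seconds(epoch_seconds: int, step: int) -> int:
    """Mirror date_round(): round to the nearest multiple of `step`,
    then move up one step if the result lies before the input."""
    res = (epoch_seconds + step // 2) // step * step
    if epoch_seconds > res:
        res += step
    return res


step = 300  # '5 minutes'
for value in (0, 100, 200, 300, 301):
    print(value, date_round_seconds(value, step))
```

Under these assumptions the result is always the next multiple of the interval at or after the input. The function therefore assigns rows to fixed-width buckets rather than measuring the gap between neighbouring rows.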

score 0 · Accepted Answer

http://www.depesz.com/2010/09/12/how-to-group-messages-into-chats/

You should use window functions. This is a textbook example:

with
  xinterval( val ) as ( select 2 ),
  data( id, t ) as 
  (
    values  

      ( 1000, 1 ),
      ( 1001, 2 ),
      ( 1002, 3 ),

      ( 1000, 7 ),
      ( 1003, 8 )

  ),  
  x( id, t, tx ) as
  (
    select id, t,
      case
        when lag(t) over (order by t) is null then t               -- first row starts a group
        when t - lag(t) over (order by t) > xinterval.val then t   -- a gap above the interval starts a group
      end
    from data natural join xinterval
  ),
  xx( id, t, t2 ) as
  (
    select id, t, max(tx) over (order by t) from x
  )
select id, t, text( min(t) over w ) || '-' || text( max(t) over w ) as xperiod
from xx
window w as ( partition by t2 )
order by t
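
The steps of this query (flag each group start, carry the latest start forward, then take the min and max per group) can be traced with a short Python sketch that reuses the query's data. The function name periods is hypothetical, and the sketch treats the first row as a group start.

```python
def periods(ts, interval):
    """Label each value with the 'min-max' range of its group.

    `ts` must be sorted ascending; a gap larger than `interval` starts a new group.
    """
    # Steps 1 and 2: remember the most recent group start for every row.
    starts = []
    current = None
    for i, t in enumerate(ts):
        if i == 0 or t - ts[i - 1] > interval:
            current = t
        starts.append(current)

    # Step 3: min and max of t within each group (partition by group start).
    bounds = {}
    for t, s in zip(ts, starts):
        lo, hi = bounds.get(s, (t, t))
        bounds[s] = (min(lo, t), max(hi, t))

    return [f"{bounds[s][0]}-{bounds[s][1]}" for s in starts]


labels = periods([1, 2, 3, 7, 8], interval=2)
print(labels)
```

With the interval of 2 from the query, the values 1, 2, 3 form one period and 7, 8 form another, because the gap between 3 and 7 is larger than the interval.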
