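I have a table that represents product usage, somewhat like a log. Each use of a product is recorded as a series of timestamps, and I want to represent the same data as time ranges instead.
It looks like this (PostgreSQL 9.1):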
userid | timestamp | product
-------------------------------------
001 | 2012-04-23 9:12:05 | foo
001 | 2012-04-23 9:12:07 | foo
001 | 2012-04-23 9:12:09 | foo
001 | 2012-04-23 9:12:11 | barbaz
001 | 2012-04-23 9:12:13 | barbaz
001 | 2012-04-23 9:15:00 | barbaz
001 | 2012-04-23 9:15:01 | barbaz
002 | 2012-04-24 3:41:01 | foo
002 | 2012-04-24 3:41:03 | foo
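I want to collapse rows for the same user and product whose time difference from the previous row is no more than a delta (e.g. 2 seconds), and get the begin and end time of each run, like this: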
userid | begin | end | product
----------------------------------------------------------
001 | 2012-04-23 9:12:05 | 2012-04-23 9:12:09 | foo
001 | 2012-04-23 9:12:11 | 2012-04-23 9:12:13 | barbaz
001 | 2012-04-23 9:15:00 | 2012-04-23 9:15:01 | barbaz
002 | 2012-04-24 3:41:01 | 2012-04-24 3:41:03 | foo
Note that consecutive usage of the same product is split into two rows if the gap between two uses exceeds the delta (2 seconds in this example).
create table t (userid int, timestamp timestamp, product text);
insert into t (userid, timestamp, product) values
(001, '2012-04-23 9:12:05', 'foo'),
(001, '2012-04-23 9:12:07', 'foo'),
(001, '2012-04-23 9:12:09', 'foo'),
(001, '2012-04-23 9:12:11', 'barbaz'),
(001, '2012-04-23 9:12:13', 'barbaz'),
(001, '2012-04-23 9:15:00', 'barbaz'),
(001, '2012-04-23 9:15:01', 'barbaz'),
(002, '2012-04-24 3:41:01', 'foo'),
(002, '2012-04-24 3:41:03', 'foo')
;
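
To make the intended grouping concrete, here is a minimal sketch of one way to express it with window functions, which PostgreSQL 9.1 supports. It is not a definitive solution. It assumes the table t above, a fixed delta of 2 seconds, and that a run continues only when the previous row for the same user (ordered by timestamp) has the same product; the names marked, grouped, is_start and run_id are made up for illustration.

```sql
-- Sketch only: assumes table t as defined above and a delta of 2 seconds.
with marked as (
    select userid, "timestamp", product,
           -- 0 when this row continues the previous run, 1 when it starts a new one.
           -- For the first row of each user lag() is null, so the else branch applies.
           case
               when "timestamp" - lag("timestamp") over w <= interval '2 seconds'
                    and product = lag(product) over w
               then 0
               else 1
           end as is_start
    from t
    window w as (partition by userid order by "timestamp")
),
grouped as (
    select userid, "timestamp", product,
           -- A running total of run starts numbers the runs within each user.
           sum(is_start) over (partition by userid order by "timestamp") as run_id
    from marked
)
select userid,
       min("timestamp") as "begin",
       max("timestamp") as "end",
       product
from grouped
group by userid, run_id, product
order by userid, "begin";
```

If usage of different products by the same user can interleave, partitioning both windows by userid and product instead of userid alone may be closer to what is wanted; that choice is an assumption the sample data does not settle.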