I have a table with the following structure:

id, timestamp, deviceId, datatype, measure

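The measure column holds the value for the given datatype. For example, when processing starts, a row is written with datatype 19 and measure 1. When processing completes, a row is written with datatype 19 and measure 0, and a new row is inserted with the same timestamp, datatype 54, and some value as its measure. In other words, on completion the system fires a trigger that updates this table. Sample data: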

1001, 2013-01-02 09:20:00, 501, 19, 1
1005, 2013-01-02 10:00:00, 501, 19, 0
1006, 2013-01-02 10:00:00, 501, 54, 65

Rows 1005 and 1006 have the same timestamp, and the timestamp of 1001 is always earlier than that of 1005.

1011, 2013-01-02 09:20:00, 601, 19, 1
1015, 2013-01-02 10:00:00, 601, 19, 0
1016, 2013-01-02 10:00:00, 601, 54, 105

Rows 1015 and 1016 have the same timestamp, and the timestamp of 1011 is always earlier than that of 1015.

1021, 2013-01-02 09:20:00, 701, 19, 1
1022, 2013-01-02 10:00:00, 701, 19, 0
1023, 2013-01-02 10:00:00, 701, 54, 81

Rows 1022 and 1023 have the same timestamp, and the timestamp of 1021 is always earlier than that of 1022.

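The same process can run on multiple devices at the same time.

The requirement is to find the start and end time of every completed transaction, for example: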

1006, 2013-01-02 09:20:00, 2013-01-02 10:20:00, 501, 65
1016, 2013-01-02 09:20:00, 2013-01-02 10:20:00, 601, 105
1023, 2013-01-02 09:20:00, 2013-01-02 10:20:00, 701, 81

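I am writing SQL queries again after about 5 years away and am completely stuck. Any pointers or suggestions would be much appreciated.

Thanks in advance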

3 Answers

SQL Fiddle

CREATE TABLE t
        (id int, ts timestamp, deviceId int, datatype int, measure int)
;

INSERT INTO t
        (id, ts, deviceId, datatype, measure)
VALUES
        (1001, '2013-01-02 09:20:00', 501, 19, 1),
        (1005, '2013-01-02 10:00:00', 501, 19, 0),
        (1006, '2013-01-02 10:00:00', 501, 54, 65),
        (1007, '2013-01-02 10:20:00', 501, 19, 1),
        (1008, '2013-01-02 11:00:00', 501, 19, 0),
        (1009, '2013-01-02 11:00:00', 501, 54, 65),
        (1011, '2013-01-02 09:20:00', 601, 19, 1),
        (1015, '2013-01-02 10:00:00', 601, 19, 0),
        (1016, '2013-01-02 10:00:00', 601, 54, 105),
        (1021, '2013-01-02 09:20:00', 701, 19, 1),
        (1022, '2013-01-02 10:00:00', 701, 19, 0),
        (1023, '2013-01-02 10:00:00', 701, 54, 81)
;

with parted as (
    select floor((rn - 1) / 2.0) p, *
    from (
        select
            row_number() over (partition by deviceId order  by ts, datatype) rn,
            id, ts, deviceId, dataType, measure
        from t
        where not(datatype = 19 and measure = 0)
    ) s
)
select
    p1.id, p0.ts "start", p1.ts "end", p1.deviceId, p1.measure
from
    parted p0
    inner join
    parted p1 on
        p0.deviceId = p1.deviceId
        and p0.p = p1.p
        and p0.datatype = 19 and p1.datatype = 54
order by p1.id
;
  id  |        start        |         end         | deviceid | measure 
------+---------------------+---------------------+----------+---------
 1006 | 2013-01-02 09:20:00 | 2013-01-02 10:00:00 |      501 |      65
 1009 | 2013-01-02 10:20:00 | 2013-01-02 11:00:00 |      501 |      65
 1016 | 2013-01-02 09:20:00 | 2013-01-02 10:00:00 |      601 |     105
 1023 | 2013-01-02 09:20:00 | 2013-01-02 10:00:00 |      701 |      81
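
The query above appears to work by dropping the (datatype 19, measure 0) rows, numbering what is left per device in timestamp order, and pairing rows 1-2, 3-4, and so on through `floor((rn - 1) / 2.0)`. The following is a minimal Python sketch of that pairing idea, not a translation of the query; the function name `pair_transactions` and the tuple layout are assumptions made for illustration.

```python
from itertools import groupby

# Sample rows from the INSERT above: (id, ts, deviceId, datatype, measure)
rows = [
    (1001, "2013-01-02 09:20:00", 501, 19, 1),
    (1005, "2013-01-02 10:00:00", 501, 19, 0),
    (1006, "2013-01-02 10:00:00", 501, 54, 65),
    (1007, "2013-01-02 10:20:00", 501, 19, 1),
    (1008, "2013-01-02 11:00:00", 501, 19, 0),
    (1009, "2013-01-02 11:00:00", 501, 54, 65),
    (1011, "2013-01-02 09:20:00", 601, 19, 1),
    (1015, "2013-01-02 10:00:00", 601, 19, 0),
    (1016, "2013-01-02 10:00:00", 601, 54, 105),
    (1021, "2013-01-02 09:20:00", 701, 19, 1),
    (1022, "2013-01-02 10:00:00", 701, 19, 0),
    (1023, "2013-01-02 10:00:00", 701, 54, 81),
]


def pair_transactions(rows):
    """Hypothetical helper: pair each start row with the following datatype-54 row."""
    # Drop the "processing finished" marker rows (datatype 19, measure 0).
    kept = [r for r in rows if not (r[3] == 19 and r[4] == 0)]
    # Same ordering as the window: per device, by ts, then datatype.
    # ISO-formatted timestamp strings sort chronologically.
    kept.sort(key=lambda r: (r[2], r[1], r[3]))
    result = []
    for device, group in groupby(kept, key=lambda r: r[2]):
        group = list(group)
        # Stepping by two mirrors floor((rn - 1) / 2): rows (1, 2), (3, 4), ...
        for i in range(0, len(group) - 1, 2):
            start, end = group[i], group[i + 1]
            if start[3] == 19 and end[3] == 54:
                # (id of the 54 row, start ts, end ts, deviceId, measure)
                result.append((end[0], start[1], end[1], device, end[4]))
    return sorted(result)


pairs = pair_transactions(rows)
```

Like the query, this sketch relies on starts and completions strictly alternating within a device; a start row without a matching completion would shift every later pair for that device.
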
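answered 2013-01-02T13:45:48.120

My logic is a simple aggregation. The aggregation key, however, is the "next" record with datatype 54 that has the same device ID.

To get that next record, I use a correlated subquery in the select list: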

-- Column names follow the question's schema (deviceId rather than device_id).
select next54 as id, MIN(timestamp) as starttime, MAX(timestamp) as endtime, MAX(deviceId) as deviceId,
       MAX(case when id = next54 then measure end) as measure
from (select t.*,
             -- id of the next datatype-54 row for the same device
             (select MIN(t2.id) from t t2 where t2.id >= t.id and t2.datatype = 54 and t2.deviceId = t.deviceId) as next54
      from t
     ) t
group by next54

The rest is just the aggregation.
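
As a rough way to try the correlated-subquery idea without a database server, the sketch below runs an equivalent query against an in-memory SQLite database from Python. It assumes the question's column names (`timestamp`, `deviceId`) and the question's nine sample rows. The inner alias `s`, the qualified column references, and the `WHERE s.next54 IS NOT NULL` filter (which skips transactions that have started but not yet produced a datatype-54 row) are additions for this sketch and are not part of the answer.

```python
import sqlite3

# Sample rows from the question: (id, timestamp, deviceId, datatype, measure)
ROWS = [
    (1001, "2013-01-02 09:20:00", 501, 19, 1),
    (1005, "2013-01-02 10:00:00", 501, 19, 0),
    (1006, "2013-01-02 10:00:00", 501, 54, 65),
    (1011, "2013-01-02 09:20:00", 601, 19, 1),
    (1015, "2013-01-02 10:00:00", 601, 19, 0),
    (1016, "2013-01-02 10:00:00", 601, 54, 105),
    (1021, "2013-01-02 09:20:00", 701, 19, 1),
    (1022, "2013-01-02 10:00:00", 701, 19, 0),
    (1023, "2013-01-02 10:00:00", 701, 54, 81),
]

QUERY = """
SELECT s.next54 AS id,
       MIN(s.timestamp) AS starttime,
       MAX(s.timestamp) AS endtime,
       MAX(s.deviceId) AS deviceId,
       MAX(CASE WHEN s.id = s.next54 THEN s.measure END) AS measure
FROM (SELECT t.*,
             -- Correlated subquery: the next datatype-54 id for the same device.
             (SELECT MIN(t2.id)
              FROM t AS t2
              WHERE t2.id >= t.id
                AND t2.datatype = 54
                AND t2.deviceId = t.deviceId) AS next54
      FROM t) AS s
WHERE s.next54 IS NOT NULL  -- assumption: ignore unfinished transactions
GROUP BY s.next54
ORDER BY s.next54
"""

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE t (id INTEGER, timestamp TEXT, deviceId INTEGER, "
    "datatype INTEGER, measure INTEGER)"
)
conn.executemany("INSERT INTO t VALUES (?, ?, ?, ?, ?)", ROWS)

# One tuple per completed transaction: (id, starttime, endtime, deviceId, measure)
completed = conn.execute(QUERY).fetchall()
conn.close()
```

Timestamps are stored as ISO-formatted text here, so `MIN` and `MAX` compare them as strings, which matches chronological order for this format.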

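Because I am personally not a big fan of correlated subqueries, you can also write this with window functions (sometimes called analytic functions in Oracle):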

-- Column names follow the question's schema (deviceId rather than device_id).
select next54 as id, MIN(timestamp) as starttime, MAX(timestamp) as endtime, MAX(deviceId) as deviceId,
       MAX(case when id = next54 then measure end) as measure
from (select t.*,
             -- running minimum of id54 over rows with an equal or higher id
             min(id54) over (partition by deviceId order by id desc) as next54
       from (select t.*,
                    (case when datatype = 54 then id end) as id54
             from t
            ) t
     ) t
group by next54

The min function with an order by clause computes a "cumulative" minimum. The result should be the same as with the correlated subquery.
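
To make the "cumulative" minimum concrete: when a window has an `order by` and no explicit frame, the default frame runs from the start of the partition to the current row, so ordering by id descending means each row sees the smallest datatype-54 id among rows with an equal or higher id. A minimal Python sketch for a single device follows; the function name `next54_by_running_min` and the `(id, datatype)` pair layout are assumptions for illustration.

```python
def next54_by_running_min(rows):
    """rows: (id, datatype) pairs for ONE device; returns {id: next54}."""
    next54 = {}
    running_min = None  # smallest datatype-54 id seen so far
    # "order by id desc": walk from the highest id down to the lowest.
    for row_id, datatype in sorted(rows, reverse=True):
        # The CASE expression: id54 is the id for datatype 54, otherwise NULL.
        id54 = row_id if datatype == 54 else None
        if id54 is not None and (running_min is None or id54 < running_min):
            running_min = id54
        next54[row_id] = running_min
    return next54


# Device 501 with two back-to-back transactions (ids taken from the first answer's data).
device_501 = [(1001, 19), (1005, 19), (1006, 54), (1007, 19), (1008, 19), (1009, 54)]
mapping = next54_by_running_min(device_501)
```

Rows 1001, 1005 and 1006 all map to 1006, and rows 1007, 1008 and 1009 map to 1009, which is the grouping key the aggregation then uses.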

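answered 2013-01-02T14:41:22.787

I may be greatly oversimplifying the problem here, but I can't see why, for each record with datatype 54, you can't just look up the previous record for that device with datatype 19 and measure 1: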

SELECT  result.ID, 
        result.DeviceID, 
        MAX(start.Timestamp) StartTime, 
        result.Timestamp EndTime, 
        result.Measure
FROM    T result
        INNER JOIN T start
            ON start.DeviceID = result.DeviceID
            AND start.Timestamp < result.Timestamp
            AND start.DataType = 19
            AND start.Measure = 1
WHERE   result.DataType = 54
GROUP BY result.ID, result.DeviceID, result.Timestamp, result.Measure

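The only real difference is that instead of starting at the beginning and working toward the result, I start with the result and work back to the beginning. This will fail if processes run concurrently for the same device (i.e. one transaction starts before the previous one ends).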
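
The failure case can be reproduced with a small experiment. The sketch below runs an equivalent of the join against an in-memory SQLite database from Python, using made-up rows in which a second transaction on device 501 starts before the first one ends. The data, the short aliases `r` and `s`, and the ids are hypothetical and chosen only for this illustration.

```python
import sqlite3

# Hypothetical overlapping transactions on one device:
# A starts 09:00 and ends 10:00; B starts 09:30 and ends 10:30.
ROWS = [
    (1, "2013-01-02 09:00:00", 501, 19, 1),   # start of A
    (2, "2013-01-02 09:30:00", 501, 19, 1),   # start of B, before A has ended
    (3, "2013-01-02 10:00:00", 501, 19, 0),
    (4, "2013-01-02 10:00:00", 501, 54, 65),  # result of A
    (5, "2013-01-02 10:30:00", 501, 19, 0),
    (6, "2013-01-02 10:30:00", 501, 54, 70),  # result of B
]

QUERY = """
SELECT r.id, r.deviceId,
       MAX(s.timestamp) AS starttime,  -- latest start before the result row
       r.timestamp AS endtime,
       r.measure
FROM t AS r
JOIN t AS s
  ON s.deviceId = r.deviceId
 AND s.timestamp < r.timestamp
 AND s.datatype = 19
 AND s.measure = 1
WHERE r.datatype = 54
GROUP BY r.id, r.deviceId, r.timestamp, r.measure
ORDER BY r.id
"""

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE t (id INTEGER, timestamp TEXT, deviceId INTEGER, "
    "datatype INTEGER, measure INTEGER)"
)
conn.executemany("INSERT INTO t VALUES (?, ?, ?, ?, ?)", ROWS)
overlap_result = conn.execute(QUERY).fetchall()
conn.close()

# Start times reported for the two result rows.
reported_starts = [row[2] for row in overlap_result]
```

Both result rows are matched to the 09:30 start, and the 09:00 start is never reported, because `MAX` always picks the most recent start before each result row.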

answered 2013-01-02T15:07:23.690