2

我有一个如下所示的 Oracle 11.2 表。

MSG_ID JOB_ID STATUS   CREATION_DATE
1      101    SUB      03/30/2011 14:39:52
2      101    SUB      03/30/2011 14:44:37
3      101    RCVD     03/31/2011 14:41:49
4      101    ACCPTD   03/31/2011 14:59:51
5      101    DESIGN   03/31/2011 17:44:37
6      101    REVIEW   03/31/2011 17:44:50
7      101    (null)   04/01/2011 07:51:41
8      101    REJECTED 04/01/2011 07:56:56
9      101    (null)   04/01/2011 16:36:19
10     101    REVIEW   04/01/2011 16:43:49
11     101    APPROVED 04/04/2011 07:55:45
12     101    (null)   04/04/2011 08:14:23
13     101    REJECTED 04/04/2011 14:46:17
14     101    REVIEW   04/04/2011 14:50:50
15     101    (null)   04/04/2011 14:51:27
16     101    REVIEW   04/04/2011 14:57:50
17     101    (null)   04/04/2011 15:34:50
18     101    APPROVED 04/04/2011 15:41:21
19     101    (null)   04/04/2011 15:41:23
20     101    REJECTED 04/04/2011 15:41:24
21     101    REVIEW   04/04/2011 16:07:52
22     101    (null)   04/05/2011 08:50:35
23     101    APPROVED 04/05/2011 10:54:51
24     101    PROD     04/05/2011 11:08:50
25     101    COMPLET  04/06/2011 20:09:49

我正在尝试对相邻(相对于创建日期)消息状态和每个唯一状态组合之间的时间间隔进行分组。可以忽略空状态。结果如下所示。

JOB_ID FROM_STATUS TO_STATUS TIME_INTERVAL
101    SUB         RCVD      sum of time intervals
101    RCVD        ACCPTD    sum of time intervals
101    ACCPTD      DESIGN    sum of time intervals
101    DESIGN      REVIEW    sum of time intervals
101    REVIEW      REJECTED  sum of time intervals
101    REJECTED    REVIEW    sum of time intervals
101    REVIEW      APPROVED  sum of time intervals
101    APPROVED    REJECTED  sum of time intervals
101    APPROVED    PROD      sum of time intervals
101    PROD        COMPLETE  sum of time intervals

有人可以帮助执行此操作所需的 SQL。

先感谢您

4

1 回答 1

3

这是间隙和孤岛问题的变体。第一步是识别连续组,您可以这样

select msg_id, job_id, status, creation_date,
    row_number() over (partition by job_id order by creation_date)
        - row_number() over (partition by job_id, status
            order by creation_date) as chain
from t42
where status is not null

...它为每个具有相同job_idstatus值的块提供一个唯一的“链”号。然后,您可以折叠它们以获得每个“链”的单行,因为您似乎只creation_date对每个连续状态块的最早感兴趣(我认为)。

然后你可以用它lead来查看下一个块并计算出时差,最后——如果我理解了你的最终“总和”要求——对间隔进行分组。

总而言之:

select job_id, from_status, to_status, sum(time_interval) as time_interval
from (
    select job_id, status as from_status, status_start,
        lead(status) over (partition by job_id order by status_start)
            as to_status,
        lead(status_start) over (partition by job_id order by status_start)
            - status_start as time_interval
    from (
        select distinct job_id, status, chain,
            min(creation_date) over (partition by job_id, status, chain
                order by chain) as status_start
        from (
            select msg_id, job_id, status, creation_date,
                row_number() over (partition by job_id order by creation_date)
                    - row_number() over (partition by job_id, status
                        order by creation_date) as chain
            from t42
            where status is not null
        )
    )
)
where to_status is not null
group by job_id, from_status, to_status
order by job_id, min(status_start);

与您一起提供的数据:

    JOB_ID FROM_STA TO_STATU TIME_INTERVAL
---------- -------- -------- -------------
       101 SUB      RCVD        1.00135417
       101 RCVD     ACCPTD      .012523148
       101 ACCPTD   DESIGN      .114421296
       101 DESIGN   REVIEW      .000150463
       101 REVIEW   REJECTED    .591736111
       101 REJECTED REVIEW      .387430556
       101 REVIEW   APPROVED    3.45099537
       101 APPROVED REJECTED    .285127315
       101 APPROVED PROD        .009710648
       101 PROD     COMPLET     1.37568287

10 rows selected.

time_interval是几天之内,但你可以随心所欲地操纵它;既然你提到了间隔,你可以使用 Oracle 间隔:

select job_id, from_status, to_status,
    numtodsinterval(sum(time_interval) , 'DAY') as time_interval
from (
...

    JOB_ID FROM_STA TO_STATU TIME_INTERVAL
---------- -------- -------- -----------------------------
       101 SUB      RCVD     +000000001 00:01:57.000000000
       101 RCVD     ACCPTD   +000000000 00:18:02.000000000
       101 ACCPTD   DESIGN   +000000000 02:44:46.000000000
       101 DESIGN   REVIEW   +000000000 00:00:13.000000000
       101 REVIEW   REJECTED +000000000 14:12:06.000000000
       101 REJECTED REVIEW   +000000000 09:17:54.000000000
       101 REVIEW   APPROVED +000000003 10:49:26.000000000
       101 APPROVED REJECTED +000000000 06:50:35.000000000
       101 APPROVED PROD     +000000000 00:13:59.000000000
       101 PROD     COMPLET  +000000001 09:00:59.000000000

10 rows selected.
于 2013-03-26T22:45:43.430 回答