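I have a query that returns the following, except for the last column, which is what I need to figure out how to create. For each given ObservationID, I need to return the date on which the status changes; something like a LEAD() function that takes a condition rather than just an offset. Can it be done?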

I need to calculate the Change Date column; it should be the next date on which the status is not the current status.

+---------------+--------+-----------+--------+-------------+
| ObservationID | Region |   Date    | Status | Change Date | <-This field
+---------------+--------+-----------+--------+-------------+
|             1 |     10 | 1/3/2012  | Ice    | 1/4/2012    |
|             2 |     10 | 1/4/2012  | Water  | 1/6/2012    |
|             3 |     10 | 1/5/2012  | Water  | 1/6/2012    |
|             4 |     10 | 1/6/2012  | Gas    | 1/7/2012    |
|             5 |     10 | 1/7/2012  | Ice    |             |
|             6 |     20 | 2/6/2012  | Water  | 2/10/2012   |
|             7 |     20 | 2/7/2012  | Water  | 2/10/2012   |
|             8 |     20 | 2/8/2012  | Water  | 2/10/2012   |
|             9 |     20 | 2/9/2012  | Water  | 2/10/2012   |
|            10 |     20 | 2/10/2012 | Ice    |             |
+---------------+--------+-----------+--------+-------------+
3 Answers

The MODEL clause (10g+) can do this in a compact way:

SQL> create table observation(ObservationID ,  Region  ,obs_date,  Status)
  2  as
  3  select  1, 10, date '2012-03-01', 'Ice' from dual union all
  4  select  2, 10, date '2012-04-01', 'Water' from dual union all
  5  select  3, 10, date '2012-05-01', 'Water' from dual union all
  6  select  4, 10, date '2012-06-01', 'Gas' from dual union all
  7  select  5, 10, date '2012-07-01', 'Ice' from dual union all
  8  select  6, 20, date '2012-06-02', 'Water' from dual union all
  9  select  7, 20, date '2012-07-02', 'Water' from dual union all
 10  select  8, 20, date '2012-08-02', 'Water' from dual union all
 11  select  9, 20, date '2012-09-02', 'Water' from dual union all
 12  select 10, 20, date '2012-10-02', 'Ice' from dual ;

Table created.

SQL> select ObservationID, obs_date, Status, status_change
  2            from observation
  3          model
  4          dimension by (Region, obs_date, Status)
  5          measures ( ObservationID, obs_date obs_date2, cast(null as date) status_change)
  6          rules (
  7            status_change[any,any,any] = min(obs_date2)[cv(Region), obs_date > cv(obs_date), status != cv(status)]
  8          )
  9   order by 1;

OBSERVATIONID OBS_DATE  STATU STATUS_CH
------------- --------- ----- ---------
            1 01-MAR-12 Ice   01-APR-12
            2 01-APR-12 Water 01-JUN-12
            3 01-MAY-12 Water 01-JUN-12
            4 01-JUN-12 Gas   01-JUL-12
            5 01-JUL-12 Ice
            6 02-JUN-12 Water 02-OCT-12
            7 02-JUL-12 Water 02-OCT-12
            8 02-AUG-12 Water 02-OCT-12
            9 02-SEP-12 Water 02-OCT-12
           10 02-OCT-12 Ice

Fiddle: http://sqlfiddle.com/#!4/f6687/1

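That is, we dimension on region, date and status, because we want to look at cells with the same region but pick the first date on which the status differs.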

We also have to measure the date, so I created an alias obs_date2 for that, and we want a new column status_change to hold the date on which the status changes.

This is the line that does all the work for us:

status_change[any,any,any] = min(obs_date2)[cv(Region), obs_date > cv(obs_date), status != cv(status)]

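It says: for our three dimensions, look only at rows with the same region (cv(Region)), whose date is after the current row's date (obs_date > cv(obs_date)), and whose status differs from the current row's (status != cv(status)). Take the minimum date that satisfies this set of conditions (min(obs_date2)) and assign it to status_change. The any,any,any part on the left means the calculation applies to all rows.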
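
For readers who find the rule easier to follow as ordinary SQL, the same three conditions can be written as a correlated scalar subquery. This is only a sketch to illustrate the logic, not part of the original answer; it assumes the observation table created above and has not been compared against the MODEL version for performance.

```sql
-- Sketch: same logic as the MODEL rule, expressed as a correlated subquery.
-- For each row, find the earliest later date in the same region
-- whose status differs from the current row's status.
select o.ObservationID,
       o.obs_date,
       o.Status,
       (select min(o2.obs_date)
          from observation o2
         where o2.Region   = o.Region      -- same region: cv(Region)
           and o2.obs_date > o.obs_date    -- later date: obs_date > cv(obs_date)
           and o2.Status  <> o.Status      -- different status: status != cv(status)
       ) as status_change
  from observation o
 order by o.ObservationID;
```

Rows with no later row of a different status in their region get a null status_change, because min() over an empty set returns null.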

answered 2013-03-26T23:24:27.350

I have tried many times to understand the MODEL clause and never really managed it, so I thought I would add another solution.

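This solution borrows some of the work Ronnis has done, but uses the IGNORE NULLS clause of the LEAD function instead. I think that clause is new in Oracle 11, but if necessary you could probably replace it with the FIRST_VALUE function on Oracle 10.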

select
  observation_id,
  region,
  observation_date,
  status,
  lead(case when is_change = 'Y' then observation_date end) ignore nulls 
    over (partition by region order by observation_date) as change_observation_date
from (
  select
    a.observation_id,
    a.region,
    a.observation_date,
    a.status,
    case 
      when status = lag(status) over (partition by region order by observation_date) 
        then null
        else 'Y' end as is_change
       from observations a
)
order by 1
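
The FIRST_VALUE replacement mentioned above might look like the following. This is a sketch I have not run on Oracle 10; it assumes the same observations table and relies on FIRST_VALUE accepting IGNORE NULLS inside the parentheses together with an explicit window that starts on the next row.

```sql
-- Sketch: FIRST_VALUE ... IGNORE NULLS as a stand-in for LEAD ... IGNORE NULLS.
-- The window covers only the rows after the current one, so the first
-- non-null value in it is the date of the next status change.
select
  observation_id,
  region,
  observation_date,
  status,
  first_value(case when is_change = 'Y' then observation_date end ignore nulls)
    over (partition by region
          order by observation_date
          rows between 1 following and unbounded following) as change_observation_date
from (
  select
    a.observation_id,
    a.region,
    a.observation_date,
    a.status,
    -- Flag a row with 'Y' when its status differs from the previous row's status
    case
      when a.status = lag(a.status) over (partition by a.region order by a.observation_date)
        then null
        else 'Y' end as is_change
  from observations a
)
order by observation_id;
```

For the last row of each region the window is empty, so change_observation_date comes back null, as it does in the LEAD version.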
answered 2013-03-27T07:23:10.750

I do this a lot when cleaning up overlapping from/to dates and duplicate rows. Your case is much simpler, though, since you only have a "from date" :)

Setting up the test data

create table observations(
   observation_id   number       not null
  ,region           number       not null
  ,observation_date date         not null
  ,status           varchar2(10) not null
);


insert 
  into observations(observation_id, region, observation_date, status)
   select 1,  10, date '2012-03-01', 'Ice'   from dual union all
   select 2,  10, date '2012-04-01', 'Water' from dual union all
   select 3,  10, date '2012-05-01', 'Water' from dual union all
   select 4,  10, date '2012-06-01', 'Gas'   from dual union all
   select 5,  10, date '2012-07-01', 'Ice'   from dual union all
   select 6,  20, date '2012-06-02', 'Water' from dual union all
   select 7,  20, date '2012-07-02', 'Water' from dual union all
   select 8,  20, date '2012-08-02', 'Water' from dual union all
   select 9,  20, date '2012-09-02', 'Water' from dual union all
   select 10, 20, date '2012-10-02', 'Ice'   from dual;

commit;

The following query has three points of interest:

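  1. Identifying repeated information (the recording shows the same status as the previous recording)
  2. Ignoring the repeated recordings
  3. Taking the date from the "next" change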

with lagged as(
   select a.*
         ,case when status = lag(status, 1) over(partition by region 
                                                     order by observation_date) 
               then null 
               else rownum 
           end as change_flag -- 1
     from observations a
)
select observation_id
      ,region
      ,observation_date
      ,status
      ,lead(observation_date, 1) over(
         partition by region 
             order by observation_date
      ) as change_date --3
      ,lead(observation_date, 1, sysdate) over(
         partition by region 
             order by observation_date
      ) - observation_date as duration
  from lagged
 where change_flag is not null -- 2
 ;
answered 2013-03-26T22:55:20.343