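I am trying to solve a trend problem at work that is very similar to the example below. I think I have an approach, but I don't know how to do it in SQL.

The input data is: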

MTD         LOC_ID  RAINED
1-Apr-16    1       Y
1-Apr-16    2       N
1-May-16    1       N
1-May-16    2       N
1-Jun-16    1       N
1-Jun-16    2       N
1-Jul-16    1       Y
1-Jul-16    2       N
1-Aug-16    1       N
1-Aug-16    2       Y

The desired output is:

MTD         LOC_ID  RAINED  TRENDS
1-Apr-16    1       Y       New
1-May-16    1       N       No Rain
1-Jun-16    1       N       No Rain
1-Jul-16    1       Y       Carryover
1-Aug-16    1       N       No Rain
1-Apr-16    2       N       No Rain
1-May-16    2       N       No Rain
1-Jun-16    2       N       No Rain
1-Jul-16    2       N       No Rain
1-Aug-16    2       Y       New

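I want to produce the output from the input without making the trend logic depend on specific MTD values. That way, when new months are added to the input, the output changes without my having to edit the query.

The TRENDS logic is applied separately to each unique LOC_ID. TRENDS takes one of three values: "New" for the first month in which RAINED is "Y", "Carryover" for any later month in which RAINED is "Y", and "No Rain" for any month in which RAINED is "N".

My idea for automating this is to introduce an intermediate step built with listagg. For example, for LOC_ID = "1":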

MTD         LOC_ID  RAINED  PREV_RAINED
1-Apr-16    1       Y       (null) / 0 / (I don't care)
1-May-16    1       N       Y
1-Jun-16    1       N       Y;N
1-Jul-16    1       Y       Y;N;N
1-Aug-16    1       N       Y;N;N;Y

That way, to produce TRENDS in the output, I could say:

case when RAINED = 'Y' then
    case when not regexp_like(PREV_RAINED, 'Y', 'i') then
        'New'
    else
        'Carryover'
    end
else
    'No Rain'
end as TRENDS

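My problem is that I'm not sure how to generate PREV_RAINED for each unique LOC_ID. I have a feeling it involves a LAG() partitioned by LOC_ID and ordered by MTD, but the number of lags I need is different for every month.

Is there a simple way to generate PREV_RAINED, or an easier way to solve the overall problem, while keeping it automated from month to month?

Thanks for reading all of this! :)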


3 Answers


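Here is a solution that works in older versions. The WITH clause is only there to supply the input data; the solution itself begins right after the WITH clause.

Next I will look into a MATCH_RECOGNIZE solution, which I may add to this answer.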

with
     input_data ( mtd, loc_id, rained ) as (
       select to_date('1-Apr-16', 'dd-Mon-rr'), 1, 'Y' from dual union all
       select to_date('1-Apr-16', 'dd-Mon-rr'), 2, 'N' from dual union all
       select to_date('1-May-16', 'dd-Mon-rr'), 1, 'N' from dual union all
       select to_date('1-May-16', 'dd-Mon-rr'), 2, 'N' from dual union all
       select to_date('1-Jun-16', 'dd-Mon-rr'), 1, 'N' from dual union all
       select to_date('1-Jun-16', 'dd-Mon-rr'), 2, 'N' from dual union all
       select to_date('1-Jul-16', 'dd-Mon-rr'), 1, 'Y' from dual union all
       select to_date('1-Jul-16', 'dd-Mon-rr'), 2, 'N' from dual union all
       select to_date('1-Aug-16', 'dd-Mon-rr'), 1, 'N' from dual union all
       select to_date('1-Aug-16', 'dd-Mon-rr'), 2, 'Y' from dual
     )
select mtd, loc_id, rained,
       case rained when 'N' then 'No Rain'
                   else case when rn = 1 then 'New' 
                                         else 'Carryover' end
                   end  as trends
from ( select mtd, loc_id, rained, 
              row_number() over (partition by loc_id, rained order by mtd) rn
       from   input_data
)
order by loc_id, mtd
;

Output

MTD                     LOC_ID RAINED TRENDS  
------------------- ---------- ------ ---------
01/04/2016 00:00:00          1      Y New      
01/05/2016 00:00:00          1      N No Rain  
01/06/2016 00:00:00          1      N No Rain  
01/07/2016 00:00:00          1      Y Carryover
01/08/2016 00:00:00          1      N No Rain  
01/04/2016 00:00:00          2      N No Rain  
01/05/2016 00:00:00          2      N No Rain  
01/06/2016 00:00:00          2      N No Rain  
01/07/2016 00:00:00          2      N No Rain  
01/08/2016 00:00:00          2      Y New      

 10 rows selected
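
For comparison, here is a sketch that stays closer to the PREV_RAINED idea in the question. Instead of building a string of earlier values, it counts the rainy months that come strictly before the current row for the same LOC_ID, using a window frame that ends one row back. The column name prior_rain_months is made up for this illustration, and the WITH clause only restates the sample data (using ANSI date literals) so the query runs on its own. On the sample data it should give the same TRENDS values as the query above.

```sql
with
     input_data ( mtd, loc_id, rained ) as (
       select date '2016-04-01', 1, 'Y' from dual union all
       select date '2016-04-01', 2, 'N' from dual union all
       select date '2016-05-01', 1, 'N' from dual union all
       select date '2016-05-01', 2, 'N' from dual union all
       select date '2016-06-01', 1, 'N' from dual union all
       select date '2016-06-01', 2, 'N' from dual union all
       select date '2016-07-01', 1, 'Y' from dual union all
       select date '2016-07-01', 2, 'N' from dual union all
       select date '2016-08-01', 1, 'N' from dual union all
       select date '2016-08-01', 2, 'Y' from dual
     )
select mtd, loc_id, rained,
       case
         when rained = 'N'          then 'No Rain'
         when prior_rain_months = 0 then 'New'        -- no earlier 'Y' for this loc_id
         else                            'Carryover'  -- at least one earlier 'Y'
       end as trends
from (
       select mtd, loc_id, rained,
              -- hypothetical helper column: number of earlier rows with rained = 'Y';
              -- the frame is empty on the first row of each loc_id, so COUNT returns 0
              count(case when rained = 'Y' then 1 end)
                over (partition by loc_id
                      order by mtd
                      rows between unbounded preceding and 1 preceding)
                as prior_rain_months
       from   input_data
     )
order by loc_id, mtd;
```

The frame clause is what removes the need for a month-dependent number of LAG() calls: every row looks back over all earlier rows of its own partition, however many there are.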
answered 2016-09-28T19:01:20.447

There are two parts to the SQL below.

(i) Calculate the ROW_NUMBER value for the rained attribute at the loc_id, rained level.
(ii) Get the count at the loc_id, rained partition level.

With these two values computed, we can write the CASE WHEN logic to derive the trend as you require.

SELECT mtd,
       loc_id,
       rained,
       CASE WHEN rained = 'N' THEN 'No Rain'
            WHEN rained = 'Y' AND rn = 1 THEN 'New'
            ELSE 'Carryover'
        END AS Trends
  FROM
        (
            SELECT mtd,
                   loc_id,
                   rained,
                   -- position of this row among rows with the same loc_id and rained value
                   ROW_NUMBER() OVER ( PARTITION BY loc_id,rained ORDER BY mtd ) AS rn,
                   -- total rows per loc_id and rained value (not used by the CASE above)
                   COUNT(*) OVER ( PARTITION BY loc_id,rained ) AS count_locid_rained
              FROM INPUT
         ) X
 ORDER BY loc_id, mtd;
answered 2016-09-28T18:57:17.870

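A solution using MATCH_RECOGNIZE (requires Oracle 12c or later). Test the different solutions on your own data set; I have been told that MATCH_RECOGNIZE can be much faster than the other solutions, but that depends on many factors.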

select loc_id, mtd, rained, trends
from input_data
  match_recognize (
    partition by loc_id, rained
    order by     mtd
    measures     mtd as mtd,
                 case when rained = 'N' then 'No Rain'
                      else case when match_number() = 1 then 'New' else 'Carryover' end
                      end as trends
    pattern (a)
    define a as 0 = 0
  )
order by loc_id, mtd;
answered 2016-09-28T21:46:50.743