sql - Collapse date records only if the value doesn't change - Oracle SQL

Question

I need your help with the following problem in Oracle SQL.

Table 1 (input)

Emp_ID  Start_Date  End_Date    Rating  Department  Salary
2000    01012011    01012012    A       HR          10000
2000    01012012    01012013    A+      HR          20000
2000    01012013    12319999    A       HR          20000
3000    01012011    01012012    B       Operations  50000
3000    01012012    12319999    B       Operations  60000

Table 2 (output)

Emp_ID  Start_Date  End_Date    Rating  Department
2000    01012011    12312011    A       HR
2000    01012012    12312012    A+      HR
2000    01012013    12319999    A       HR
3000    01012011    12319999    B       Operations

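Date records should be collapsed only when the employee's rating is the same in the next consecutive date range, and the collapsing should continue until the rating changes.

I hope I have made my question clear.

I looked at other answers and think I need the lead and lag functions, but it would be great if someone could give me some pointers on how to get started.

Thanks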

score 1 · Accepted Answer

This seems a bit convoluted, so I'd be interested in improvements.

select distinct emp_id,
    nvl(x_start_date,
        lag(x_start_date)
            over (partition by emp_id
                order by rn)) as start_date,
    nvl(x_end_date,
        lead(x_end_date)
            over (partition by emp_id
                order by rn nulls first))
                    as end_date,
        rating,
        department
from (
    select emp_id, start_date, end_date, rating, department,
        case start_date
            when lag(end_date)
                over (partition by emp_id, rating, department
                    order by start_date) then null
            else start_date end as x_start_date,
        case end_date
            when lead(start_date)
                over (partition by emp_id, rating, department
                    order by start_date) then null
            else end_date end as x_end_date,
        rownum as rn
    from table1
)
where x_start_date is not null or x_end_date is not null
order by emp_id, start_date
/

With this test data:

    EMP_ID START_DA END_DATE RA DEPARTMENT               SALARY
---------- -------- -------- -- -------------------- ----------
      2000 01012010 01012011 A  HR                         9000
      2000 01012011 01012012 A  HR                        10000
      2000 01012012 01012013 A+ HR                        20000
      2000 01012013 01012014 A  HR                        20000
      2000 01012014 12319999 A  HR                        21000
      3000 01012011 01012012 B  Operations                50000
      3000 01012012 12319999 B  Operations                60000
      4000 07012011 07012012 B  Operations                50000
      4000 07012012 07012013 B  Operations                50000
      4000 07012013 12319999 B  Operations                60000

I get:

    EMP_ID START_DA END_DATE RA DEPARTMENT
---------- -------- -------- -- --------------------
      2000 01012010 01012012 A  HR
      2000 01012012 01012013 A+ HR
      2000 01012013 12319999 A  HR
      3000 01012011 12319999 B  Operations
      4000 07012011 12319999 B  Operations

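I also tried an emp_id (4000) with three contiguous date ranges, and it handled that OK: the outer where clause makes the intermediate entries essentially disappear. Edited to add: this now also works with your additional date range for 2000/A, since I fixed the ordering in the outer lead/lag partitions.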

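The inner query blanks out everything except the first start date and the last end date of each contiguous block. The outer query then uses a second round of lead and lag to merge those into identical rows, which the distinct collapses.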
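Since improvements were invited, here is a minimal sketch of a different technique (not the query above): flag each row that starts a new block, turn the flags into a running block number, and aggregate per block. It assumes the same table1 and DATE columns, and it treats two rows as contiguous when the earlier end_date equals the later start_date, as the inner query above does. The names is_new_block and grp are made up for illustration.

```sql
select emp_id,
       min(start_date) as start_date,
       max(end_date)   as end_date,
       rating,
       department
from (
    -- running total of the flags gives every contiguous block its own number
    select t.*,
           sum(is_new_block)
               over (partition by emp_id order by start_date) as grp
    from (
        -- 0 when the row continues the previous row for the same employee,
        -- 1 otherwise (including the first row, where lag() returns null)
        select emp_id, start_date, end_date, rating, department,
               case
                   when lag(rating)
                            over (partition by emp_id order by start_date) = rating
                    and lag(department)
                            over (partition by emp_id order by start_date) = department
                    and lag(end_date)
                            over (partition by emp_id order by start_date) = start_date
                   then 0
                   else 1
               end as is_new_block
        from table1
    ) t
)
group by emp_id, rating, department, grp
order by emp_id, min(start_date)
/
```

Walking through the test data by hand, this should give the same five rows as the result above, but it has not been run against the original table. It needs only one pass of lag and no distinct, at the cost of an extra level of nesting.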

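I'm assuming start_date and end_date are DATE fields, not VARCHAR2, and that you have NLS_DATE_FORMAT set to MMDDYYYY. If they are stored as strings, which is a bad idea, you will need to_date() in quite a few places to make the ordering work properly.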
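To illustrate the point about strings: a rough sketch, assuming the two columns were VARCHAR2 holding MMDDYYYY text (an assumption, the question does not say). Sorting the raw strings would compare the month digits first, so every date comparison and order by would have to convert:

```sql
-- Only relevant if the columns are VARCHAR2; with real DATE columns
-- none of the to_date() calls are needed.
select emp_id,
       to_date(start_date, 'MMDDYYYY') as start_dt,
       to_date(end_date, 'MMDDYYYY')   as end_dt,
       rating,
       department
from table1
order by emp_id, to_date(start_date, 'MMDDYYYY')
/

-- With DATE columns, this makes the session display dates like the output above.
alter session set nls_date_format = 'MMDDYYYY'
/
```

The same conversion would have to be applied inside every lag, lead, and order by of the main query.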

score 1 · Accepted Answer

select *
from inputtable it1
left join inputtable it2 
       on it1.emp_id = it2.emp_id
      and it1.rating = it2.rating
      and it1.start_date < it2.start_date
      and not exists(select * from inputtable it2a
                     where it1.emp_id = it2a.emp_id
                       and ((it1.rating <> it2a.rating
                         and it1.start_date < it2a.start_date
                         and it2.start_date > it2a.start_date)
                         or (it1.rating = it2a.rating
                         and exists(select * from inputtable it2b
                                    where it2a.emp_id = it2b.emp_id
                                      and it2a.rating = it2b.rating
                                      and it2a.end_date + 1 = it2b.start_date))))
where not exists(select * from inputtable it1a
                 where it1.emp_id = it1a.emp_id
                   and it1.rating = it1a.rating
                   and it1.start_date = it1a.end_date + 1)
