sql - Using RANK OVER PARTITION to compare against a previous row's result

Question

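I am working with a dataset that contains (among other columns) a userID and a startDate. The goal is to create a new column, "isRehire", that compares each user's startDate with their previous startDates.

If the difference between startDates is within 1 year, then isRehire = Y.

The difficulty, and my question, arises when a user has more than 2 startDates. If the difference between the 3rd and the 1st startDate is more than one year, the 3rd startDate becomes the new "base date" for rehire comparisons.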

userID	startDate	isRehire
123	07/24/19	N
123	02/04/20	Y
123	08/25/20	N
123	12/20/20	Y
123	06/15/21	Y
123	08/20/21	Y
123	08/30/21	N

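In the example above you can see the problem visualized. For the first startDate, 07/24/19, the user is not a rehire. For the second startDate, 02/04/20, they are a rehire. For the 3rd startDate, 08/25/20, the user is not a rehire because more than 1 year has passed since their initial startDate. This becomes the new "anchor" date.

The next 3 instances are all Y because they fall within 1 year of the new "anchor" date of 08/25/20. The final startDate of 08/30/21 is more than a year past 08/25/20, so it is "N", and the "cycle" resets again with 08/30/21 as the new "anchor" date.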
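The anchor rule described above can be sketched procedurally before trying to express it in SQL. This is only a minimal Python sketch, not part of the original question: the helper names `add_one_year` and `flag_rehires` are hypothetical, and it assumes "within 1 year" means on or before the same calendar date one year after the anchor.

```python
from datetime import date


def add_one_year(d: date) -> date:
    """Return the same calendar day one year later (Feb 29 falls back to Feb 28)."""
    try:
        return d.replace(year=d.year + 1)
    except ValueError:
        # Feb 29 has no counterpart in a non-leap year.
        return d.replace(year=d.year + 1, day=28)


def flag_rehires(start_dates):
    """Return (startDate, isRehire) pairs for a single user's start dates."""
    results = []
    anchor = None
    for d in sorted(start_dates):
        if anchor is None or d > add_one_year(anchor):
            # First start date, or more than a year past the anchor:
            # not a rehire, and this date becomes the new anchor.
            anchor = d
            results.append((d, "N"))
        else:
            results.append((d, "Y"))
    return results


# The seven start dates for user 123 from the table above.
sample = [
    date(2019, 7, 24), date(2020, 2, 4), date(2020, 8, 25), date(2020, 12, 20),
    date(2021, 6, 15), date(2021, 8, 20), date(2021, 8, 30),
]
flags = [flag for _, flag in flag_rehires(sample)]
print(flags)
```

Each row's result depends on an anchor that was itself decided by earlier rows, which is why a plain rank or a comparison with only the previous row is not enough.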

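My goal was to use RANK OVER PARTITION for this because, from my testing, I believe there must be a way to assign a rank to the dates, which could then be wrapped in an outer select statement to write a CASE expression. It is entirely possible, though, that I am barking up the wrong tree.

Below you can see some of the code I have tried for this, though without much success so far.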

select TestRank,
       startDate,
       userID,
       CASE WHEN TestRank = TestRank THEN (TestRank - 1) ELSE '' END AS TestRank2
from
(
    select userID,
           startDate,
           RANK() OVER (PARTITION BY userID
                        ORDER BY startDate desc) as TestRank
    from [MyTable] a
    WHERE a.userID = [int]
) b

score 1 · Accepted Answer

This is complicated logic, and window functions alone are not enough. To solve it you need iteration, or in SQL terms, a recursive CTE:

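with t as (
      -- number each user's start dates in chronological order
      select t.*, row_number() over (partition by id order by startdate) as seqnum
      from mytable t
     ),
     cte as (
      -- the first start date per user is never a rehire and is the first anchor
      select t.id, t.startdate, t.seqnum, 'N' as isrehire, t.startdate as anchordate
      from t
      where seqnum = 1
      union all
      -- step to the next start date, resetting the anchor when it is more than a year past the current one
      select t.id, t.startdate, t.seqnum,
             (case when t.startdate > dateadd(year, 1, cte.anchordate) then 'N' else 'Y' end),
             (case when t.startdate > dateadd(year, 1, cte.anchordate) then t.startdate else cte.anchordate end)
      from cte join
           t
           on t.seqnum = cte.seqnum + 1 and
              t.id = cte.id  -- keep each user's chain separate
     )
select *
from cte
order by id, startdate;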

Here is a db<>fiddle.
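For readers without a SQL Server instance at hand, the same recursion can be checked against the question's sample rows in SQLite, driven from Python's standard `sqlite3` module. This is a sketch under several assumptions rather than the answer's exact query: it uses SQLite syntax (`with recursive`, and `date(x, '+1 year')` in place of `dateadd`), stores dates as ISO-formatted strings so that text comparison orders them correctly, requires a SQLite build with window functions (3.25 or later), and joins on `id` so rows from different users are never chained together. The second user (456) is hypothetical data added only to exercise that join. SQLite also normalizes a Feb 29 anchor plus one year to Mar 1, which may differ from `dateadd`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table mytable (id integer, startdate text)")
rows = [
    (123, "2019-07-24"), (123, "2020-02-04"), (123, "2020-08-25"),
    (123, "2020-12-20"), (123, "2021-06-15"), (123, "2021-08-20"),
    (123, "2021-08-30"),
    (456, "2021-01-01"),  # hypothetical second user, not in the question
]
conn.executemany("insert into mytable values (?, ?)", rows)

query = """
with recursive t as (
      select m.*, row_number() over (partition by id order by startdate) as seqnum
      from mytable m
     ),
     cte as (
      select t.id, t.startdate, t.seqnum, 'N' as isrehire, t.startdate as anchordate
      from t
      where seqnum = 1
      union all
      select t.id, t.startdate, t.seqnum,
             case when t.startdate > date(cte.anchordate, '+1 year') then 'N' else 'Y' end,
             case when t.startdate > date(cte.anchordate, '+1 year') then t.startdate else cte.anchordate end
      from cte join
           t
           on t.seqnum = cte.seqnum + 1 and t.id = cte.id
     )
select id, startdate, isrehire
from cte
order by id, startdate
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

The flags printed for user 123 can be compared row by row with the isRehire column in the question's table.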
