sql - Nasty SQL query: is there a way to find the first and last rows in grouped sets without a cursor?

Question

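My data looks like this:

[Image: data sample]

For records with the same ClientId, I need to group the consecutive rows (ordered by CpId) whose PlaceId is not null, and find the first and last row in each group, so that I can retrieve the DateAdmitted value from the first row and the DateDischarged value from the last row. So the data above needs to be organized like this, and then filtered down to the values I need:

[Image: the same data organized into groups of consecutive rows]

Using the example above, I would want the following, based on ClientId: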

ClientId    FirstCpIdInSet    DateAdmitted    LastCpIdInSet    DateDischarged
-----------------------------------------------------------------------------
1967        NULL              NULL            NULL             NULL
1983        45                1986-12-29      45               1987-10-09
1983        47                1990-10-01      49               2009-04-12
1983        52                2009-08-31      52               2009-11-30
1988        62                1997-12-15      65               2000-01-07

ClientId 1967 can be excluded from the result set, because it never has a row where PlaceId is not null. A few other things to note:

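This is drawn from a temporary table created with CpId as an IDENTITY column, and the table is populated with a strict ORDER BY, so CpId is sequential in the desired order.
For rows with a single ClientId and consecutive PlaceIds, DateAdmitted should equal the DateDischarged of the previous row.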

If at all possible, I would really like to do this without a cursor, but after puzzling over it for two days I just can't figure it out. This is on SQL Server 2008 R2.

score 2 · Accepted Answer

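Nasty query indeed. Like most SQL problems, it comes down to tackling the different aspects of the problem in the right order. My solution does not use a cursor. It does use OUTER APPLY and PARTITION BY.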

Key realization: row_number() over (partition by xx order by yy) does not work on its own, because yy will usually span multiple xx partitions.
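
One way to read this point is that a partition lumps together every row with the same value, even when those rows form separate runs. The sketch below illustrates that with Python's built-in sqlite3 module instead of SQL Server, purely so it can run anywhere; the table name `demo` and its four rows are made up for the illustration.

```python
import sqlite3

# Hypothetical data: state 'a' occurs in two separate runs (ids 1-2 and id 4).
conn = sqlite3.connect(":memory:")
conn.execute("create table demo (id integer primary key, state text)")
conn.executemany(
    "insert into demo values (?, ?)",
    [(1, "a"), (2, "a"), (3, "b"), (4, "a")],
)

# Number the rows within each state, ordered by id.
numbered = conn.execute(
    """
    select id, state,
           row_number() over (partition by state order by id) as rn
    from demo
    order by id
    """
).fetchall()

for row in numbered:
    print(row)
```

Here id 4 starts a new run of 'a', yet it receives row number 3 rather than 1, because it shares a partition with ids 1 and 2. Filtering on row number 1 would therefore miss that run.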

Sample data:

id  state
1   a
2   a
3   b
4   c

Desired ranges:

1 <= x < 3
3 <= x < 4
4 <= x

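Step 1 - Use OUTER APPLY to find the next state transition for each row. This lets you examine the next value for each row by whatever criteria you need. This step may produce more information than you want, because several rows can transition to the same value. In this example, ids 1 and 2 both transition at id 3.

Pseudocode: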

select t1.id, t1.state, t3.id, t3.state
from table1 t1
outer apply
(
  select
    -- only grab one row
    top 1 t2.id, t2.state
  from table1 t2
  where
    -- grab a row that comes after the current row...
    t1.id < t2.id
    -- ...and whose state differs, so that it is the next transition
    and t1.state <> t2.state
    -- add whatever join logic you need for your case
    and t1.memberid = t2.memberid
  -- make sure you get the correct order, typically an identity or time
  order by t2.id asc
) t3

This query produces something like:

id  state id    state
1   a     3     b
2   a     3     b
3   b     4     c
4   c     null  null

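We don't want the row with id = 2.

Step 2 - Partition by the transition column (the id found in step 1) and number the rows within each partition. The row where a state transition occurs always gets row number 1. Just filter on 1 and you have the state transitions.

Preliminary result: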

row_number  id  state   id  state
1           1   a       3   b
2           2   a       3   b
1           3   b       4   c
1           4   c      null null

Filtered result:

row_number  id  state   id  state
1           1   a       3   b
1           3   b       4   c
1           4   c      null null
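
The two steps can be put together as a rough sketch. It runs on Python's built-in sqlite3 module rather than SQL Server, so a correlated subquery stands in for OUTER APPLY (which SQLite does not have); the function name `find_transitions` and the column aliases `next_id` and `rn` are assumptions made for this illustration, not part of the answer.

```python
import sqlite3


def find_transitions(conn):
    """Return the first row of each run of identical states."""
    return conn.execute(
        """
        select id, state, next_id
        from (
            -- Step 2: number the rows that share the same next transition
            select x.*,
                   row_number() over (partition by next_id order by id) as rn
            from (
                -- Step 1: for each row, find the id of the next row
                -- whose state differs (stand-in for OUTER APPLY ... TOP 1)
                select t1.id, t1.state,
                       (select min(t2.id)
                        from table1 t2
                        where t2.id > t1.id
                          and t2.state <> t1.state) as next_id
                from table1 t1
            ) x
        ) y
        where rn = 1
        order by id
        """
    ).fetchall()


# The sample data from the answer.
conn = sqlite3.connect(":memory:")
conn.execute("create table table1 (id integer primary key, state text)")
conn.executemany(
    "insert into table1 values (?, ?)",
    [(1, "a"), (2, "a"), (3, "b"), (4, "c")],
)

transitions = find_transitions(conn)
print(transitions)  # [(1, 'a', 3), (3, 'b', 4), (4, 'c', None)]
```

Each returned row gives the start of a range and the id at which the next range begins, which matches the desired ranges 1 <= x < 3, 3 <= x < 4 and 4 <= x.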

score 1 · Accepted Answer

You don't say what your first and last are based on. Let me assume it is CPID. You can do this with ranking functions:

select ClientID, PlaceId,
       max(CpID) as MaxCpId,
       min(case when seqnumasc = 1 then DateAdmitted end) as DateAdmitted,
       max(case when seqnumdesc = 1 then DateDischarged end) as DateDischarged
from (select t.*,
             row_number() over (partition by clientID, placeID order by cpid) as seqnumasc,
             row_number() over (partition by clientID, placeID order by cpid desc) as seqnumdesc
      from t
     ) t
where placeID is not null
group by ClientID, placeID

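This adds sequence numbers to identify the first and last row in each group. But why can't you just use min and max on the admitted and discharged dates?

Based on the enhanced information . . .

The problem now seems to be defining "sets" of records based on the following conditions:

Consecutive CPIDs
Same client, same company
Place is not null

If so, the following gives you a "set id". It uses a trick for grouping consecutive values: subtract a sequence number from the CPID. The difference is constant across a run of consecutive values, so it serves as a set id.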

select clientid, setid,
       min(DateAdmitted) as DateAdmitted,
       max(DateDischarged) as DateDischarged,
       min(cpid) as minCPID,
       max(cpid) as maxCPID
from (select clientid, setid, cpid, DateAdmitted, DateDischarged,
             row_number() over (partition by clientid, setid order by cpid) as seqnum,
             count(*) over (partition by clientid, setid) as setsize
      from (select t.*,
                   (cpid - row_number() over (partition by clientid order by cpid)
                   ) as setid
            from t
            where PlaceID is not NULL
           ) t
    ) t
group by clientid, setid
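
To see the subtraction trick at work, here is a small sketch on Python's built-in sqlite3 module (not SQL Server). The rows are entirely made up for illustration and are not the asker's data; the function name `find_sets` and the aliases `first_cpid` and `last_cpid` are likewise assumptions. It keeps the answer's use of min and max for the dates.

```python
import sqlite3


def find_sets(conn):
    """Group consecutive non-null-PlaceId rows per client into sets."""
    return conn.execute(
        """
        select clientid,
               min(cpid) as first_cpid,
               min(DateAdmitted) as DateAdmitted,
               max(cpid) as last_cpid,
               max(DateDischarged) as DateDischarged
        from (select t.*,
                     -- constant within a run of consecutive cpids
                     cpid - row_number() over (partition by clientid
                                               order by cpid) as setid
              from t
              where PlaceId is not null
             ) s
        group by clientid, setid
        order by clientid, first_cpid
        """
    ).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute(
    "create table t (cpid integer primary key, clientid integer, "
    "PlaceId integer, DateAdmitted text, DateDischarged text)"
)
# Hypothetical rows: client 10 never has a PlaceId; client 20 has a gap at cpid 3.
conn.executemany(
    "insert into t values (?, ?, ?, ?, ?)",
    [
        (1, 10, None, None, None),
        (2, 20, 7, "2001-01-01", "2001-02-01"),
        (3, 20, None, None, None),
        (4, 20, 7, "2002-01-01", "2002-03-01"),
        (5, 20, 8, "2002-03-01", "2002-06-01"),
        (6, 30, 9, "2003-01-01", "2003-04-01"),
        (7, 30, 9, "2003-04-01", "2003-09-01"),
    ],
)

sets = find_sets(conn)
for row in sets:
    print(row)
```

For client 20 the differences are 2 - 1 = 1, 4 - 2 = 2 and 5 - 3 = 2, so cpid 2 forms one set and cpids 4 to 5 form another. Client 10 drops out because it has no row with a PlaceId.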
