sql - What is the smartest way to ignore empty values and show a single value for columns that are part of a larger query?

Question

I'm working with a table that looks like this:

|   |  Name | CaseID | UsrID | DL_NO |   SSN   | Address     | DateSeen   |
|---|:-----:|:------:|:-----:|:-----:|:-------:|-------------|------------|
| 1 | Smith |  AB190 | 88885 |       | 1234567 | 222 Side Rd | 01/01/2020 |
| 2 | Smith |  AB186 | 88885 | B0938 |         |             | 10/01/2019 |
| 3 | Smith |  AB170 | 88885 |       | 1234567 | 123 Side Rd | 09/01/2019 |
| 4 | Smith |  AB168 | 88885 | B0938 |         | 123 Road St | 03/05/2019 |
| 5 | Smith |  AB132 | 88885 | B0938 | 1234567 |             | 03/01/2019 |
| 6 | Smith |  AB102 | 88885 | B0938 | 1234567 | 123 Road St | 02/01/2019 |

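I can't wrap my head around how to properly handle data where some fields are updated over time and others are occasionally missing.

What I'd like to see is the most recent non-empty value for each column: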

|   |  Name | NumOfCases | UsrID | DL_NO |   SSN   | Address     |
|---|:-----:|:----------:|:-----:|:-----:|:-------:|-------------|
| 1 | Smith |      6     | 88885 | B0938 | 1234567 | 222 Side Rd |

I'm using this:

SELECT TOP 50 Name, UsrID, COUNT(DISTINCT CaseID) as NumofCases
FROM People
WHERE DateSeen between '01/31/2019' and '10/02/2019'
GROUP BY Name, UsrID
ORDER BY DateSeen desc

which returns:

|   |  Name | UsrID | NumofCases |
|---|:-----:|-------|:----------:|
| 1 | Smith | 88885 |      6     |

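That worked fine until I realized I needed the other fields as well.

When I try a similar query with the additional columns, the GROUP BY they force on me obviously breaks the "NumofCases" count.

Even a CROSS APPLY in which I "SELECT TOP 1" ends up requiring a GROUP BY.

Any ideas?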

score 0 · Accepted Answer

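You can try the following query:

SELECT Name
      ,COUNT(DISTINCT CaseID) AS NumofCases -- plain aggregate: SQL Server does not allow DISTINCT with OVER
      ,UsrID
      ,MAX(DL_NO) DL_NO                     -- MAX skips NULLs, but returns the largest value, not necessarily the latest
      ,MAX(SSN) SSN
      ,MAX(Address) Address
      ,MAX(DateSeen) DateSeen
FROM People
WHERE DateSeen BETWEEN '01/31/2019' AND '10/02/2019'
GROUP BY Name, UsrID
ORDER BY DateSeen desc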
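
Note that MAX picks the largest value in each group, which is not always the most recent one. The following is a minimal sketch of one way to get the latest non-blank value per column instead, not a definitive solution. It assumes SQL Server 2012 or later (for FIRST_VALUE), that DL_NO, SSN and Address are character columns, and that DateSeen is a date or datetime column. The CTE names Latest and Counts are made up for illustration, and applying the date filter only to the count is also an assumption.

```sql
-- Sketch only: Latest and Counts are hypothetical CTE names.
WITH Latest AS (
    -- Every row of a user gets the same three values, so DISTINCT
    -- collapses the result to one row per UsrID.
    SELECT DISTINCT
           UsrID,
           FIRST_VALUE(DL_NO) OVER (
               PARTITION BY UsrID
               -- blank or NULL values sort last, then newest first
               ORDER BY CASE WHEN NULLIF(LTRIM(RTRIM(DL_NO)), '') IS NULL THEN 1 ELSE 0 END,
                        DateSeen DESC) AS DL_NO,
           FIRST_VALUE(SSN) OVER (
               PARTITION BY UsrID
               ORDER BY CASE WHEN NULLIF(LTRIM(RTRIM(SSN)), '') IS NULL THEN 1 ELSE 0 END,
                        DateSeen DESC) AS SSN,
           FIRST_VALUE(Address) OVER (
               PARTITION BY UsrID
               ORDER BY CASE WHEN NULLIF(LTRIM(RTRIM(Address)), '') IS NULL THEN 1 ELSE 0 END,
                        DateSeen DESC) AS Address
    FROM People
),
Counts AS (
    -- The case count is computed on its own, so the extra columns
    -- cannot change the grouping.
    SELECT Name, UsrID, COUNT(DISTINCT CaseID) AS NumofCases
    FROM People
    WHERE DateSeen BETWEEN '20190131' AND '20191002'
    GROUP BY Name, UsrID
)
SELECT c.Name, c.NumofCases, c.UsrID, l.DL_NO, l.SSN, l.Address
FROM Counts AS c
JOIN Latest AS l ON l.UsrID = c.UsrID;
```

If the latest values should also be limited to the date range, the same WHERE clause would need to be repeated inside Latest.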

score 0 · Accepted Answer

This may help:

SELECT TOP 50 Name, UsrID, COUNT(DISTINCT CaseID) as NumofCases,
(select top 1 b.DL_NO FROM People b where a.UsrID = b.UsrID and ltrim(rtrim(b.DL_NO)) <> '' and b.DL_NO is not null order by b.DateSeen desc) as DL_NO,
(select top 1 b.SSN  FROM People b where a.UsrID = b.UsrID and ltrim(rtrim(b.SSN)) <> '' and b.SSN is not null order by b.DateSeen desc) as SSN,
(select top 1 b.Address FROM People b where a.UsrID = b.UsrID and ltrim(rtrim(b.Address)) <> '' and b.Address is not null order by b.DateSeen desc) as Address
FROM People a
WHERE DateSeen between '01/31/2019' and '10/02/2019'
GROUP BY Name, UsrID
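
Since the question mentions CROSS APPLY, here is a rough sketch of the same idea written with OUTER APPLY. It is only an illustration under a few assumptions: SQL Server, character-typed DL_NO, SSN and Address columns, and made-up aliases (g, dl, s, ad). The count is aggregated first in a derived table, so the TOP 1 lookups run once per grouped row and need no GROUP BY of their own.

```sql
-- Sketch only: the aliases g, dl, s and ad are hypothetical.
SELECT g.Name, g.UsrID, g.NumofCases, dl.DL_NO, s.SSN, ad.Address
FROM (
    -- Aggregate first, so the lookups below never affect the count.
    SELECT Name, UsrID, COUNT(DISTINCT CaseID) AS NumofCases
    FROM People
    WHERE DateSeen BETWEEN '20190131' AND '20191002'
    GROUP BY Name, UsrID
) AS g
-- OUTER APPLY keeps the user even when no non-blank value exists.
-- The <> '' test is never true for NULL, so NULLs are skipped too.
OUTER APPLY (
    SELECT TOP 1 b.DL_NO
    FROM People b
    WHERE b.UsrID = g.UsrID AND LTRIM(RTRIM(b.DL_NO)) <> ''
    ORDER BY b.DateSeen DESC
) AS dl
OUTER APPLY (
    SELECT TOP 1 b.SSN
    FROM People b
    WHERE b.UsrID = g.UsrID AND LTRIM(RTRIM(b.SSN)) <> ''
    ORDER BY b.DateSeen DESC
) AS s
OUTER APPLY (
    SELECT TOP 1 b.Address
    FROM People b
    WHERE b.UsrID = g.UsrID AND LTRIM(RTRIM(b.Address)) <> ''
    ORDER BY b.DateSeen DESC
) AS ad;
```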

score 0 · Accepted Answer

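If you control the data model, normalizing your tables will make the problem much simpler. It also prevents data inconsistencies, such as the conflicting addresses in your sample data.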

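-- the column types below are assumptions; adjust them to your data
create table People (name varchar(100), usrid int primary key, dl_no varchar(20), ssn varchar(11), address varchar(200));
-- CASE is a reserved word, so the table name is bracketed
create table [Case] (usrid int, dateseen date, caseid varchar(20));

create view case_view as
select p.name, p.usrid, p.dl_no, p.ssn, p.address, c.dateseen, c.caseid
from People p join [Case] c on p.usrid = c.usrid;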

Then your query becomes trivial:

select name,usrid,dl_no,ssn,address,COUNT(DISTINCT CaseID) as NumofCases
from case_view
group by name,usrid,dl_no,ssn,address

You can add date or count filters as needed.
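
As a rough illustration of such filters, the sketch below adds a date range in WHERE and a minimum case count in HAVING. It assumes the case_view view from this answer exists; the threshold of 2 is an arbitrary example value, not something from the question.

```sql
-- Sketch only: the threshold of 2 cases is an arbitrary example.
SELECT name, usrid, dl_no, ssn, address,
       COUNT(DISTINCT caseid) AS NumofCases
FROM case_view
-- the date filter removes rows before they are grouped
WHERE dateseen BETWEEN '20190131' AND '20191002'
GROUP BY name, usrid, dl_no, ssn, address
-- the count filter is applied after grouping
HAVING COUNT(DISTINCT caseid) >= 2;
```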
