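Filtering out the irrelevant data
With any complex query, part of the art is to build the query up piece by piece, testing as you go.
I'm assuming that the table name is PatientMovements, and that:
Given pairs of rows such as ID = {6,7} and ID = {8,9}, it is correct to say that the row with a null discharge date should be ignored when there is also a record for the same patient (account), unit and admit date with a non-null discharge date.
So the first step is to generate the rows we need to work with, filtering the irrelevant data out of the table stored in the database. That is the UNION of two sets of data:
- Those rows with a non-null discharge date.
- Those rows with a null discharge date for which no other row has the same account, unit and admit date and a non-null discharge date.
Clearly, the first part of the UNION is: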
SELECT * FROM PatientMovements WHERE DischargeDate IS NOT NULL
Less obviously, the second part of the UNION is:
SELECT *
  FROM PatientMovements AS P1
 WHERE DischargeDate IS NULL
   AND NOT EXISTS
       (SELECT *
          FROM PatientMovements AS P2
         WHERE P1.Account = P2.Account
           AND P1.Unit = P2.Unit
           AND P1.AdmitDate = P2.AdmitDate
           AND P2.DischargeDate IS NOT NULL
       )
Now you can combine them into a single result set:
SELECT *
  FROM PatientMovements
 WHERE DischargeDate IS NOT NULL
UNION
SELECT *
  FROM PatientMovements AS P1
 WHERE DischargeDate IS NULL
   AND NOT EXISTS
       (SELECT *
          FROM PatientMovements AS P2
         WHERE P1.Account = P2.Account
           AND P1.Unit = P2.Unit
           AND P1.AdmitDate = P2.AdmitDate
           AND P2.DischargeDate IS NOT NULL
       )
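You can verify the query above by checking that it returns the rows with IDs 1..5, 7 and 9.
Warning: untested code. None of the SQL in this answer has been near a DBMS, so it is untested.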
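One way to run that check without access to the real database is to load a few rows into an in-memory SQLite database from Python. The sketch below is only an illustration: the sample rows, the column types, and the helper names `make_db` and `kept_ids` are invented here, and SQLite is not MS Access, so treat a passing check as evidence about the logic rather than about Access syntax.

```python
import sqlite3

# Hypothetical sample rows: (ID, Account, Unit, AdmitDate, DischargeDate).
# Rows 6/7 and 8/9 form the pairs described above; row 5 is a null
# discharge date with no matching discharged row, so it must be kept.
ROWS = [
    (1, "A1", "ICU",  "2012-01-01", "2012-01-03"),
    (2, "A1", "WARD", "2012-01-03", "2012-01-10"),
    (3, "A2", "ICU",  "2012-02-01", "2012-02-02"),
    (4, "A2", "WARD", "2012-02-02", "2012-02-05"),
    (5, "A3", "ICU",  "2012-03-01", None),
    (6, "A4", "ICU",  "2012-04-01", None),
    (7, "A4", "ICU",  "2012-04-01", "2012-04-04"),
    (8, "A5", "WARD", "2012-05-01", None),
    (9, "A5", "WARD", "2012-05-01", "2012-05-06"),
]

UNION_QUERY = """
SELECT *
  FROM PatientMovements
 WHERE DischargeDate IS NOT NULL
UNION
SELECT *
  FROM PatientMovements AS P1
 WHERE DischargeDate IS NULL
   AND NOT EXISTS
       (SELECT *
          FROM PatientMovements AS P2
         WHERE P1.Account = P2.Account
           AND P1.Unit = P2.Unit
           AND P1.AdmitDate = P2.AdmitDate
           AND P2.DischargeDate IS NOT NULL
       )
"""


def make_db():
    """Create an in-memory database holding the hypothetical sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE PatientMovements ("
        "ID INTEGER PRIMARY KEY, Account TEXT, Unit TEXT, "
        "AdmitDate TEXT, DischargeDate TEXT)"
    )
    conn.executemany("INSERT INTO PatientMovements VALUES (?, ?, ?, ?, ?)", ROWS)
    return conn


def kept_ids(conn):
    """Return the sorted IDs of the rows that survive the UNION filter."""
    return sorted(row[0] for row in conn.execute(UNION_QUERY))


if __name__ == "__main__":
    print(kept_ids(make_db()))  # prints [1, 2, 3, 4, 5, 7, 9]
```

With these made-up rows, IDs 6 and 8 are dropped because rows 7 and 9 carry the discharge dates for the same account, unit and admit date.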
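Applying the previous lessons
You can then apply what you learned from the other question to order the data, calculate the date differences, and so on. The only complication is that you have to write the query out twice, which is painful (unless MS Access 2003 supports the 'WITH' clause, also known as common table expressions).
But isn't there a single query that gives the desired output?
Of course; a UNION is a single query. I suppose you could also write: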
SELECT *
  FROM PatientMovements AS P1
 WHERE (DischargeDate IS NOT NULL)
    OR (DischargeDate IS NULL
        AND NOT EXISTS
            (SELECT *
               FROM PatientMovements AS P2
              WHERE P1.Account = P2.Account
                AND P1.Unit = P2.Unit
                AND P1.AdmitDate = P2.AdmitDate
                AND P2.DischargeDate IS NOT NULL
            )
       )
I can't immediately think of a more compact way to write the query.
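As a rough check that the single-SELECT form selects the same rows as the UNION, the sketch below runs both against an in-memory SQLite database. The sample rows and the helper names `make_db` and `ids` are invented for illustration. It also tries a third variant that is a suggestion rather than part of the original answer: because `A OR (NOT A AND B)` is logically the same as `A OR B`, the inner `DischargeDate IS NULL` test can be dropped. Note that UNION removes duplicate rows while the OR forms do not; the results only coincide here because every row has a distinct ID.

```python
import sqlite3

# Hypothetical sample rows: (ID, Account, Unit, AdmitDate, DischargeDate).
ROWS = [
    (1, "A1", "ICU",  "2012-01-01", "2012-01-03"),
    (2, "A1", "WARD", "2012-01-03", "2012-01-10"),
    (3, "A2", "ICU",  "2012-02-01", "2012-02-02"),
    (4, "A2", "WARD", "2012-02-02", "2012-02-05"),
    (5, "A3", "ICU",  "2012-03-01", None),
    (6, "A4", "ICU",  "2012-04-01", None),
    (7, "A4", "ICU",  "2012-04-01", "2012-04-04"),
    (8, "A5", "WARD", "2012-05-01", None),
    (9, "A5", "WARD", "2012-05-01", "2012-05-06"),
]

# The correlated sub-query shared by all three forms; P1 is the outer alias.
NOT_EXISTS = """NOT EXISTS
       (SELECT *
          FROM PatientMovements AS P2
         WHERE P1.Account = P2.Account
           AND P1.Unit = P2.Unit
           AND P1.AdmitDate = P2.AdmitDate
           AND P2.DischargeDate IS NOT NULL)"""

UNION_FORM = f"""
SELECT * FROM PatientMovements WHERE DischargeDate IS NOT NULL
UNION
SELECT * FROM PatientMovements AS P1
 WHERE DischargeDate IS NULL AND {NOT_EXISTS}
"""

OR_FORM = f"""
SELECT * FROM PatientMovements AS P1
 WHERE (DischargeDate IS NOT NULL)
    OR (DischargeDate IS NULL AND {NOT_EXISTS})
"""

# Suggested shorter variant (an assumption of this sketch, not the author's).
SHORT_FORM = f"""
SELECT * FROM PatientMovements AS P1
 WHERE DischargeDate IS NOT NULL OR {NOT_EXISTS}
"""


def make_db():
    """Create an in-memory database holding the hypothetical sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE PatientMovements ("
        "ID INTEGER PRIMARY KEY, Account TEXT, Unit TEXT, "
        "AdmitDate TEXT, DischargeDate TEXT)"
    )
    conn.executemany("INSERT INTO PatientMovements VALUES (?, ?, ?, ?, ?)", ROWS)
    return conn


def ids(conn, sql):
    """Return the sorted IDs selected by the given query."""
    return sorted(row[0] for row in conn.execute(sql))


if __name__ == "__main__":
    conn = make_db()
    for name, sql in [("UNION", UNION_FORM), ("OR", OR_FORM), ("SHORT", SHORT_FORM)]:
        print(name, ids(conn, sql))
```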
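Building the UNION into 'the other answer'
The accepted answer to the other question has two possible solutions (amended per the comments, and reformatted):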
SELECT T1.ID, T1.AccountNumber, T1.Date,
       MIN(T2.Date) AS NextDate,
       DATEDIFF("D", T1.Date, MIN(T2.Date)) AS DaysDiff
  FROM YourTable T1
  JOIN YourTable T2
    ON T1.AccountNumber = T2.AccountNumber AND T2.Date > T1.Date
 GROUP BY T1.ID, T1.AccountNumber, T1.Date
Or:
SELECT ID, AccountNumber, Date, NextDate,
       DATEDIFF("D", Date, NextDate) AS DaysDiff
  FROM (SELECT ID, AccountNumber, Date,
               (SELECT MIN(Date)
                  FROM YourTable T2
                 WHERE T2.AccountNumber = T1.AccountNumber
                   AND T2.Date > T1.Date
               ) AS NextDate
          FROM YourTable T1
       ) AS T
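As noted in the comments, the absence of a table name in the question led to different table names in the answers; what I have called PatientMovements in this answer was called YourTable in the other. Another difference is that the original question did not include the Unit or DischargeDate columns in its data, and the column names differ too (Account and AdmitDate here, AccountNumber and Date there), so adjust them to match your actual table. However, the UNION query I gave (used here in its single-SELECT OR form) produces the relevant data for running those queries, so all that remains is to write it into the other answer in place of YourTable. That leads to: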
SELECT T1.ID, T1.AccountNumber, T1.Date,
       MIN(T2.Date) AS NextDate,
       DATEDIFF("D", T1.Date, MIN(T2.Date)) AS DaysDiff
  FROM (SELECT *
          FROM PatientMovements AS P1
         WHERE (DischargeDate IS NOT NULL)
            OR (DischargeDate IS NULL
                AND NOT EXISTS
                    (SELECT *
                       FROM PatientMovements AS P2
                      WHERE P1.Account = P2.Account
                        AND P1.Unit = P2.Unit
                        AND P1.AdmitDate = P2.AdmitDate
                        AND P2.DischargeDate IS NOT NULL
                    )
               )
       ) AS T1
  JOIN (SELECT *
          FROM PatientMovements AS P1
         WHERE (DischargeDate IS NOT NULL)
            OR (DischargeDate IS NULL
                AND NOT EXISTS
                    (SELECT *
                       FROM PatientMovements AS P2
                      WHERE P1.Account = P2.Account
                        AND P1.Unit = P2.Unit
                        AND P1.AdmitDate = P2.AdmitDate
                        AND P2.DischargeDate IS NOT NULL
                    )
               )
       ) AS T2
    ON T1.AccountNumber = T2.AccountNumber AND T2.Date > T1.Date
 GROUP BY T1.ID, T1.AccountNumber, T1.Date
Or:
SELECT ID, AccountNumber, Date, NextDate,
       DATEDIFF("D", Date, NextDate) AS DaysDiff
  FROM (SELECT ID, AccountNumber, Date,
               (SELECT MIN(Date)
                  FROM (SELECT *
                          FROM PatientMovements AS P1
                         WHERE (DischargeDate IS NOT NULL)
                            OR (DischargeDate IS NULL
                                AND NOT EXISTS
                                    (SELECT *
                                       FROM PatientMovements AS P2
                                      WHERE P1.Account = P2.Account
                                        AND P1.Unit = P2.Unit
                                        AND P1.AdmitDate = P2.AdmitDate
                                        AND P2.DischargeDate IS NOT NULL
                                    )
                               )
                       ) AS T2
                 WHERE T2.AccountNumber = T1.AccountNumber
                   AND T2.Date > T1.Date
               ) AS NextDate
          FROM (SELECT *
                  FROM PatientMovements AS P1
                 WHERE (DischargeDate IS NOT NULL)
                    OR (DischargeDate IS NULL
                        AND NOT EXISTS
                            (SELECT *
                               FROM PatientMovements AS P2
                              WHERE P1.Account = P2.Account
                                AND P1.Unit = P2.Unit
                                AND P1.AdmitDate = P2.AdmitDate
                                AND P2.DischargeDate IS NOT NULL
                            )
                       )
               ) AS T1
       ) AS T
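So, as long as you are careful, develop the query in pieces and then combine the pieces consistently, you can tame even the most fearsome-looking queries.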
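One way to keep the repeated pieces consistent is to hold the filter in a single string and substitute it wherever YourTable appeared, so that both copies are identical by construction. The sketch below does that from Python against an in-memory SQLite database. Several things in it are assumptions made for illustration: the sample rows, the helper names, the mapping of the other answer's AccountNumber and Date columns onto Account and AdmitDate, and the use of SQLite's `julianday()` in place of `DATEDIFF`, which SQLite does not provide.

```python
import sqlite3

# Hypothetical sample rows: (ID, Account, Unit, AdmitDate, DischargeDate).
ROWS = [
    (1, "A1", "ICU",  "2012-01-01", "2012-01-03"),
    (2, "A1", "WARD", "2012-01-03", "2012-01-10"),
    (3, "A2", "ICU",  "2012-02-01", "2012-02-02"),
    (4, "A2", "WARD", "2012-02-02", "2012-02-05"),
    (5, "A3", "ICU",  "2012-03-01", None),
    (6, "A4", "ICU",  "2012-04-01", None),
    (7, "A4", "ICU",  "2012-04-01", "2012-04-04"),
    (8, "A5", "WARD", "2012-05-01", None),
    (9, "A5", "WARD", "2012-05-01", "2012-05-06"),
]

# Fragment 1: the filter, written once.
FILTERED = """(SELECT *
          FROM PatientMovements AS P1
         WHERE (DischargeDate IS NOT NULL)
            OR (DischargeDate IS NULL
                AND NOT EXISTS
                    (SELECT *
                       FROM PatientMovements AS P2
                      WHERE P1.Account = P2.Account
                        AND P1.Unit = P2.Unit
                        AND P1.AdmitDate = P2.AdmitDate
                        AND P2.DischargeDate IS NOT NULL)))"""

# Fragment 2: the next-date query, with {src} where YourTable used to be.
# julianday() stands in for DATEDIFF, which SQLite lacks.
NEXT_DATE_TEMPLATE = """
SELECT ID, Account, AdmitDate, NextDate,
       CAST(julianday(NextDate) - julianday(AdmitDate) AS INTEGER) AS DaysDiff
  FROM (SELECT ID, Account, AdmitDate,
               (SELECT MIN(T2.AdmitDate)
                  FROM {src} AS T2
                 WHERE T2.Account = T1.Account
                   AND T2.AdmitDate > T1.AdmitDate
               ) AS NextDate
          FROM {src} AS T1
       ) AS T
 ORDER BY ID
"""


def make_db():
    """Create an in-memory database holding the hypothetical sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE PatientMovements ("
        "ID INTEGER PRIMARY KEY, Account TEXT, Unit TEXT, "
        "AdmitDate TEXT, DischargeDate TEXT)"
    )
    conn.executemany("INSERT INTO PatientMovements VALUES (?, ?, ?, ?, ?)", ROWS)
    return conn


def next_dates(conn):
    """Combine the two fragments and return one row per filtered movement."""
    sql = NEXT_DATE_TEMPLATE.format(src=FILTERED)
    return conn.execute(sql).fetchall()


if __name__ == "__main__":
    for row in next_dates(make_db()):
        print(row)
```

With these made-up rows, the first result is ID 1 with a next admit date two days later, and rows with no later admission for the same account get NULL in both NextDate and DaysDiff.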
Common table expressions
Note that the SQL standard has 'common table expressions' (CTEs), also known as the 'WITH clause', which can make things easier:
WITH YourTable AS
     (SELECT *
        FROM PatientMovements AS P1
       WHERE (DischargeDate IS NOT NULL)
          OR (DischargeDate IS NULL
              AND NOT EXISTS
                  (SELECT *
                     FROM PatientMovements AS P2
                    WHERE P1.Account = P2.Account
                      AND P1.Unit = P2.Unit
                      AND P1.AdmitDate = P2.AdmitDate
                      AND P2.DischargeDate IS NOT NULL
                  )
             )
     )
SELECT T1.ID, T1.AccountNumber, T1.Date,
       MIN(T2.Date) AS NextDate,
       DATEDIFF("D", T1.Date, MIN(T2.Date)) AS DaysDiff
  FROM YourTable T1
  JOIN YourTable T2
    ON T1.AccountNumber = T2.AccountNumber AND T2.Date > T1.Date
 GROUP BY T1.ID, T1.AccountNumber, T1.Date
Or:
WITH YourTable AS
     (SELECT *
        FROM PatientMovements AS P1
       WHERE (DischargeDate IS NOT NULL)
          OR (DischargeDate IS NULL
              AND NOT EXISTS
                  (SELECT *
                     FROM PatientMovements AS P2
                    WHERE P1.Account = P2.Account
                      AND P1.Unit = P2.Unit
                      AND P1.AdmitDate = P2.AdmitDate
                      AND P2.DischargeDate IS NOT NULL
                  )
             )
     )
SELECT ID, AccountNumber, Date, NextDate,
       DATEDIFF("D", Date, NextDate) AS DaysDiff
  FROM (SELECT ID, AccountNumber, Date,
               (SELECT MIN(Date)
                  FROM YourTable T2
                 WHERE T2.AccountNumber = T1.AccountNumber
                   AND T2.Date > T1.Date
               ) AS NextDate
          FROM YourTable T1
       ) AS T
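One of the main advantages of using a CTE is that the optimizer is told explicitly that the table expression is the same everywhere it is used, whereas when it is written out several times, the optimizer may not spot the commonality. In addition, writing the query out several times risks two 'supposedly identical' copies actually differing slightly because of an editing error; a CTE rules that out. Another advantage in the current context is that combining the CTE with the solutions from the other question is child's play.
Sadly, MS Access 2003 is unlikely to support CTEs. I share your pain; the DBMS I mainly use doesn't either.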
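For a DBMS that does accept the WITH clause, the CTE version can be tried out directly. The sketch below uses SQLite from Python purely as a stand-in; it says nothing about MS Access. The sample rows and helper names are invented, the columns Account and AdmitDate are assumed to play the roles of AccountNumber and Date, and `julianday()` replaces `DATEDIFF`, which SQLite does not have. Because the join is an inner join, rows with no later admission for the same account do not appear in the output.

```python
import sqlite3

# Hypothetical sample rows: (ID, Account, Unit, AdmitDate, DischargeDate).
ROWS = [
    (1, "A1", "ICU",  "2012-01-01", "2012-01-03"),
    (2, "A1", "WARD", "2012-01-03", "2012-01-10"),
    (3, "A2", "ICU",  "2012-02-01", "2012-02-02"),
    (4, "A2", "WARD", "2012-02-02", "2012-02-05"),
    (5, "A3", "ICU",  "2012-03-01", None),
    (6, "A4", "ICU",  "2012-04-01", None),
    (7, "A4", "ICU",  "2012-04-01", "2012-04-04"),
    (8, "A5", "WARD", "2012-05-01", None),
    (9, "A5", "WARD", "2012-05-01", "2012-05-06"),
]

# The filter is written once in the CTE and referenced twice as YourTable.
CTE_QUERY = """
WITH YourTable AS
     (SELECT *
        FROM PatientMovements AS P1
       WHERE (DischargeDate IS NOT NULL)
          OR (DischargeDate IS NULL
              AND NOT EXISTS
                  (SELECT *
                     FROM PatientMovements AS P2
                    WHERE P1.Account = P2.Account
                      AND P1.Unit = P2.Unit
                      AND P1.AdmitDate = P2.AdmitDate
                      AND P2.DischargeDate IS NOT NULL
                  )
             )
     )
SELECT T1.ID, T1.Account, T1.AdmitDate,
       MIN(T2.AdmitDate) AS NextDate,
       CAST(julianday(MIN(T2.AdmitDate)) - julianday(T1.AdmitDate)
            AS INTEGER) AS DaysDiff
  FROM YourTable AS T1
  JOIN YourTable AS T2
    ON T1.Account = T2.Account AND T2.AdmitDate > T1.AdmitDate
 GROUP BY T1.ID, T1.Account, T1.AdmitDate
 ORDER BY T1.ID
"""


def make_db():
    """Create an in-memory database holding the hypothetical sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE PatientMovements ("
        "ID INTEGER PRIMARY KEY, Account TEXT, Unit TEXT, "
        "AdmitDate TEXT, DischargeDate TEXT)"
    )
    conn.executemany("INSERT INTO PatientMovements VALUES (?, ?, ?, ?, ?)", ROWS)
    return conn


def next_dates_cte(conn):
    """Run the CTE query and return (ID, Account, AdmitDate, NextDate, DaysDiff)."""
    return conn.execute(CTE_QUERY).fetchall()


if __name__ == "__main__":
    for row in next_dates_cte(make_db()):
        print(row)
```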