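I have the following data in a SQL table:
I need to query the data to get the list of missing "familyid" values for each employee.
For example, for Employee 1021 I should get 2 and 5 as the IDs missing from the sequence, and for Employee 1027 I should get the missing numbers 1 and 6.
Any clues on how to query this?
Any help is appreciated.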
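Finding the first missing value
I would use the ROW_NUMBER window function to assign the "correct" sequence ID numbers, assuming the sequence restarts at 0 each time the employee changes: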
SELECT
    e.id,
    e.name,
    e.employee_number,
    e.relation,
    e.familyid,
    -- ROW_NUMBER starts at 1, so subtract 1 to get a sequence that starts at 0
    ROW_NUMBER() OVER(PARTITION BY e.employee_number ORDER BY e.familyid) - 1 AS sequenceid
FROM employee_members e
Then I would filter the result set to keep only the rows where familyid does not match the sequence ID:
SELECT *
FROM (
    SELECT
        e.id,
        e.name,
        e.employee_number,
        e.relation,
        e.familyid,
        ROW_NUMBER() OVER(PARTITION BY e.employee_number ORDER BY e.familyid) - 1 AS sequenceid
    FROM employee_members e
) a
WHERE a.familyid <> a.sequenceid
From there, it should be easy to group by employee_number and find the first missing sequence ID for each employee:
SELECT
    a.employee_number,
    MIN(a.sequenceid) AS first_missing
FROM (
    SELECT
        e.id,
        e.name,
        e.employee_number,
        e.relation,
        e.familyid,
        ROW_NUMBER() OVER(PARTITION BY e.employee_number ORDER BY e.familyid) - 1 AS sequenceid
    FROM employee_members e
) a
WHERE a.familyid <> a.sequenceid
GROUP BY a.employee_number
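To try the first-missing query without a SQL Server instance, here is a minimal sketch that runs it in SQLite through Python's built-in sqlite3 module. The sample rows are an assumption reconstructed from the numbers in the question (family IDs start at 0, 1021 lacks 2 and 5, 1027 lacks 1 and 6), the helper name first_missing is hypothetical, and window functions need SQLite 3.25 or later.

```python
import sqlite3

# Hypothetical sample data: (employee_number, familyid) pairs only.
ROWS = [(1021, f) for f in (0, 1, 3, 4, 6)] + \
       [(1027, f) for f in (0, 2, 3, 4, 5, 7)]

FIRST_MISSING_SQL = """
SELECT a.employee_number, MIN(a.sequenceid) AS first_missing
FROM (
    SELECT e.employee_number,
           e.familyid,
           ROW_NUMBER() OVER (PARTITION BY e.employee_number
                              ORDER BY e.familyid) - 1 AS sequenceid
    FROM employee_members e
) a
WHERE a.familyid <> a.sequenceid
GROUP BY a.employee_number
ORDER BY a.employee_number
"""


def first_missing(rows):
    """Load rows into an in-memory table and return (employee, first gap) pairs."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE employee_members (employee_number INTEGER, familyid INTEGER)"
    )
    conn.executemany("INSERT INTO employee_members VALUES (?, ?)", rows)
    result = conn.execute(FIRST_MISSING_SQL).fetchall()
    conn.close()
    return result


print(first_missing(ROWS))
```

With the sample rows this prints [(1021, 2), (1027, 1)], the lowest missing ID for each employee. An employee with no gaps produces no row at all.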
Finding all missing values
Extending the previous query, we can detect a missing value each time the difference between familyid and sequenceid changes:
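-- Warning: this is totally untested :-/
SELECT
    b.employee_number,
    -- The first row of each displaced run sits just above a gap,
    -- so the ID immediately below it is the missing one.
    -- A gap wider than one ID yields only its highest missing value.
    MIN(b.familyid) - 1 AS missing
FROM (
    SELECT
        a.*,
        a.familyid - a.sequenceid AS displacement
    FROM (
        SELECT
            e.*,
            ROW_NUMBER() OVER(PARTITION BY e.employee_number ORDER BY e.familyid) - 1 AS sequenceid
        FROM employee_members e
    ) a
) b
WHERE b.displacement <> 0
GROUP BY
    b.employee_number,
    b.displacement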
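The displacement idea can be checked with a small sketch, again using SQLite through Python's sqlite3 module as a stand-in for SQL Server (SQLite 3.25 or later for window functions). Two things here are assumptions: the sample rows are reconstructed from the question, and this version reports MIN(familyid) - 1 for each displaced run, that is, the ID just below the first row after a gap. The helper name missing_by_displacement is hypothetical.

```python
import sqlite3

# Hypothetical sample data: (employee_number, familyid) pairs only.
ROWS = [(1021, f) for f in (0, 1, 3, 4, 6)] + \
       [(1027, f) for f in (0, 2, 3, 4, 5, 7)]

DISPLACEMENT_SQL = """
SELECT b.employee_number, MIN(b.familyid) - 1 AS missing
FROM (
    SELECT a.*, a.familyid - a.sequenceid AS displacement
    FROM (
        SELECT e.employee_number,
               e.familyid,
               ROW_NUMBER() OVER (PARTITION BY e.employee_number
                                  ORDER BY e.familyid) - 1 AS sequenceid
        FROM employee_members e
    ) a
) b
WHERE b.displacement <> 0
GROUP BY b.employee_number, b.displacement
ORDER BY b.employee_number, missing
"""


def missing_by_displacement(rows):
    """Return one (employee, missing id) pair per gap in the familyid sequence."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE employee_members (employee_number INTEGER, familyid INTEGER)"
    )
    conn.executemany("INSERT INTO employee_members VALUES (?, ?)", rows)
    result = conn.execute(DISPLACEMENT_SQL).fetchall()
    conn.close()
    return result


print(missing_by_displacement(ROWS))
```

On the sample rows this prints [(1021, 2), (1021, 5), (1027, 1), (1027, 6)]. The approach yields one value per gap, so a gap that spans several IDs (for example family IDs 0 and 3 only) reports just the highest missing one, 2, and not 1.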
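Here is one approach. Calculate the maximum family ID for each employee, then join that to a list of numbers up to the maximum family ID. The result has one row for each employee and each expected family ID.
Then do a left outer join from this back to the original data, matching on the employee and on familyid against the number. The rows without a match are the missing values: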
with nums as (
      -- recursive CTE that generates the numbers 1 to 20
      select 1 as n
      union all
      select n+1
      from nums
      where n < 20
     )
select en.employee, n.n as MissingFamilyId
from (select employee, min(familyid) as minfi, max(familyid) as maxfi
      from t
      group by employee
     ) en join
     nums n
     on n.n <= en.maxfi left outer join
     t
     on t.employee = en.employee and
        t.familyid = n.n
where t.employee is null;
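Note that this will not work when the last number in the sequence is missing, because the per-employee maximum is taken from the data itself. But this may be the best you can do with your data structure.
Also, the query above assumes there are at most 20 family members.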
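The numbers-list approach can be sketched end to end as follows, using SQLite through Python's sqlite3 module in place of SQL Server (SQLite spells the recursive CTE as WITH RECURSIVE). The table name employee_members, the column employee_number, the helper name missing_family_ids and the sample rows are assumptions reconstructed from the question. The cap of 20 matches the query above.

```python
import sqlite3

# Hypothetical sample data: (employee_number, familyid) pairs only.
ROWS = [(1021, f) for f in (0, 1, 3, 4, 6)] + \
       [(1027, f) for f in (0, 2, 3, 4, 5, 7)]

NUMBERS_JOIN_SQL = """
WITH RECURSIVE nums(n) AS (
    SELECT 1
    UNION ALL
    SELECT n + 1 FROM nums WHERE n < 20
)
SELECT en.employee_number, nums.n AS missing_familyid
FROM (
    SELECT employee_number, MAX(familyid) AS maxfi
    FROM employee_members
    GROUP BY employee_number
) en
JOIN nums ON nums.n <= en.maxfi
LEFT JOIN employee_members t
       ON t.employee_number = en.employee_number
      AND t.familyid = nums.n
WHERE t.employee_number IS NULL
ORDER BY en.employee_number, nums.n
"""


def missing_family_ids(rows):
    """Return every (employee, id) pair between 1 and the employee's max that is absent."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE employee_members (employee_number INTEGER, familyid INTEGER)"
    )
    conn.executemany("INSERT INTO employee_members VALUES (?, ?)", rows)
    result = conn.execute(NUMBERS_JOIN_SQL).fetchall()
    conn.close()
    return result


print(missing_family_ids(ROWS))
```

This prints [(1021, 2), (1021, 5), (1027, 1), (1027, 6)] for the sample rows. Because every expected number gets its own row before the left join, a gap spanning several IDs is reported in full: family IDs 0 and 3 alone give both 1 and 2.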
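This will work: select all the "Dependent" rows and left join each one to the previous row (familyid - 1) for the same employee. If that row does not exist, the row shows up in the result: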
SELECT 'Missing Prior', t1.*
FROM employee_members t1
LEFT JOIN employee_members t2 ON t1.employee_number = t2.employee_number
    AND (t1.familyid-1) = t2.familyid
WHERE t2.employee_number is null and t1.relation = 'Dependent'
Another version that shows you the missing number itself:
SELECT t1.employee_number, t1.familyid-1 as Missing_Member
FROM employee_members t1
LEFT JOIN employee_members t2 ON t1.employee_number = t2.employee_number
    AND (t1.familyid-1) = t2.familyid
WHERE t2.employee_number is null and t1.relation = 'Dependent'
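A small sketch of the self-join, run in SQLite through Python's sqlite3 module as a stand-in for SQL Server. The sample rows are an assumption reconstructed from the question. In particular, labeling familyid 0 as 'Self' and every other row as 'Dependent' is hypothetical, as is the helper name missing_prior.

```python
import sqlite3

# Hypothetical sample data: familyid 0 is assumed to be the employee ('Self').
FAMILY_IDS = {1021: (0, 1, 3, 4, 6), 1027: (0, 2, 3, 4, 5, 7)}
ROWS = [
    (emp, fid, "Self" if fid == 0 else "Dependent")
    for emp, fids in FAMILY_IDS.items()
    for fid in fids
]

MISSING_PRIOR_SQL = """
SELECT t1.employee_number, t1.familyid - 1 AS missing_member
FROM employee_members t1
LEFT JOIN employee_members t2
       ON t1.employee_number = t2.employee_number
      AND t1.familyid - 1 = t2.familyid
WHERE t2.employee_number IS NULL
  AND t1.relation = 'Dependent'
ORDER BY t1.employee_number, missing_member
"""


def missing_prior(rows):
    """Return (employee, id) pairs where a dependent's previous familyid is absent."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE employee_members "
        "(employee_number INTEGER, familyid INTEGER, relation TEXT)"
    )
    conn.executemany("INSERT INTO employee_members VALUES (?, ?, ?)", rows)
    result = conn.execute(MISSING_PRIOR_SQL).fetchall()
    conn.close()
    return result


print(missing_prior(ROWS))
```

On the sample rows this prints [(1021, 2), (1021, 5), (1027, 1), (1027, 6)]. The relation filter keeps the familyid 0 row from reporting -1 as missing. Since each row only looks one step back, a gap wider than one ID reports only the value directly below the next existing row.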
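Another solution: build a table containing every possible value in the sequence (you can use an identity column for this). Then left join the source table to it and keep the rows where the source table's column is null.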
DECLARE @Seq TABLE (id INT IDENTITY(1, 1))
DECLARE @iter INT = 1
WHILE @iter <= (
    SELECT MAX([your ID column])
    FROM [Offending Table]
)
BEGIN
    INSERT @Seq DEFAULT VALUES
    SET @iter = @iter + 1
END
SELECT s.id
FROM @Seq s
LEFT JOIN [Offending Table] ot ON s.id = ot.[your ID column]
WHERE ot.[your ID column] IS NULL
This select uses a recursive CTE to retrieve the list of missing "familyid" values for each employee.
Query:
WITH emp_grp (
    EmployeeID
    ,MaxFamilyID
    )
AS (
    SELECT e2.EmployeeID
        ,MAX(e2.FamilyID) MaxFamilyID
    FROM employee_number e2
    GROUP BY e2.EmployeeID
    )
    ,emp_mem
AS (
    SELECT EmployeeID
        ,0 AS FamilyID
        ,MaxFamilyID
    FROM emp_grp
    UNION ALL
    SELECT EmployeeID
        ,FamilyID + 1 AS FamilyID
        ,MaxFamilyID
    FROM emp_mem
    WHERE emp_mem.FamilyID < MaxFamilyID
    )
SELECT emp_mem.EmployeeID
    ,emp_mem.FamilyID
FROM emp_mem
LEFT JOIN employee_number emp_num ON emp_mem.EmployeeID = emp_num.EmployeeID
    AND emp_mem.FamilyID = emp_num.FamilyID
WHERE emp_num.EmployeeID IS NULL
ORDER BY emp_mem.EmployeeID
    ,emp_mem.FamilyID
OPTION ( MAXRECURSION 32767)
Output:
EmployeeID FamilyID
----------- -----------
1021 2
1021 5
1027 1
1027 6