Assuming (based on the fact that you mentioned SSIS) that you are on SQL Server, you can use OUTER APPLY to get the previous row:
DECLARE @T TABLE (SubID INT, Att1 CHAR(1), Att2 CHAR(2), ValidFrom DATETIME);
INSERT @T VALUES
(1, 'J', '', '20121201'),
(1, '', 'l', '20121202'),
(1, 'B', '', '20121203'),
(1, '', 'H', '20121204'),
(1, 'A', 'H', '20121205');
SELECT  T.SubID,
        Att1 = COALESCE(NULLIF(T.Att1, ''), prev.Att1, ''),
        Att2 = COALESCE(NULLIF(T.Att2, ''), prev.Att2, '')
FROM    @T T
        OUTER APPLY
        (   SELECT  TOP 1 Att1, Att2
            FROM    @T prev
            WHERE   prev.SubID = T.SubID
            AND     prev.ValidFrom < T.ValidFrom
            ORDER BY ValidFrom DESC
        ) prev
ORDER BY T.ValidFrom;
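(I had to add arbitrary values for ValidFrom to make sure the ORDER BY was correct.)
EDIT
The above won't work if you have multiple consecutive rows with blank values, for example: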
DECLARE @T TABLE (SubID INT, Att1 CHAR(1), Att2 CHAR(2), ValidFrom DATETIME);
INSERT @T VALUES
(1, 'J', '', '20121201'),
(1, '', 'l', '20121202'),
(1, 'B', '', '20121203'),
(1, '', 'H', '20121204'),
(1, '', 'J', '20121205'),
(1, 'A', 'H', '20121206');
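To see the failure concretely, the sketch below reruns the single OUTER APPLY query against this second data set. It only restates the query from earlier in this answer; nothing in it is new. The row dated 20121205 has a blank Att1, and so does the row immediately before it, so looking at just the previous row should leave Att1 blank instead of carrying 'B' forward.

```sql
-- Sketch: single OUTER APPLY against data with two consecutive blank Att1 rows.
DECLARE @T TABLE (SubID INT, Att1 CHAR(1), Att2 CHAR(2), ValidFrom DATETIME);
INSERT @T VALUES
    (1, 'J', '',  '20121201'),
    (1, '',  'l', '20121202'),
    (1, 'B', '',  '20121203'),
    (1, '',  'H', '20121204'),
    (1, '',  'J', '20121205'),  -- Att1 blank here and on the row before it
    (1, 'A', 'H', '20121206');

SELECT  T.SubID,
        T.ValidFrom,
        Att1 = COALESCE(NULLIF(T.Att1, ''), prev.Att1, ''),
        Att2 = COALESCE(NULLIF(T.Att2, ''), prev.Att2, '')
FROM    @T T
        OUTER APPLY
        (   -- Only the immediately preceding row is considered,
            -- even if its own attribute is blank.
            SELECT  TOP 1 Att1, Att2
            FROM    @T prev
            WHERE   prev.SubID = T.SubID
            AND     prev.ValidFrom < T.ValidFrom
            ORDER BY ValidFrom DESC
        ) prev
ORDER BY T.ValidFrom;
```

ValidFrom is included in the select list here purely so the affected row is easy to spot in the output.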
If consecutive blank rows can occur, you will need two OUTER APPLYs, one per attribute:
SELECT  T.SubID,
        Att1 = COALESCE(NULLIF(T.Att1, ''), prevAtt1.Att1, ''),
        Att2 = COALESCE(NULLIF(T.Att2, ''), prevAtt2.Att2, '')
FROM    @T T
        OUTER APPLY
        (   -- Most recent earlier row with a non-blank Att1
            SELECT  TOP 1 Att1
            FROM    @T prev
            WHERE   prev.SubID = T.SubID
            AND     prev.ValidFrom < T.ValidFrom
            AND     COALESCE(prev.Att1, '') != ''
            ORDER BY ValidFrom DESC
        ) prevAtt1
        OUTER APPLY
        (   -- Most recent earlier row with a non-blank Att2
            SELECT  TOP 1 Att2
            FROM    @T prev
            WHERE   prev.SubID = T.SubID
            AND     prev.ValidFrom < T.ValidFrom
            AND     COALESCE(prev.Att2, '') != ''
            ORDER BY ValidFrom DESC
        ) prevAtt2
ORDER BY T.ValidFrom;
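However, since each OUTER APPLY only returns one value, I would change this to correlated subqueries, because the query above evaluates prevAtt1.Att1 and prevAtt2.Att2 for every row, whether they are needed or not. If instead you change it to: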
SELECT  T.SubID,
        Att1 = COALESCE(
                    NULLIF(T.Att1, ''),
                    (   SELECT  TOP 1 Att1
                        FROM    @T prev
                        WHERE   prev.SubID = T.SubID
                        AND     prev.ValidFrom < T.ValidFrom
                        AND     COALESCE(prev.Att1, '') != ''
                        ORDER BY ValidFrom DESC
                    ), ''),
        Att2 = COALESCE(
                    NULLIF(T.Att2, ''),
                    (   SELECT  TOP 1 Att2
                        FROM    @T prev
                        WHERE   prev.SubID = T.SubID
                        AND     prev.ValidFrom < T.ValidFrom
                        AND     COALESCE(prev.Att2, '') != ''
                        ORDER BY ValidFrom DESC
                    ), '')
FROM    @T T
ORDER BY T.ValidFrom;
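the subqueries are only evaluated when required (i.e. when Att1 or Att2 is blank) rather than for every row. The execution plan does not show this; in fact the actual execution plan of the latter looks more intensive, although it almost certainly is not. But as always, the key is testing: run both on your own data, see which performs best, and check the IO statistics for reads and so on.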
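One way to run that comparison is with SET STATISTICS IO and SET STATISTICS TIME, which report logical reads and CPU/elapsed time per statement in the Messages tab. The following is a minimal sketch, not a benchmark: it uses the six sample rows from this answer (far too few for meaningful numbers) and only the Att1 column to keep it short. Substitute your real table for @T before drawing conclusions.

```sql
-- Sketch: compare reads and timings for the two approaches in one batch.
DECLARE @T TABLE (SubID INT, Att1 CHAR(1), Att2 CHAR(2), ValidFrom DATETIME);
INSERT @T VALUES
    (1, 'J', '',  '20121201'),
    (1, '',  'l', '20121202'),
    (1, 'B', '',  '20121203'),
    (1, '',  'H', '20121204'),
    (1, '',  'J', '20121205'),
    (1, 'A', 'H', '20121206');

SET STATISTICS IO ON;
SET STATISTICS TIME ON;

-- Approach 1: OUTER APPLY (Att1 only, for brevity)
SELECT  T.SubID,
        Att1 = COALESCE(NULLIF(T.Att1, ''), prevAtt1.Att1, '')
FROM    @T T
        OUTER APPLY
        (   SELECT  TOP 1 Att1
            FROM    @T prev
            WHERE   prev.SubID = T.SubID
            AND     prev.ValidFrom < T.ValidFrom
            AND     COALESCE(prev.Att1, '') != ''
            ORDER BY ValidFrom DESC
        ) prevAtt1
ORDER BY T.ValidFrom;

-- Approach 2: correlated subquery inside COALESCE (Att1 only, for brevity)
SELECT  T.SubID,
        Att1 = COALESCE(
                    NULLIF(T.Att1, ''),
                    (   SELECT  TOP 1 Att1
                        FROM    @T prev
                        WHERE   prev.SubID = T.SubID
                        AND     prev.ValidFrom < T.ValidFrom
                        AND     COALESCE(prev.Att1, '') != ''
                        ORDER BY ValidFrom DESC
                    ), '')
FROM    @T T
ORDER BY T.ValidFrom;

SET STATISTICS TIME OFF;
SET STATISTICS IO OFF;
```

Compare the logical reads and CPU time reported for each SELECT. Both statements must stay in the same batch as the DECLARE, because a table variable is scoped to the batch that declares it.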