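I am trying to write a query that merges duplicate client records into a single record. Right now I am just building a mapping table so I know which record should be mapped to which.
Here are two functions I wrote to help me: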
CREATE FUNCTION FindDuplicateClients ()
RETURNS TABLE AS RETURN
(
select distinct CLIENT_GUID
from CLIENTS c
inner join
(
select FIRST_NAME, LAST_NAME, HOME_PHONE
from CLIENTS
group by FIRST_NAME, LAST_NAME, HOME_PHONE
having COUNT(*) > 1
) t on c.FIRST_NAME = t.FIRST_NAME and c.LAST_NAME = t.LAST_NAME and c.HOME_PHONE = t.HOME_PHONE)
go
--Find other clients that map to this client
CREATE FUNCTION FindDuplicateClientsByClient (@Client uniqueidentifier)
RETURNS TABLE AS RETURN
(
select distinct CLIENT_GUID
from CLIENTS c
inner join
(
select x.FIRST_NAME, x.LAST_NAME, x.HOME_PHONE
from CLIENTS x
inner join
(
select FIRST_NAME, LAST_NAME, HOME_PHONE
from CLIENTS
where CLIENT_GUID = @Client
) y on x.FIRST_NAME = y.FIRST_NAME and x.LAST_NAME = y.LAST_NAME and x.HOME_PHONE = y.HOME_PHONE
group by x.FIRST_NAME, x.LAST_NAME, x.HOME_PHONE
having COUNT(*) > 1
) t on c.FIRST_NAME = t.FIRST_NAME and c.LAST_NAME = t.LAST_NAME and c.HOME_PHONE = t.HOME_PHONE
where CLIENT_GUID <> @Client)
go
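The first function successfully returns every CLIENT_GUID that has more than one record. For the second, you pass in a GUID and it returns all the other GUIDs that share the "common information" (in this case first name, last name, and home phone).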
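For illustration, the two functions might be called as sketched below. The GUID literal is a made-up placeholder rather than a real record, and the variable is declared and assigned in separate statements on the assumption that the script should also run on SQL Server 2005.

```sql
-- Every CLIENT_GUID that belongs to a group of duplicates
select CLIENT_GUID
from FindDuplicateClients()

-- Every other CLIENT_GUID that shares first name, last name, and home phone
-- with one given client (placeholder GUID, for illustration only)
declare @Client uniqueidentifier
set @Client = '00000000-0000-0000-0000-000000000000'

select CLIENT_GUID
from FindDuplicateClientsByClient(@Client)
```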
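The problem is populating my mapping table. I need to follow some rules that prioritize certain duplicates. For example, anyone who has transactions must not have their CLIENT_GUID changed, but other GUIDs can be merged into theirs (if those other GUIDs have no transactions).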
--Create Mapping table
select CLIENT_GUID, CAST(null as uniqueidentifier) as NEW_CLIENT_GUID
into #mapping
from FindDuplicateClients()
--Do not map people who have transactions
update #mapping
set NEW_CLIENT_GUID = CLIENT_GUID
where CLIENT_GUID in (select CLIENT_GUID from trnHistory)
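Now this is where I run into trouble. Without using a cursor, I can't figure out how to take the list of people whose NEW_CLIENT_GUID was set by the previous query, run FindDuplicateClientsByClient against each of those GUIDs, and set the NEW_CLIENT_GUID of every result to the GUID that was passed into the function.
Here is what I came up with using a cursor: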
declare cur cursor LOCAL FAST_FORWARD for select NEW_CLIENT_GUID from #mapping where NEW_CLIENT_GUID is not null
declare @NEW_CLIENT_GUID uniqueidentifier
open cur
fetch next from cur into @NEW_CLIENT_GUID
while @@fetch_status = 0
begin
update #mapping
set NEW_CLIENT_GUID = @NEW_CLIENT_GUID
where CLIENT_GUID in (select CLIENT_GUID from FindDuplicateClientsByClient(@NEW_CLIENT_GUID)) --Find duplicates to this record
and NEW_CLIENT_GUID is null --Do not reassign values that are already set (ie: duplicates that have transactions)
fetch next from cur into @NEW_CLIENT_GUID
end
close cur
deallocate cur
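Iterating over every #mapping row that has a value set does not seem right to me. What is the correct way to do this? I am using SQL Server 2008 R2, but I would like it to be compatible with SQL Server 2005 as well.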