
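I have 2 tables, 'tbMemberInfo' and 'tbDocument'.

When a document is uploaded, a field InfoID is recorded in tbDocument; it holds the PK (MemberInfoID) of tbMemberInfo.

However, there are duplicates in tbMemberInfo. Each user has an “AgreementNo” and an “IDNumber”, but each duplicate record contains only one or the other.

I need to merge these records so that both “AgreementNo” and “IDNumber” end up in one of the records, and then delete the other record.

Here are the table structures and the code I have tried so far...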

tbMemberInfo

MemberInfoID  AgreementNo  IDNumber       DOB                      Initials  FirstName  LastName  Language  Role    CountryID  Email
861616        124346665    NULL           1976-08-24 00:00:00.000  DV        DAMIAN     Example   English   Member  1          damian.example@mail.com
866185        NULL         7608241234123  1976-08-24 00:00:00.000  DV        DAMIAN     Example   English   Member  1          damian.example@mail.com

tbDocument

DocumentID  r_object_id       DocumentTypeID  UniqueDocumentNo  ContentLength  ContentType      FileName                                     CreatedUserID  CreatedDate              InfoID
293787      0900d431800bc987  13              PPS156329L        1753819        application/pdf  Example_DV_PROV_APP_2009110316140300[1].pdf  362            2010-01-13 16:21:46.250  861616
293794      0900d431800bc998  530             PPS156335O        66750          image/tiff       Example, DV DRS REPORT.tif                   362            2010-01-13 16:26:48.420  861616

SQL code

DECLARE
@MemberInfoID int
,@AgreementNo varchar(50)
,@IDNumber varchar
,@DOB datetime
,@TitleID int
,@FirstName varchar(150)
,@LastName varchar(150)
,@ModifiedDate datetime 

SELECT @AgreementNo = AgreementNo, @IDNumber = IDNumber, @FirstName = FirstName, @LastName = LastName, @DOB = DOB
FROM tbMemberInfo mi
INNER JOIN tbDocument d
ON mi.MemberInfoID = d.InfoID
WHERE (mi.AgreementNo = '') OR (mi.IDNumber = '') 

--SELECT @IDNumber = IDNumber  From tbMemberInfo mi
--INNER JOIN tbDocument d
--ON mi.MemberInfoID = d.InfoID
--WHERE (mi.AgreementNo = '') 

--SELECT @AgreementNo = AgreementNo From tbMemberInfo mi
--INNER JOIN tbDocument d
--ON mi.MemberInfoID = d.InfoID
--WHERE (mi.IDNumber  = '') AND (FirstName = @FirstName) AND (LastName = @LastName) AND (DOB = @DOB)

UPDATE tbMemberInfo
SET [IDNumber] = @IDNumber, [AgreementNo] = @AgreementNo, ModifiedDate  = GETDATE() 
FROM tbMemberInfo mi
    INNER JOIN tbDocument d
        ON mi.MemberInfoID = d.InfoID   
 WHERE (IDNumber = '') OR (AgreementNo = '') AND (FirstName = @FirstName)
        AND (LastName = @LastName) AND (DOB = @DOB)
 GROUP BY MemberInfoID

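None of this works. It puts a “7” into every IDNumber column. Any ideas on how to do this? I haven't got around to deleting the duplicates yet; I want to merge first.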


1 Answer


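You should use IDNumber IS NULL rather than = ''. The missing values in your sample data are NULL, and a comparison against NULL never evaluates to true, so = '' matches none of those rows.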
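To see the difference between the two tests, here is a minimal sketch that uses Python's built-in sqlite3 module as a stand-in for the real database. That choice is an assumption made only to keep the example self-contained; the table is cut down to three columns, and the helper `count_rows` is hypothetical. SQL Server treats the comparison the same way under its default ANSI_NULLS setting.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tbMemberInfo ("
    "MemberInfoID INTEGER PRIMARY KEY, AgreementNo TEXT, IDNumber TEXT)"
)
# The two duplicate rows from the question; None is stored as NULL.
conn.executemany(
    "INSERT INTO tbMemberInfo VALUES (?, ?, ?)",
    [(861616, "124346665", None), (866185, None, "7608241234123")],
)


def count_rows(where_clause):
    """Hypothetical helper: count the rows matching a WHERE clause."""
    sql = "SELECT COUNT(*) FROM tbMemberInfo WHERE " + where_clause
    return conn.execute(sql).fetchone()[0]


empty_string_matches = count_rows("IDNumber = ''")
is_null_matches = count_rows("IDNumber IS NULL")

print(empty_string_matches)  # 0: NULL = '' is never true
print(is_null_matches)       # 1: the row for member 861616
```

The same applies to AgreementNo: only IS NULL finds the row where the value is missing.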

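Your WHERE clause also has a problem: AND binds more tightly than OR, so without parentheses the name and date-of-birth conditions apply only to the AgreementNo test. You should write something like: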

WHERE (IDNumber is Null or AgreementNo is NULL) AND (FirstName = @FirstName AND LastName = @LastName AND DOB = @DOB)
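The effect of the parentheses can be checked with a small sketch, again using Python's sqlite3 as a stand-in. The table name `m`, the third row, and the values in it are hypothetical and exist only to show an unrelated member being caught by the unparenthesized form.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE m ("
    "MemberInfoID INTEGER PRIMARY KEY, AgreementNo TEXT, "
    "IDNumber TEXT, FirstName TEXT)"
)
conn.executemany(
    "INSERT INTO m VALUES (?, ?, ?, ?)",
    [
        (1, "A1", None, "DAMIAN"),
        (2, None, "ID2", "DAMIAN"),
        (3, "A3", None, "OTHER"),  # hypothetical unrelated member
    ],
)


def matching_ids(where_clause):
    """Return the IDs selected by a WHERE clause with one name parameter."""
    sql = "SELECT MemberInfoID FROM m WHERE " + where_clause + " ORDER BY MemberInfoID"
    return [row[0] for row in conn.execute(sql, ("DAMIAN",))]


# Parsed as: IDNumber IS NULL OR (AgreementNo IS NULL AND FirstName = ?)
without_parens = matching_ids(
    "IDNumber IS NULL OR AgreementNo IS NULL AND FirstName = ?"
)
with_parens = matching_ids(
    "(IDNumber IS NULL OR AgreementNo IS NULL) AND FirstName = ?"
)

print(without_parens)  # [1, 2, 3]: row 3 slips through
print(with_parens)     # [1, 2]
```

In an UPDATE, the unparenthesized form would therefore overwrite rows belonging to other members whenever their IDNumber is missing.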

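My suggestion is to create a tmp table with exactly the same schema as tbMemberInfo (this will help you organize the steps and simplify the task a little), group the entries the way you want, and merge them in the tmp table (if I understood correctly, you are using FirstName, LastName and DOB to identify distinct entries). Finally, truncate tbMemberInfo and fill it with the contents of the tmp table. Here are more details: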

INSERT INTO tmp
SELECT * FROM tbMemberInfo;

UPDATE tmp t1 INNER JOIN tbMemberInfo t2
        ON    t1.FirstName = t2.FirstName 
        AND   t1.LastName = t2.LastName 
        AND   t1.DOB = t2.DOB 
        SET   t1.AgreementNo = t2.AgreementNo 
        WHERE t1.AgreementNo IS NULL 
        AND   t1.MemberInfoID != t2.MemberInfoID;

UPDATE tmp t1 INNER JOIN tbMemberInfo t2
        ON    t1.FirstName = t2.FirstName 
        AND   t1.LastName = t2.LastName 
        AND   t1.DOB = t2.DOB 
        SET   t1.IDNumber = t2.IDNumber 
        WHERE t1.IDNumber is NULL 
        AND   t1.MemberInfoID != t2.MemberInfoID;

-- Just to make sure there are no entries left without a value
SELECT * FROM tmp WHERE IDNumber is NULL OR AgreementNo is NULL;

-- now we are going to keep the row with the lowest ID:
-- delete every row for which a matching row with a lower ID exists
DELETE t1 FROM tmp t1, tmp t2 
    WHERE t1.MemberInfoID > t2.MemberInfoID 
    AND   t1.FirstName = t2.FirstName 
    AND   t1.LastName = t2.LastName 
    AND   t1.DOB = t2.DOB;

-- update the InfoID in the tbDocument table so that documents attached
-- to a deleted duplicate point at the surviving row
UPDATE tbDocument t1 JOIN tbMemberInfo t2 
           ON    t1.InfoID = t2.MemberInfoID
           JOIN  tmp t3 
           ON    t2.FirstName = t3.FirstName 
           AND   t2.LastName = t3.LastName 
           AND   t2.DOB = t3.DOB
               SET   t1.InfoID = t3.MemberInfoID
               WHERE t1.InfoID NOT IN (SELECT MemberInfoID FROM tmp);

TRUNCATE tbMemberInfo;

INSERT INTO tbMemberInfo
SELECT * FROM tmp;

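Note: I haven't tested these queries; I wrote them quickly, so they may contain errors, but you will get the idea and can fix them yourself. Don't run them against the original data; make a copy and test on that first.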
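For anyone who wants to try the steps on a throwaway copy first, here is a minimal end-to-end sketch in Python's sqlite3. It rests on several assumptions: the tables are cut down to the columns the merge needs, document 300001 is a hypothetical row added so that the re-pointing step has something to do, and correlated subqueries replace the MySQL-style multi-table UPDATE and DELETE used above, because that form is not portable across databases. It is an illustration of the approach, not a drop-in script for SQL Server.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Cut-down versions of the two tables (assumption: only these columns matter).
cur.execute(
    "CREATE TABLE tbMemberInfo (MemberInfoID INTEGER PRIMARY KEY, "
    "AgreementNo TEXT, IDNumber TEXT, DOB TEXT, FirstName TEXT, LastName TEXT)"
)
cur.execute("CREATE TABLE tbDocument (DocumentID INTEGER PRIMARY KEY, InfoID INTEGER)")
cur.executemany("INSERT INTO tbMemberInfo VALUES (?, ?, ?, ?, ?, ?)", [
    (861616, "124346665", None, "1976-08-24", "DAMIAN", "Example"),
    (866185, None, "7608241234123", "1976-08-24", "DAMIAN", "Example"),
])
cur.executemany("INSERT INTO tbDocument VALUES (?, ?)", [
    (293787, 861616),
    (293794, 861616),
    (300001, 866185),  # hypothetical document attached to the duplicate row
])

# 1. Work on a copy of the member table.
cur.execute("CREATE TABLE tmp AS SELECT * FROM tbMemberInfo")

# 2. Fill each missing value from the rows with the same name and DOB.
#    MAX() skips NULLs, so it picks up the one non-NULL value in the group.
same_person = "m.FirstName = tmp.FirstName AND m.LastName = tmp.LastName AND m.DOB = tmp.DOB"
cur.execute(f"""
    UPDATE tmp SET
        AgreementNo = COALESCE(AgreementNo,
            (SELECT MAX(m.AgreementNo) FROM tbMemberInfo m WHERE {same_person})),
        IDNumber = COALESCE(IDNumber,
            (SELECT MAX(m.IDNumber) FROM tbMemberInfo m WHERE {same_person}))
""")

# 3. Keep only the row with the lowest ID for each person.
cur.execute("""
    DELETE FROM tmp WHERE MemberInfoID NOT IN
        (SELECT MIN(MemberInfoID) FROM tmp GROUP BY FirstName, LastName, DOB)
""")

# 4. Re-point documents that reference a deleted duplicate.
cur.execute("""
    UPDATE tbDocument SET InfoID = (
        SELECT t.MemberInfoID
        FROM tbMemberInfo m
        JOIN tmp t ON t.FirstName = m.FirstName
                  AND t.LastName = m.LastName
                  AND t.DOB = m.DOB
        WHERE m.MemberInfoID = tbDocument.InfoID)
    WHERE InfoID NOT IN (SELECT MemberInfoID FROM tmp)
""")

# 5. Replace the original contents with the merged rows.
cur.execute("DELETE FROM tbMemberInfo")
cur.execute("INSERT INTO tbMemberInfo SELECT * FROM tmp")
cur.execute("DROP TABLE tmp")
conn.commit()

members = cur.execute(
    "SELECT MemberInfoID, AgreementNo, IDNumber FROM tbMemberInfo"
).fetchall()
doc_info_ids = {row[0] for row in cur.execute("SELECT InfoID FROM tbDocument")}

print(members)       # [(861616, '124346665', '7608241234123')]
print(doc_info_ids)  # {861616}
```

Separately from the merge logic: in the question's script @IDNumber is declared as plain varchar with no length. SQL Server treats a variable declared that way as varchar(1), which would explain why only “7” ends up in the IDNumber column.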

answered 2012-08-21T17:59:20.977