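I have a table named customers that gets its data from a form. Not all fields are required (the form uses the entered information to generate asf / xml). I would like to merge the duplicates into a single row and then delete the duplicates.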

Here is my table:

CID | LastName | FirstName | Street | City | ZipCode | HomePhone | CellPhone | EmailAddr
 1     Test       NULL         NULL   NULL   NULL       NULL        NULL         NULL
 2     NULL       TEST         NULL   NULL   NULL       NULL        NULL         NULL
 3     NULL       NULL         Test   NULL   NULL       NULL        NULL         NULL
 4     NULL       NULL         NULL   Test   NULL       NULL        NULL         NULL
 5     NULL       NULL         NULL   NULL   Test       NULL        NULL         NULL
 6     NULL       NULL         NULL   NULL   NULL       Test        NULL         NULL
 7     NULL       NULL         NULL   NULL   NULL       NULL        TEST         NULL
 8     NULL       NULL         NULL   NULL   NULL       NULL        NULL         TEST

I want to merge the data from every non-null field into the first instance, update that record, and then delete the remaining 7 records.

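I am still just starting out with SQL, but I understand joins, inserts, updates, deletes, and so on. Any suggestions or guidance would be greatly appreciated. I have found multiple posts that show how to combine this data in a report, but not many that actually merge the data and delete the duplicate rows.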

I just found this post while searching, so it may be what I am looking for: mysql-consolidate-duplicate-data-records-via-update-delete

2 Answers

Try this:

SET NOCOUNT ON;

DECLARE @temp TABLE
(
      CID INT PRIMARY KEY
    , LastName NVARCHAR(10)
    , FirstName NVARCHAR(10)
    , Street NVARCHAR(10)
    , City NVARCHAR(10)
    , ZipCode NVARCHAR(10)
    , HomePhone NVARCHAR(10)
    , CellPhone NVARCHAR(10)
    , EmailAddr NVARCHAR(10)
)

INSERT INTO @temp (CID, LastName, FirstName, Street, City, ZipCode, HomePhone, CellPhone, EmailAddr)
VALUES 
    (1,  'Test', NULL,   NULL,   NULL,   NULL,   NULL, NULL, NULL),
    (2,  NULL,   'TEST', NULL,   NULL,   NULL,   NULL, NULL, NULL),
    (3,  NULL,   NULL,   'Test', NULL,   NULL,   NULL, NULL, NULL),
    (4,  NULL,   NULL,   NULL,   'Test', NULL,   NULL, NULL, NULL),
    (5,  NULL,   NULL,   NULL,   NULL,   'Test', NULL, NULL, NULL),
    (6,  NULL,   NULL,   NULL,   NULL,   NULL,   'Test', NULL, NULL),
    (7,  NULL,   NULL,   NULL,   NULL,   NULL,   NULL, 'TEST', NULL),
    (8,  NULL,   NULL,   NULL,   NULL,   NULL,   NULL, NULL, 'TEST'),
    (12, 'Tes2',  NULL,  NULL,   NULL,   NULL,   NULL, NULL, NULL),
    (14, NULL,   'TES2', NULL,   NULL,   NULL,   NULL, NULL, NULL),
    (17, NULL,   NULL,   'Tes2', NULL,   NULL,   NULL, NULL, NULL),
    (18, 'Tes3', NULL,   NULL,   NULL,   NULL,   NULL, NULL, NULL),
    (19, NULL,   'TES3', NULL,   NULL,   NULL,   NULL, NULL, NULL),
    (20, NULL,   NULL,   'Tes3', NULL,   NULL,   NULL, NULL, NULL),
    (21, NULL,   NULL,   NULL,   'Test3', NULL,   NULL, NULL, NULL)

DECLARE @buffer_temp TABLE
(
      CID INT PRIMARY KEY
    , LastName NVARCHAR(50)
    , FirstName NVARCHAR(50)
    , Street NVARCHAR(50)
    , City NVARCHAR(50)
    , ZipCode NVARCHAR(50)
    , HomePhone NVARCHAR(50)
    , CellPhone NVARCHAR(50)
    , EmailAddr NVARCHAR(50)
)

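-- Each row with a LastName starts a group; NextCID is the last CID that can belong to it.
;WITH cte AS 
(
    SELECT t.CID, NextCID = ISNULL(t2.CID, (SELECT MAX(y.CID) FROM @temp y))  
    FROM @temp t
    OUTER APPLY (
        -- The next row with a LastName starts the following group, so stop one CID before it.
        SELECT TOP 1 CID = t1.CID - 1
        FROM @temp t1
        WHERE t1.CID > t.CID
            AND t1.LastName IS NOT NULL
        ORDER BY t1.CID -- without an ORDER BY, TOP 1 is not guaranteed to pick the nearest group start
    ) t2
    WHERE t.LastName IS NOT NULL
)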
INSERT INTO @buffer_temp
SELECT 
      t2.CID
    , LastName = MAX(LastName) 
    , FirstName = MAX(FirstName)
    , Street = MAX(Street)
    , City = MAX(City)
    , ZipCode = MAX(ZipCode)
    , HomePhone = MAX(HomePhone)
    , CellPhone = MAX(CellPhone)
    , EmailAddr = MAX(EmailAddr) 
FROM @temp t
CROSS APPLY (
    SELECT *
    FROM cte t2
    WHERE t.CID BETWEEN t2.CID AND t2.NextCID
) t2
GROUP BY t2.CID

DELETE FROM @temp

INSERT INTO @temp
SELECT * 
FROM @buffer_temp

SELECT * 
FROM @temp

Output:

CID         LastName   FirstName  Street     City       ZipCode    HomePhone  CellPhone  EmailAddr
----------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ----------
1           Test       TEST       Test       Test       Test       Test       TEST       TEST
12          Tes2       TES2       Tes2       NULL       NULL       NULL       NULL       NULL
18          Tes3       TES3       Tes3       Test3      NULL       NULL       NULL       NULL
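
If emptying and refilling the table is not an option (for example because other tables reference CID), the same grouping idea can be applied in place. The following is only a sketch under assumptions, not part of the answer above: @customers and @groups are hypothetical table variables standing in for the real table, only three data columns are shown, and it assumes, as the query above does, that every group starts at a row with a non-NULL LastName.

```sql
-- Hypothetical sample data; @customers stands in for the real customers table.
DECLARE @customers TABLE
(
      CID INT PRIMARY KEY
    , LastName NVARCHAR(50)
    , FirstName NVARCHAR(50)
    , Street NVARCHAR(50)
);

INSERT INTO @customers (CID, LastName, FirstName, Street)
VALUES
    (1,  'Test', NULL,   NULL),
    (2,  NULL,   'TEST', NULL),
    (3,  NULL,   NULL,   'Test'),
    (12, 'Tes2', NULL,   NULL),
    (14, NULL,   'TES2', NULL);

-- Step 1: map every row to the CID of the nearest earlier-or-same row that has a LastName.
DECLARE @groups TABLE (CID INT PRIMARY KEY, KeepCID INT NOT NULL);

INSERT INTO @groups (CID, KeepCID)
SELECT c.CID, k.KeepCID
FROM @customers c
CROSS APPLY (
    SELECT KeepCID = MAX(c1.CID)
    FROM @customers c1
    WHERE c1.CID <= c.CID
        AND c1.LastName IS NOT NULL
) k
WHERE k.KeepCID IS NOT NULL; -- rows before the first LastName row are left alone

-- Step 2: copy the merged values into the row that is kept.
UPDATE c
SET   FirstName = m.FirstName
    , Street    = m.Street
FROM @customers c
JOIN (
    SELECT g.KeepCID
         , FirstName = MAX(x.FirstName) -- MAX ignores NULLs
         , Street    = MAX(x.Street)
    FROM @groups g
    JOIN @customers x ON x.CID = g.CID
    GROUP BY g.KeepCID
) m ON m.KeepCID = c.CID;

-- Step 3: delete every row that is not the kept row of its group.
DELETE c
FROM @customers c
JOIN @groups g ON g.CID = c.CID
WHERE g.CID <> g.KeepCID;

SELECT * FROM @customers ORDER BY CID;
-- Rows left: (1, Test, TEST, Test) and (12, Tes2, TES2, NULL)
```

Against a real table it would be sensible to run steps 2 and 3 inside one transaction and to check the result of the final SELECT before committing.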
Answered 2013-05-08T07:50:01.140

It looks like you want to merge records 1-8, then 9-16, then 17-24, and so on.

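Fortunately, you have a CID field that can be used to identify the groups. All you need is the group number, and the formula (CID - 1) / 8 does the trick (SQL Server performs integer division when both operands are integers, so, for example, 4 / 8 = 0 rather than 0.5). Here is the query: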

select (CID - 1) / 8 as NewCID,
       max(LastName) as LastName, max(FirstName) as FirstName, . . . 
from t
group by (CID - 1) / 8;
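
To see how the formula buckets the rows, here is a small self-contained sketch. The table variable @t and its rows are made up for illustration, and only two data columns are shown rather than the full column list. It assumes CID values are contiguous and that every group occupies exactly 8 consecutive CIDs; if there are gaps, this fixed-size grouping will not line up with the real groups.

```sql
-- Hypothetical sample data: two groups of 8 consecutive CIDs, most rows omitted.
DECLARE @t TABLE
(
      CID INT PRIMARY KEY
    , LastName NVARCHAR(50)
    , FirstName NVARCHAR(50)
);

INSERT INTO @t (CID, LastName, FirstName)
VALUES
    (1,  'Test', NULL),
    (2,  NULL,   'TEST'),
    (8,  NULL,   NULL),
    (9,  'Tes2', NULL),
    (10, NULL,   'TES2'),
    (16, NULL,   NULL);

-- Integer division maps CIDs 1-8 to group 0 and CIDs 9-16 to group 1.
SELECT CID, GroupNo = (CID - 1) / 8
FROM @t
ORDER BY CID;
-- (1, 0), (2, 0), (8, 0), (9, 1), (10, 1), (16, 1)

-- MIN(CID) keeps the first CID of each group instead of the bare group number.
SELECT KeepCID = MIN(CID)
     , LastName = MAX(LastName)
     , FirstName = MAX(FirstName)
FROM @t
GROUP BY (CID - 1) / 8
ORDER BY KeepCID;
-- (1, Test, TEST) and (9, Tes2, TES2)
```

Using MIN(CID) rather than the group number means the merged row keeps an identifier that already exists in the table, which matters if the result is used to update the first record of each group.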
Answered 2013-05-08T00:21:22.000