
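In MS SQL Server, I'm trying to remove duplicates from a table that contains NULLs. Cue groan. Lots and lots of NULLs. The bottom line is that I need to keep one copy of every duplicated record, whether or not it contains NULLs. Essentially, I want NULL to behave like an ordinary value of "NULL" for the duration of the operation and then go back to being a true NULL. Is this possible? Is there a simpler solution?

Table1 looks like this: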

UID        Data1    Data2   
1           A        NULL        
2           A        NULL       
3           B        abc     
4           B        abc       
5           C        NULL      
6           D        ghj

I want the command to discard rows 2 and 4 and keep the rest. (The SELECT is for testing.)

;SELECT UID, Data1, Data2
 FROM Table1 AS T
 WHERE NOT EXISTS (
    SELECT 1
    FROM table1 AS T2
    WHERE 
      T2.Data1 = T.Data1
      AND T2.Data2 = T.Data2
      AND T2.UID >= T.UID
      )
    AND Data1 IS NOT NULL

Note: SELECT DISTINCT won't work because the duplicates have different timestamps.

4 Answers


This should do it:

;WITH CTE AS
(
    SELECT  *,
            RN = ROW_NUMBER() OVER(PARTITION BY Data1,Data2 ORDER BY UID)
    FROM table1
)
DELETE
--SELECT *
FROM CTE
WHERE RN > 1
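
A note on why the CTE above handles the NULL rows: PARTITION BY puts NULLs into the same partition, whereas the `=` comparisons in the question's NOT EXISTS never evaluate to true when either side is NULL. If the NOT EXISTS form is preferred, one possible sketch is to compare the columns with INTERSECT, which treats NULLs as equal. The table variable `@Table1` below is only a stand-in for the question's sample data, not the real table, and `T2.UID < T.UID` is used so that a row does not match itself.

```sql
-- Sample data from the question, held in a table variable so the sketch runs on its own.
DECLARE @Table1 TABLE (UID int PRIMARY KEY, Data1 char(1), Data2 char(3));

INSERT INTO @Table1 (UID, Data1, Data2)
VALUES (1,'A',NULL),(2,'A',NULL),(3,'B','abc'),(4,'B','abc'),(5,'C',NULL),(6,'D','ghj');

-- Keep a row only if no earlier row (lower UID) has the same Data1/Data2 pair.
-- INTERSECT compares NULLs as equal, unlike the = operator.
SELECT T.UID, T.Data1, T.Data2
FROM @Table1 AS T
WHERE NOT EXISTS (
    SELECT 1
    FROM @Table1 AS T2
    WHERE T2.UID < T.UID
      AND EXISTS (SELECT T2.Data1, T2.Data2
                  INTERSECT
                  SELECT T.Data1, T.Data2)
)
ORDER BY T.UID;
-- Returns UIDs 1, 3, 5 and 6.
```

This leaves the stored NULLs untouched, so nothing has to be converted to a placeholder value and back.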

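Update following the comments

OK, if you have trouble deleting that many rows at once, you can try creating a lookup table with the Ids to delete and then deleting in batches (you'll have to test the batch size, though). Here's an idea (assuming UID is a PK):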

;WITH CTE AS
(
    SELECT  *,
            RN = ROW_NUMBER() OVER(PARTITION BY Data1,Data2 ORDER BY UID)
    FROM table1
)
SELECT [UID]
INTO RowsToDelete
FROM CTE
WHERE RN > 1;

CREATE INDEX I_UID ON RowsToDelete([UID]);

WHILE 1=1
BEGIN
    -- With a join, T-SQL needs the target alias between DELETE TOP (...) and FROM
    DELETE TOP (10000) T
    FROM table1 AS T
    INNER JOIN RowsToDelete AS L
          ON T.[UID] = L.[UID];
    IF @@ROWCOUNT < 10000 BREAK;
END
answered 2013-05-14T15:03:57.167

Isn't SELECT DISTINCT Data1, Data2 FROM Table1 enough?
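
As the question notes, DISTINCT over Data1 and Data2 alone does collapse the duplicates (NULLs included), but it drops UID and any other column that differs between copies. A GROUP BY variant, sketched below, keeps one UID per group; like DISTINCT, GROUP BY places NULLs in a single group. The table variable `@data` and the alias `KeptUID` are illustrative names only.

```sql
-- Sample data from the question, held in a table variable so the sketch runs on its own.
DECLARE @data TABLE (UID int, Data1 char(1), Data2 char(3));

INSERT INTO @data (UID, Data1, Data2)
VALUES (1,'A',NULL),(2,'A',NULL),(3,'B','abc'),(4,'B','abc'),(5,'C',NULL),(6,'D','ghj');

-- One row per distinct (Data1, Data2) pair, carrying the lowest UID of the group.
SELECT MIN(UID) AS KeptUID, Data1, Data2
FROM @data
GROUP BY Data1, Data2
ORDER BY KeptUID;
-- Returns KeptUID 1, 3, 5 and 6.
```

This only selects the survivors; turning it into a delete would still require removing every UID that is not in the result.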

answered 2013-05-14T15:00:51.443

Try this:

;WITH uTable AS (
    SELECT UID, Data1, Data2,
           ROW_NUMBER() OVER (PARTITION BY Data1, Data2 ORDER BY UID DESC) AS rownum
    FROM Table1 AS T
)
SELECT UID, Data1, Data2
FROM uTable
WHERE rownum = 1
answered 2013-05-14T15:08:01.367

My solution:

declare @data TABLE (UID int, Data1 char(1), Data2 Char(3))

-- Your example data
INSERT INTO @data (UID, Data1, Data2)
VALUES (1,'A',NULL),(2,'A',NULL),(3,'B','abc'),(4,'B','abc'),(5,'C',NULL),(6,'D','ghj')

DELETE FROM @data WHERE UID in (
  SELECT UID FROM (
    SELECT UID, ROW_NUMBER() OVER(PARTITION BY Data1,Data2 ORDER BY UID) as RowNo FROM @data
  ) d WHERE d.rowNo>1
)

SELECT UID, Data1, Data2 FROM @data          
answered 2013-05-14T15:18:56.530