10

My table has duplicate rows. How can I delete them based on the value of a single column?

For example:

uniqueid, col2, col3 ...
1, john, simpson
2, sally, roberts
1, johnny, simpson

Delete any rows with a duplicate uniqueid, to get:

1, John, Simpson
2, Sally, Roberts

6 Answers

35

You can DELETE from a CTE:

-- MyTable is a placeholder name; TABLE is a reserved word and cannot be used unquoted
WITH cte AS (SELECT *, ROW_NUMBER() OVER (PARTITION BY uniqueid ORDER BY col2) AS RowRank
             FROM MyTable)
DELETE FROM cte
WHERE RowRank > 1;

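The ROW_NUMBER() function assigns a number to each row. PARTITION BY restarts the numbering for each group; in this case, the rows for each uniqueid value are numbered starting at 1 and counting up from there. ORDER BY determines the order in which the numbers are assigned within a group. Since every uniqueid starts at 1, any record with a ROW_NUMBER() greater than 1 has a duplicate uniqueid.

To see how ROW_NUMBER() works, just try it: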

-- MyTable is a placeholder name; TABLE is a reserved word and cannot be used unquoted
SELECT *, ROW_NUMBER() OVER (PARTITION BY uniqueid ORDER BY col2) AS RowRank
FROM MyTable
ORDER BY uniqueid;
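If you don't have a SQL Server instance handy, here is a rough sketch of the same numbering using Python's built-in sqlite3 module. It assumes a bundled SQLite of 3.25 or newer (needed for window functions), and the table name people is made up for the illustration. It only shows the numbering; SQLite does not support deleting through a CTE the way SQL Server does.

```python
import sqlite3

# Hypothetical sample data mirroring the question; "people" is an invented table name.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (uniqueid INTEGER, col2 TEXT, col3 TEXT)")
conn.executemany(
    "INSERT INTO people VALUES (?, ?, ?)",
    [(1, "john", "simpson"), (2, "sally", "roberts"), (1, "johnny", "simpson")],
)

# Same window expression as in the answer: numbering restarts at 1 for each uniqueid.
ranked = conn.execute(
    """
    SELECT uniqueid, col2, col3,
           ROW_NUMBER() OVER (PARTITION BY uniqueid ORDER BY col2) AS RowRank
    FROM people
    ORDER BY uniqueid, RowRank
    """
).fetchall()

for row in ranked:
    print(row)
# (1, 'john', 'simpson', 1)
# (1, 'johnny', 'simpson', 2)   <- RowRank > 1, the row a "RowRank > 1" delete would target
# (2, 'sally', 'roberts', 1)
```

Changing the ORDER BY inside OVER (...) changes which row in each group gets number 1, and so which row survives.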

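You can adjust the logic of the ROW_NUMBER() function to control which records are kept and which are deleted.

For example, maybe you want to do this in several steps, first deleting records that have the same last name but different first names. In that case you can add the last name to the PARTITION BY: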

-- MyTable is a placeholder name; TABLE is a reserved word and cannot be used unquoted
WITH cte AS (SELECT *, ROW_NUMBER() OVER (PARTITION BY uniqueid, col3 ORDER BY col2) AS RowRank
             FROM MyTable)
DELETE FROM cte
WHERE RowRank > 1;
answered 2013-08-15T15:40:19.350
3

You may have a row ID that the database assigns on insert and that really is unique. In my example I'll call it rowId.

rowId | uniqueid | col2   | col3
----- | -------- | ------ | -------
1     | 10       | john   | simpson
2     | 20       | sally  | roberts
3     | 10       | johnny | simpson

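You can remove duplicates by grouping on whatever should be unique (one column or several), taking one rowId from each group, and deleting everything except those rowIds. The inner query returns one rowId per group, so it covers every row in the table except the duplicates.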

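-- Run as a SELECT first to preview the rows that would be removed,
-- then swap the SELECT * line for DELETE.
SELECT *
--DELETE
FROM MyTable
WHERE rowId NOT IN
(SELECT MIN(rowId)
 FROM MyTable
 GROUP BY uniqueid);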

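You can also use MAX instead of MIN to get a similar result; it keeps the row with the highest rowId in each group rather than the lowest.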
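To see the MIN versus MAX difference without touching real data, here is a minimal sketch using Python's built-in sqlite3 module. It is SQLite rather than SQL Server, and the dedupe helper and ROWS list are invented for the illustration, but the DELETE statement has the same shape as the query above.

```python
import sqlite3

# Sample rows mirroring the example table: (rowId, uniqueid, col2, col3)
ROWS = [
    (1, 10, "john", "simpson"),
    (2, 20, "sally", "roberts"),
    (3, 10, "johnny", "simpson"),
]

def dedupe(keep):
    """Return the rows left after keeping the MIN or MAX rowId per uniqueid."""
    if keep not in ("MIN", "MAX"):
        raise ValueError("keep must be 'MIN' or 'MAX'")
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE MyTable (rowId INTEGER PRIMARY KEY, uniqueid INTEGER, col2 TEXT, col3 TEXT)"
    )
    conn.executemany("INSERT INTO MyTable VALUES (?, ?, ?, ?)", ROWS)
    # keep is validated above, so interpolating it into the statement is safe here.
    conn.execute(
        f"DELETE FROM MyTable WHERE rowId NOT IN "
        f"(SELECT {keep}(rowId) FROM MyTable GROUP BY uniqueid)"
    )
    remaining = conn.execute(
        "SELECT rowId, uniqueid, col2, col3 FROM MyTable ORDER BY rowId"
    ).fetchall()
    conn.close()
    return remaining

print(dedupe("MIN"))  # [(1, 10, 'john', 'simpson'), (2, 20, 'sally', 'roberts')]
print(dedupe("MAX"))  # [(2, 20, 'sally', 'roberts'), (3, 10, 'johnny', 'simpson')]
```

With MIN the earlier row (john) survives for uniqueid 10; with MAX the later row (johnny) does.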

answered 2014-11-26T06:07:13.573
2
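-- Table variable with sample data, including one duplicated id
DECLARE @du TABLE (
    id INT,
    Name VARCHAR(4)
);

INSERT INTO @du VALUES (1, 'john');
INSERT INTO @du VALUES (2, 'jane');
INSERT INTO @du VALUES (1, 'john');

-- Number the rows within each id, then delete every row after the first
;WITH dup (id, dp)
AS
(SELECT id
, ROW_NUMBER() OVER (PARTITION BY id ORDER BY Name) AS dp
FROM @du)
DELETE FROM dup
WHERE dp > 1;

-- Only one row per id remains
SELECT *
FROM @du;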
answered 2013-08-15T15:48:01.597
2

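Here is a simple trick for removing duplicates. Note that UNION only collapses rows that are identical in every column, and the de-duplicated result lands in NewTable rather than being deleted from ExistingTable: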

SELECT * INTO NewTable FROM ExistingTable
UNION
SELECT * FROM ExistingTable;
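A quick way to check what the UNION trick does and does not remove is the sketch below, again using Python's built-in sqlite3 module. It is an approximation: SQLite has no SELECT ... INTO, so CREATE TABLE ... AS SELECT stands in for it, and the sample rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ExistingTable (uniqueid INTEGER, col2 TEXT, col3 TEXT)")
conn.executemany(
    "INSERT INTO ExistingTable VALUES (?, ?, ?)",
    [
        (1, "john", "simpson"),
        (1, "john", "simpson"),    # exact duplicate of the row above
        (1, "johnny", "simpson"),  # same uniqueid, different col2
        (2, "sally", "roberts"),
    ],
)

# SQLite stand-in for "SELECT * INTO NewTable ... UNION ...".
conn.execute(
    """
    CREATE TABLE NewTable AS
    SELECT * FROM ExistingTable
    UNION
    SELECT * FROM ExistingTable
    """
)

new_rows = conn.execute("SELECT * FROM NewTable ORDER BY uniqueid, col2").fetchall()
print(new_rows)
# [(1, 'john', 'simpson'), (1, 'johnny', 'simpson'), (2, 'sally', 'roberts')]
```

The exact duplicate is gone, but uniqueid 1 still appears twice, so this approach would not by itself handle the john/johnny case from the question.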
answered 2014-08-17T14:07:47.193
1

DELETE FROM table WHERE uniqueid='1' AND col2='john', or change col2='john' to col2='johnny', depending on which record you want to delete.

How did you end up with two identical "unique" IDs in the first place?

answered 2013-08-15T15:40:22.460
1

There are many ways to delete duplicate records. Some of them are shown below.

Different ways to delete duplicate records

Using the Row_Number() function and a CTE

WITH CTE (DuplicateCount) AS (
    SELECT ROW_NUMBER() OVER (PARTITION BY UniqueId ORDER BY UniqueId) AS DuplicateCount
    FROM Table1
)
DELETE FROM CTE WHERE DuplicateCount > 1;

Without using a CTE

DELETE DuplicateCount
FROM (
    SELECT ROW_NUMBER() OVER (PARTITION BY UniqueId ORDER BY UniqueId) AS Dup
    FROM Table1
) DuplicateCount
WHERE DuplicateCount.Dup > 1;

Without using Row_Number() or a CTE

DELETE FROM Subject
WHERE RowId NOT IN (SELECT MIN(RowId) FROM Subject GROUP BY UniqueId);
answered 2015-12-24T18:25:32.353