
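I'm using SQL Server Management Studio 2012. I have a table with several thousand rows. Many of the rows are duplicates, and I need to delete them. Each row has a unique identifier, [OwnerID], which is set as an Identity Specification with an Identity Increment of 1. The duplicated values are in the following columns: [FirstName], [LastName], [CompanyName].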

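So I need to delete the rows that have a duplicate combination of values in those 3 columns. After the delete, can I write T-SQL to reset the identity specification on [OwnerID] so that it starts at 1 for the first row and assigns values to the remaining rows in increments of 1?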

Thanks for any help.


3 Answers


Here is one way to delete the rows, using row_number() to keep the first one in each group:

with todelete as (
    select t.*,
           row_number() over (partition by firstname, lastname, companyname
                              order by ownerid) as seqnum
    from t
)
delete from todelete
    where seqnum > 1;  -- seqnum = 1 is the row that is kept in each group
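
Before running a delete like this, it can help to look at the rows it would remove. The following is a minimal sketch that reuses the same CTE with a SELECT in place of the DELETE; it assumes the table name t and the column names used in this answer, and it treats every row numbered above 1 within a group as a duplicate.

```sql
-- Preview only: nothing is deleted.
-- Assumes table t with columns ownerid, firstname, lastname, companyname.
WITH todelete AS (
    SELECT t.*,
           ROW_NUMBER() OVER (PARTITION BY firstname, lastname, companyname
                              ORDER BY ownerid) AS seqnum
    FROM t
)
SELECT *
FROM todelete
WHERE seqnum > 1          -- rows beyond the first in each group
ORDER BY firstname, lastname, companyname, ownerid;
```

If the result set looks right, the same CTE can drive the delete.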

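To renumber ownerid, a similar idea can be used. Note that SQL Server rejects an UPDATE that writes to an identity column, so this runs as written only if ownerid is not (or is no longer) defined as an identity column: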

with toupdate as (
    select t.*, row_number() over (order by ownerid) as seqnum
    from t
)
update toupdate
    set ownerid = seqnum;
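
Since SQL Server does not allow an UPDATE to write to an identity column, one possible alternative is to copy the rows into a new table whose identity starts at 1. This is only a sketch under stated assumptions: dbo.t_new is a hypothetical scratch table, the column types are guesses, and nothing else references the old ownerid values.

```sql
-- Hypothetical names and types; adjust them to the real table definition.
CREATE TABLE dbo.t_new (
    ownerid     INT IDENTITY(1, 1) NOT NULL PRIMARY KEY,
    firstname   NVARCHAR(100) NULL,
    lastname    NVARCHAR(100) NULL,
    companyname NVARCHAR(200) NULL
);

-- With INSERT ... SELECT ... ORDER BY, identity values are assigned
-- following the ORDER BY, so the old ordering of the rows is preserved.
INSERT INTO dbo.t_new (firstname, lastname, companyname)
SELECT firstname, lastname, companyname
FROM dbo.t
ORDER BY ownerid;

-- Only after verifying the copy, swap the tables:
-- DROP TABLE dbo.t;
-- EXEC sp_rename 'dbo.t_new', 't';
```

By contrast, DBCC CHECKIDENT ('dbo.t', RESEED, n) changes only the next identity value that will be handed out; it does not renumber rows that already exist.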

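However, you should be very careful about this. In a well-designed database, a field named OwnerID would refer to a column named Owner.OwnerId or Owner.Id. Changing the id values could affect other tables.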
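
One way to gauge that risk before changing any ids is to list the declared foreign keys that point at the table. This is a minimal sketch that assumes the table is dbo.t (a placeholder name); it will not find references that are not declared as foreign key constraints.

```sql
-- Lists foreign keys whose referenced (parent) table is dbo.t.
-- dbo.t is a placeholder; substitute the real schema and table name.
SELECT fk.name                                  AS foreign_key_name,
       OBJECT_SCHEMA_NAME(fk.parent_object_id)  AS referencing_schema,
       OBJECT_NAME(fk.parent_object_id)         AS referencing_table
FROM sys.foreign_keys AS fk
WHERE fk.referenced_object_id = OBJECT_ID(N'dbo.t');
```

An empty result means no declared constraint depends on the ids, though application code or undeclared relationships may still rely on them.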

answered 2013-01-21T03:37:32.380

Here is my attempt using temporary tables (SQL Fiddle):

-- One row per name combination, with its row count and the lowest ID to keep
SELECT FirstName, LastName, CompanyName, COUNT(*) AS thecount, MIN(ID) AS min_id
INTO #temp
FROM tab
GROUP BY FirstName, LastName, CompanyName;

-- Map every row in a duplicated group to the ID that survives
SELECT a.ID, b.min_id
INTO #temp1
FROM tab AS a
JOIN #temp AS b
  ON  a.FirstName   = b.FirstName
  AND a.LastName    = b.LastName
  AND a.CompanyName = b.CompanyName
WHERE b.thecount > 1;

-- run this query on all referenced tables:
UPDATE t2
SET ID = t.min_id
FROM tab2 AS t2
JOIN #temp1 AS t ON t2.ID = t.ID;

-- Delete every duplicate except the row that holds min_id
DELETE t
FROM tab AS t
JOIN #temp1 AS a ON t.ID = a.ID
WHERE a.ID <> a.min_id;
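
After steps like these, a quick sanity check is to confirm that no name combination still appears more than once. A minimal sketch, assuming the same tab table and column names as above:

```sql
-- Should return no rows once the duplicates are gone.
SELECT FirstName, LastName, CompanyName, COUNT(*) AS remaining
FROM tab
GROUP BY FirstName, LastName, CompanyName
HAVING COUNT(*) > 1;

-- Clean up the temporary tables when finished.
DROP TABLE #temp;
DROP TABLE #temp1;
```
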
answered 2013-01-21T03:51:52.617

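If you want to delete duplicate data and keep one row, use this query. To match on multiple columns, list them in the partition by clause, separated by commas: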

delete a from (select *, rn=row_number() over (partition by FirstName order by MemberId) from Membership ) a where rn > 1;
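
Applied to the three columns from the question, the same pattern might look like the sketch below. The question does not name its table, so dbo.Owner is a hypothetical placeholder; the column names come from the question.

```sql
-- dbo.Owner is a placeholder table name (not given in the question).
-- Keeps the row with the lowest OwnerID in each
-- (FirstName, LastName, CompanyName) group and deletes the rest.
DELETE a
FROM (
    SELECT *,
           rn = ROW_NUMBER() OVER (PARTITION BY FirstName, LastName, CompanyName
                                   ORDER BY OwnerID)
    FROM dbo.Owner
) AS a
WHERE rn > 1;
```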

answered 2013-08-01T12:17:52.093