57

I have the following table, which contains the records below:

create table employee
(
 EmpId number,
 EmpName varchar2(10),
 EmpSSN varchar2(11)
);

insert into employee values(1, 'Jack', '555-55-5555');
insert into employee values (2, 'Joe', '555-56-5555');
insert into employee values (3, 'Fred', '555-57-5555');
insert into employee values (4, 'Mike', '555-58-5555');
insert into employee values (5, 'Cathy', '555-59-5555');
insert into employee values (6, 'Lisa', '555-70-5555');
insert into employee values (1, 'Jack', '555-55-5555');
insert into employee values (4, 'Mike', '555-58-5555');
insert into employee values (5, 'Cathy', '555-59-5555');
insert into employee values (6 ,'Lisa', '555-70-5555');
insert into employee values (5, 'Cathy', '555-59-5555');
insert into employee values (6, 'Lisa', '555-70-5555');

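This table has no primary key, but it already contains the records above. I want to delete the duplicate records, that is, rows that have the same values in both the EmpId and EmpSSN fields.

For example: EmpId 5

Can anyone help me write a query to delete those duplicate records?

Thanks in advance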

4

19 Answers

83

It is very simple. I tried this in SQL Server 2008:

DELETE SUB FROM
(SELECT ROW_NUMBER() OVER (PARTITION BY EmpId, EmpName, EmpSSN ORDER BY EmpId) cnt
 FROM Employee) SUB
WHERE SUB.cnt > 1
answered 2011-09-12T12:22:38.567
59

Add a primary key (code below)

Run the correct delete (code below)

Consider why you would not want to keep that primary key.


Assuming MSSQL or compatible:

ALTER TABLE Employee ADD EmployeeID int identity(1,1) PRIMARY KEY;

WHILE EXISTS (SELECT COUNT(*) FROM Employee GROUP BY EmpID, EmpSSN HAVING COUNT(*) > 1)
BEGIN
    DELETE FROM Employee WHERE EmployeeID IN 
    (
        SELECT MIN(EmployeeID) as [DeleteID]
        FROM Employee
        GROUP BY EmpID, EmpSSN
        HAVING COUNT(*) > 1
    )
END
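
For anyone who wants to watch the loop above run without a SQL Server instance, here is a rough sketch using Python's built-in sqlite3 module. SQLite is only a stand-in: the AUTOINCREMENT column plays the role of the identity column, and the sample rows, the `DUPLICATE_KEYS` constant, and the function name are assumptions made for illustration. Because every pass deletes the lowest EmployeeID of each duplicated group, the row that survives is the one with the highest EmployeeID.

```python
import sqlite3

# Sample rows modeled on the question's data (Cathy three times, Lisa twice).
ROWS = [
    (1, "Jack", "555-55-5555"),
    (5, "Cathy", "555-59-5555"),
    (6, "Lisa", "555-70-5555"),
    (5, "Cathy", "555-59-5555"),
    (6, "Lisa", "555-70-5555"),
    (5, "Cathy", "555-59-5555"),
]

# Lowest surrogate key of every (EmpId, EmpSSN) group that still has duplicates.
DUPLICATE_KEYS = (
    "SELECT MIN(EmployeeID) FROM Employee "
    "GROUP BY EmpId, EmpSSN HAVING COUNT(*) > 1"
)


def dedupe_with_surrogate_key(rows):
    conn = sqlite3.connect(":memory:")
    # EmployeeID stands in for the identity column added by ALTER TABLE.
    conn.execute(
        "CREATE TABLE Employee ("
        "EmployeeID INTEGER PRIMARY KEY AUTOINCREMENT, "
        "EmpId INTEGER, EmpName TEXT, EmpSSN TEXT)"
    )
    conn.executemany(
        "INSERT INTO Employee (EmpId, EmpName, EmpSSN) VALUES (?, ?, ?)", rows
    )
    # Each pass removes one row per duplicated group; repeat until none remain.
    while conn.execute(DUPLICATE_KEYS).fetchone() is not None:
        conn.execute(f"DELETE FROM Employee WHERE EmployeeID IN ({DUPLICATE_KEYS})")
    survivors = conn.execute(
        "SELECT EmployeeID, EmpId, EmpName FROM Employee ORDER BY EmpId"
    ).fetchall()
    conn.close()
    return survivors
```

Calling `dedupe_with_surrogate_key(ROWS)` leaves one row each for Jack, Cathy, and Lisa. Cathy needs two passes because she appears three times.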
answered 2009-06-12T07:23:19.373
24

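Use the row number to distinguish duplicate records. Keep the first row number for each EmpID/EmpSSN and delete the rest:

    -- Assumes Oracle: ROWID stands in for the row number, because
    -- ROW_NUMBER() cannot be called without an OVER clause.
    DELETE FROM Employee a
     WHERE a.ROWID <> ( SELECT MIN( b.ROWID )
                          FROM Employee b
                         WHERE a.EmpID  = b.EmpID
                           AND a.EmpSSN = b.EmpSSN )
answered 2009-06-12T17:01:50.077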
12
WITH duplicates AS
(
    SELECT *, ROW_NUMBER() OVER (PARTITION BY EmpID, EmpSSN ORDER BY EmpID, EmpSSN) AS Duplicate
    FROM Employee
)
DELETE FROM duplicates
WHERE Duplicate > 1;

This will update the table and remove all duplicates from it!

answered 2011-12-06T16:38:02.083
8
select distinct * into newtablename from oldtablename

Now newtablename will have no duplicate records.

To give it the original name, rename the table (newtablename) by pressing F2 in Object Explorer in SQL Server.

answered 2012-06-20T11:57:16.330
7

Code

DELETE DUP 
FROM 
( 
    SELECT ROW_NUMBER() OVER (PARTITION BY Clientid ORDER BY Clientid ) AS Val 
    FROM ClientMaster 
) DUP 
WHERE DUP.Val > 1

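Explanation

Use an inner query to build a view over the table that includes a field based on Row_Number(), partitioned by the columns you want to be unique.

Delete from the results of this inner query, selecting anything whose row number is not 1; i.e. the duplicates, not the original.

The order by clause of the row_number window function is required for valid syntax; you can put any column name here. If you want to change which of the results is treated as a duplicate (e.g. keep the earliest or the most recent), then the column(s) used here do matter; i.e. you want to specify the order so that the record you wish to keep comes first in the result.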
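
To make the point about the ORDER BY column concrete, here is a small sketch using Python's sqlite3 module (window functions need SQLite 3.25 or newer). It is not the SQL Server statement above: SQLite cannot delete through a derived table, so the sketch matches on rowid instead. The `LoadedAt` column, the sample rows, and the function name are hypothetical, added only so there is something meaningful to order by.

```python
import sqlite3

# Hypothetical data: LoadedAt is an illustrative column, not part of the question's table.
ROWS = [
    (5, "Cathy", "555-59-5555", "2009-01-01"),
    (5, "Cathy", "555-59-5555", "2009-03-01"),
    (5, "Cathy", "555-59-5555", "2009-02-01"),
    (6, "Lisa", "555-70-5555", "2009-01-15"),
]


def surviving_rows(direction):
    """Delete duplicates, keeping the row that sorts first by LoadedAt."""
    if direction not in ("ASC", "DESC"):
        raise ValueError("direction must be 'ASC' or 'DESC'")
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE Employee (EmpId INTEGER, EmpName TEXT, EmpSSN TEXT, LoadedAt TEXT)"
    )
    conn.executemany("INSERT INTO Employee VALUES (?, ?, ?, ?)", ROWS)
    # Row number 1 in each partition is the row that is kept; the rest are deleted.
    conn.execute(
        f"""
        DELETE FROM Employee
        WHERE rowid IN (
            SELECT rid FROM (
                SELECT rowid AS rid,
                       ROW_NUMBER() OVER (
                           PARTITION BY EmpId, EmpSSN
                           ORDER BY LoadedAt {direction}
                       ) AS rn
                FROM Employee
            )
            WHERE rn > 1
        )
        """
    )
    kept = conn.execute(
        "SELECT EmpId, LoadedAt FROM Employee ORDER BY EmpId"
    ).fetchall()
    conn.close()
    return kept
```

With these rows, `surviving_rows("ASC")` keeps Cathy's earliest record and `surviving_rows("DESC")` keeps her most recent one; Lisa's single row is untouched either way.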

answered 2016-09-27T06:40:21.370
6

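You can create a temporary table #tempemployee containing a select distinct of your employee table. Then delete from employee. Then insert into employee select from #tempemployee.

Like Josh said - even if you know which rows are duplicates, deleting them directly is impossible, because you cannot actually refer to a specific record if it is an exact duplicate of another record.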
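
The three steps can be sketched roughly as follows with Python's sqlite3 module. This is an illustration under assumptions, not SQL Server code: SQLite uses CREATE TEMP TABLE rather than the #tempemployee naming, and the sample rows and function name are made up for the example. Wrapping the delete and the re-insert in one transaction is a design choice of the sketch, so that a failure between the two steps does not leave the table empty.

```python
import sqlite3

# Sample rows modeled on the question's data.
ROWS = [
    (4, "Mike", "555-58-5555"),
    (5, "Cathy", "555-59-5555"),
    (4, "Mike", "555-58-5555"),
    (5, "Cathy", "555-59-5555"),
    (5, "Cathy", "555-59-5555"),
]


def dedupe_via_temp_table(rows):
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE employee (EmpId INTEGER, EmpName TEXT, EmpSSN TEXT)")
    conn.executemany("INSERT INTO employee VALUES (?, ?, ?)", rows)
    conn.commit()
    # Step 1: copy one instance of every distinct row into a temporary table.
    conn.execute("CREATE TEMP TABLE tempemployee AS SELECT DISTINCT * FROM employee")
    # Steps 2 and 3: empty the original table and refill it from the copy.
    # The "with" block commits both statements together or rolls both back.
    with conn:
        conn.execute("DELETE FROM employee")
        conn.execute("INSERT INTO employee SELECT * FROM tempemployee")
    result = conn.execute("SELECT * FROM employee ORDER BY EmpId").fetchall()
    conn.close()
    return result
```

Note that select distinct compares whole rows, so this only removes rows that match in every column, which is the case for the data in the question.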

answered 2009-06-12T07:16:34.740
2

If you don't want to create a new primary key, you can use the TOP command in SQL Server:

declare @ID int
while EXISTS(select count(*) from Employee group by EmpId having count(*)> 1)
begin
    select top 1 @ID = EmpId
    from Employee 
    group by EmpId
    having count(*) > 1

    DELETE TOP(1) FROM Employee WHERE EmpId = @ID
end
answered 2010-06-02T21:30:41.073
2

It's easy: use the query below.

WITH Dups AS
(
  SELECT col1,col2,col3,
ROW_NUMBER() OVER(PARTITION BY col1,col2,col3 ORDER BY (SELECT 0)) AS rn
 FROM mytable
)
DELETE FROM Dups WHERE rn > 1
answered 2016-09-19T10:20:10.480
1

DELETE sub FROM (SELECT ROW_NUMBER() OVER (PARTITION BY empid ORDER BY empid) cnt FROM employee) sub WHERE sub.cnt > 1

answered 2018-11-28T03:24:46.680
0

No ID, no rowcount(), and no temp table needed....

WHILE 
  (
     SELECT  COUNT(*) 
     FROM TBLEMP  
     WHERE EMPNO 
            IN (SELECT empno  from tblemp group by empno having count(empno)>1)) > 1 


DELETE top(1)  
FROM TBLEMP 
WHERE EMPNO IN (SELECT empno  from tblemp group by empno having count(empno)>1)
answered 2013-04-14T05:56:16.523
0

If the table has two columns, ID and Name, and the same name is repeated under different IDs, you can use this query:

DELETE FROM dbo.tbl1
WHERE id NOT IN (
     Select MIN(Id) AS namecount FROM tbl1
     GROUP BY Name
)
answered 2013-06-18T13:35:03.700
0

I'm not an SQL expert, so bear with me. I'm sure you'll get a better answer soon enough. Here is how you can find the duplicate records.

select t1.empid, t1.empssn, count(*)
from employee as t1 
inner join employee as t2 on (t1.empid=t2.empid and t1.empssn = t2.empssn)
group by t1.empid, t1.empssn
having count(*) > 1

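Deleting them will be trickier, because there is nothing in the data that a delete statement could use to tell the duplicates apart. I suspect the answer will involve row_number() or adding an identity column.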
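
As a side note on the query above: the self-join is not strictly needed, and it inflates the counts, since a group of n identical rows joins to itself n × n times. A plain GROUP BY reports the actual number of copies. Here is a minimal sketch with Python's sqlite3 module, loading the twelve rows from the question; the function name and the `copies` alias are assumptions for illustration.

```python
import sqlite3

# The twelve rows inserted in the question.
ROWS = [
    (1, "Jack", "555-55-5555"), (2, "Joe", "555-56-5555"),
    (3, "Fred", "555-57-5555"), (4, "Mike", "555-58-5555"),
    (5, "Cathy", "555-59-5555"), (6, "Lisa", "555-70-5555"),
    (1, "Jack", "555-55-5555"), (4, "Mike", "555-58-5555"),
    (5, "Cathy", "555-59-5555"), (6, "Lisa", "555-70-5555"),
    (5, "Cathy", "555-59-5555"), (6, "Lisa", "555-70-5555"),
]


def find_duplicates(rows):
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE employee (EmpId INTEGER, EmpName TEXT, EmpSSN TEXT)")
    conn.executemany("INSERT INTO employee VALUES (?, ?, ?)", rows)
    # One output row per (EmpId, EmpSSN) pair that occurs more than once.
    duplicates = conn.execute(
        "SELECT EmpId, EmpSSN, COUNT(*) AS copies "
        "FROM employee "
        "GROUP BY EmpId, EmpSSN "
        "HAVING COUNT(*) > 1 "
        "ORDER BY EmpId"
    ).fetchall()
    conn.close()
    return duplicates
```

For the question's data, `find_duplicates(ROWS)` reports EmpId 1 and 4 with two copies each and EmpId 5 and 6 with three copies each.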

answered 2009-06-12T07:18:02.087
0

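Having a database table without a primary key is, I would say, extremely bad practice... so after you add one (ALTER TABLE):

Run this repeatedly until you no longer see any duplicated records (that is the purpose of HAVING COUNT)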

DELETE FROM [TABLE_NAME] WHERE [Id] IN 
(
    SELECT MAX([Id])
    FROM [TABLE_NAME]
    GROUP BY [TARGET_COLUMN]
    HAVING COUNT(*) > 1
)


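SELECT MAX([Id]), [TARGET_COLUMN], COUNT(*) AS dupeCount
FROM [TABLE_NAME]
GROUP BY [TARGET_COLUMN]
HAVING COUNT(*) > 1

MAX([Id]) deletes the most recent records (the ones added after the first was created). If you want the opposite, that is, to delete the first records and keep the last one inserted, use MIN([Id]).

answered 2014-07-19T04:08:45.400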
0
CREATE UNIQUE CLUSTERED INDEX Employee_idx
ON Employee ( EmpId, EmpSSN )
WITH IGNORE_DUP_KEY

You can drop the index if you don't need it.

answered 2010-07-16T07:49:11.933
-1
select t1.* from employee t1, employee t2 where t1.empid=t2.empid and t1.empname = t2.empname and t1.salary = t2.salary
group by t1.empid, t1.empname,t1.salary having count(*) > 1
answered 2009-10-06T11:52:04.163
-1

delete from employee where rowid in (select rowid from (select rowid, name_count from (select rowid, count(emp_name) as name_count from employee group by emp_id, emp_name) where name_count>1))

answered 2020-10-03T16:27:55.090
-2
DELETE FROM `test`
USING `test`, `test` AS vtable
WHERE test.id > vtable.id AND test.common_column = vtable.common_column

Using this we can remove duplicate records.

answered 2010-11-09T11:04:51.490
-3
ALTER IGNORE TABLE test
           ADD UNIQUE INDEX `test` (`b`);

Here `b` is the name of the column to make unique, and `test` is the index name.

answered 2010-11-09T10:18:33.273