I have the following query, which finds all the duplicates in my username column:
SELECT `username`
FROM `instagram_user`
GROUP BY `username`
HAVING COUNT( * ) >1
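How can I delete the duplicates so that each username appears only once in the table? I don't care which row is kept or deleted, as long as exactly one row per username remains.
If you don't care which record is kept, just add a unique constraint using IGNORE: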
ALTER IGNORE TABLE instagram_user ADD UNIQUE (username);
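MySQL will do the work for you. You want that unique constraint anyway, to keep duplicates out of the table in the future. Note that ALTER IGNORE TABLE was removed in MySQL 5.7.4, so this only works on older servers.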
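As a rough illustration of what the unique constraint buys you, here is a minimal sketch that uses Python's built-in sqlite3 module as a stand-in for MySQL. The stand-in, the table layout with an integer id, and the index name idx_username are all assumptions made for the example. Once the index exists, inserting an existing username is rejected.

```python
import sqlite3


def add_unique_index(conn):
    """Add a unique index on username (fails if duplicates are still present)."""
    # idx_username is a hypothetical name chosen for this sketch.
    conn.execute("CREATE UNIQUE INDEX idx_username ON instagram_user (username)")


def try_insert(conn, username):
    """Return True if the row was inserted, False if the unique index rejected it."""
    try:
        conn.execute(
            "INSERT INTO instagram_user (username) VALUES (?)", (username,)
        )
        return True
    except sqlite3.IntegrityError:
        return False


# Assumed schema: the question only shows the username column.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE instagram_user (id INTEGER PRIMARY KEY, username TEXT)")
conn.executemany(
    "INSERT INTO instagram_user (username) VALUES (?)", [("alice",), ("bob",)]
)

add_unique_index(conn)
inserted_new = try_insert(conn, "carol")  # new username, accepted
inserted_dup = try_insert(conn, "alice")  # existing username, rejected
```

On a table that still contains duplicates, creating the index fails, so the duplicates have to be removed first.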
Or you can do:
DELETE t
FROM instagram_user t JOIN
(
SELECT username, MAX(id) id
FROM instagram_user
GROUP BY username
HAVING COUNT(*) > 1
) q
ON t.username = q.username
AND t.id <> q.id
For usernames that appear more than once, this keeps only the row with the highest id.
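To check the "keep the highest id" rule on throwaway data, here is a minimal sketch using Python's built-in sqlite3 module. SQLite is only a stand-in for MySQL here and has no DELETE ... JOIN, so the same rule is written with NOT IN; the sample rows and the helper name delete_all_but_max_id are made up for the example.

```python
import sqlite3


def delete_all_but_max_id(conn):
    """Delete every row that is not the highest id for its username.

    Returns the number of rows deleted.
    """
    cur = conn.execute(
        """
        DELETE FROM instagram_user
        WHERE id NOT IN (
            SELECT MAX(id) FROM instagram_user GROUP BY username
        )
        """
    )
    return cur.rowcount


# Assumed schema and sample data, not taken from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE instagram_user (id INTEGER PRIMARY KEY, username TEXT)")
conn.executemany(
    "INSERT INTO instagram_user (id, username) VALUES (?, ?)",
    [(1, "xx"), (2, "xx"), (3, "yy"), (4, "xx"), (5, "zz"), (6, "zz")],
)

removed = delete_all_but_max_id(conn)
remaining = conn.execute(
    "SELECT id, username FROM instagram_user ORDER BY id"
).fetchall()
```

With the sample rows above, the rows with ids 1, 2 and 5 are deleted and one row per username is left.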
I'm not sure about MySQL; this is written for SQL Server, but you can try similar code in MySQL.
;WITH CteUsers AS (
    SELECT *, ROW_NUMBER() OVER (PARTITION BY username ORDER BY PkId) AS RowId
    FROM (
        SELECT PkId, username
        FROM instagram_user
    ) tbltemp
)
SELECT * FROM CteUsers;
This produces a result like the following:
PkId username RowId
1 xx 1
2 xx 2
....
Then delete the rows where RowId > 1:
;WITH CteUsers AS (
    SELECT *, ROW_NUMBER() OVER (PARTITION BY username ORDER BY PkId) AS RowId
    FROM (
        SELECT PkId, username
        FROM instagram_user
    ) tbltemp
)
DELETE FROM instagram_user WHERE PkId IN (SELECT PkId FROM CteUsers WHERE RowId > 1);
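The same row-numbering idea can be tried outside SQL Server. The sketch below runs it against SQLite through Python's built-in sqlite3 module, assuming the bundled SQLite is 3.25 or newer (the first release with window functions). The column is called id rather than PkId, and the sample rows are invented. MySQL 8.0 also added ROW_NUMBER(), but this exact statement has not been checked there.

```python
import sqlite3

# Number the rows within each username by id, then delete everything
# except the first row of each group.
DEDUPE_SQL = """
WITH ranked AS (
    SELECT id,
           ROW_NUMBER() OVER (PARTITION BY username ORDER BY id) AS rn
    FROM instagram_user
)
DELETE FROM instagram_user
WHERE id IN (SELECT id FROM ranked WHERE rn > 1)
"""

# Assumed schema and sample data, not taken from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE instagram_user (id INTEGER PRIMARY KEY, username TEXT)")
conn.executemany(
    "INSERT INTO instagram_user (id, username) VALUES (?, ?)",
    [(1, "xx"), (2, "xx"), (3, "yy"), (4, "xx"), (5, "zz"), (6, "zz")],
)

conn.execute(DEDUPE_SQL)
remaining = conn.execute(
    "SELECT id, username FROM instagram_user ORDER BY id"
).fetchall()
```

Ordering by id inside the window makes the choice deterministic: the lowest id of each username is the one that survives.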
This will give you the duplicates (that is, the rows you need to delete):
select a.id, a.username from instagram_user a, instagram_user b
where a.username = b.username and a.id <> b.id
and b.id = (select min(id) from instagram_user where username = a.username)
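So the DELETE would be something like the following. Be aware that MySQL does not allow a subquery to select from the table being deleted from (error 1093), so on MySQL the join form of DELETE is the safer choice: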
delete from instagram_user where id in
(select a.id from instagram_user a, instagram_user b
where a.username = b.username and a.id <> b.id
and b.id = (select min(c.id) from instagram_user c
where c.username = a.username))