
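I have a database table with three columns: id, user_id, and book_id. The table contains some duplicates. A user_id should have only one record for a given book_id, but in some cases a user_id has several book_id records. There are already a few million rows, and I would like to know how to delete the duplicates.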


3 Answers


Try the following.

SQL Server

WITH ORDERED AS
(
    SELECT id,
    ROW_NUMBER() OVER (PARTITION BY [user_id] , [book_id] ORDER BY id ASC) AS rn
    FROM
    tableName
)
delete from tableName
where id in ( select id from ORDERED where rn != 1)

MySQL

delete from tableName
where id not in( 
    select MIN(id)from tableName    
    group by user_id, book_id
)

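Edit based on the comments: in MySQL, you cannot delete from a table that the same statement also reads in a subquery.

Wrapping the subquery in a derived table works around this: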

delete from tableName
where id not in( 
    select temp.temp_id from (
        select MIN(id) as temp_id from tableName    
        group by user_id, book_id
    ) as temp
)

This keeps only one row for each (user_id, book_id) combination.
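Before running a DELETE on a table with millions of rows, it may help to see how many duplicates exist. The query below is a minimal sketch in standard SQL that should run on both SQL Server and MySQL; it assumes the same tableName, user_id, book_id, and id names used above. The aliases copies and kept_id are made up for illustration.

```sql
-- List every (user_id, book_id) pair that occurs more than once,
-- with the number of rows and the id that the MIN(id) approach would keep.
SELECT user_id, book_id, COUNT(*) AS copies, MIN(id) AS kept_id
FROM tableName
GROUP BY user_id, book_id
HAVING COUNT(*) > 1;
```

Once the duplicates have been deleted, the same query should return no rows.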

answered 2013-03-08T12:13:42.403

If you execute the statement below, it deletes all duplicate records and leaves only the row with the greatest ID for each user_ID.

DELETE  a
FROM    tableName a
        LEFT JOIN
        (
            SELECT  user_ID, MAX(ID) max_ID
            FROM    tableName
            GROUP   BY user_ID
        ) b ON a.user_ID = b.user_ID AND
                a.ID = b.max_ID
WHERE   b.max_ID IS NULL
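Note that the statement above groups by user_ID alone, so it leaves a single row per user regardless of book. If the intent is one row per (user_ID, book_ID) pair, a variant might look like the sketch below. It assumes the book_ID column from the question and that ID is unique, so matching on ID alone is enough to identify the row to keep.

```sql
-- Keep the row with the greatest ID in each (user_ID, book_ID) group
-- and delete every other row in that group.
DELETE  a
FROM    tableName a
        LEFT JOIN
        (
            SELECT  user_ID, book_ID, MAX(ID) max_ID
            FROM    tableName
            GROUP   BY user_ID, book_ID
        ) b ON a.ID = b.max_ID
WHERE   b.max_ID IS NULL;
```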
answered 2013-03-08T12:10:32.683

Hopefully this query will let you delete the duplicates:

DELETE bl1 FROM book_log bl1
JOIN book_log bl2
ON (
    bl1.id > bl2.id AND
    bl1.user_id = bl2.user_id AND
    bl1.book_id = bl2.book_id
);
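To preview which rows the self-join would remove, the same join condition can be run as a SELECT first. This is only a sketch that reuses the book_log table and column names from the query above. DISTINCT is there because a row with two or more lower-id duplicates matches the join more than once.

```sql
-- Rows that have an earlier row (lower id) with the same user_id and book_id;
-- these are the rows the DELETE would remove.
SELECT DISTINCT bl1.id, bl1.user_id, bl1.book_id
FROM book_log bl1
JOIN book_log bl2
  ON bl1.id > bl2.id
 AND bl1.user_id = bl2.user_id
 AND bl1.book_id = bl2.book_id;
```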


answered 2013-03-08T12:47:22.893