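For several days I have been researching the right way to find duplicate rows based on specific fields. I think I need some more help. This is the query I have so far: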

 SELECT * 
    FROM enrollees
    INNER JOIN (SELECT first_name, last_name, address1, city, state, zip, program_instance_id, MIN(id) AS MinId, COUNT(id) AS count FROM enrollees GROUP BY first_name, last_name, address1, city, state, zip, program_instance_id) b
    ON enrollees.first_name = b.first_name
        AND enrollees.last_name = b.last_name 
        AND enrollees.address1 = b.address1
        AND enrollees.city = b.city
        AND enrollees.state = b.state
        AND enrollees.zip = b.zip 
        AND count > 1 
        AND enrollees.program_instance_id = b.program_instance_id 
        AND enrollees.id != MinId;

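The goal is to take the duplicates and put them into an archive table (enrollees_duplicates), then delete them from the live table (enrollees). I tried writing a query that finds the duplicate rows and inserts them, but it gives me the following error:

"Column count doesn't match value count at row 1"

The query I tried is: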

INSERT INTO enrollees_duplicates (SELECT * 
    FROM enrollees
    INNER JOIN (SELECT first_name, last_name, address1, city, state, zip, program_instance_id, MIN(id) AS MinId, COUNT(id) AS count FROM enrollees GROUP BY first_name, last_name, address1, city, state, zip, program_instance_id) b
    ON enrollees.first_name = b.first_name
        AND enrollees.last_name = b.last_name 
        AND enrollees.address1 = b.address1
        AND enrollees.city = b.city
        AND enrollees.state = b.state
        AND enrollees.zip = b.zip 
        AND count > 1 
        AND enrollees.program_instance_id = b.program_instance_id 
        AND enrollees.id != MinId);

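I think this is because I am not retrieving all of the columns in the INNER JOIN select? If that is the case, and I change it to SELECT * (with MinId and count added), would it still throw the same error, because the two extra columns do not exist in the new table?

Is there any way to do all of this with SQL queries, without having to select the duplicates, store them in a PHP array, run another SQL query for each row to insert it into the duplicates table, and then run yet another SQL query to delete the duplicate rows?

My intent is to use two queries: one to insert all of the duplicate rows into the archive table, and another to delete the duplicate rows. If it could somehow become a single query that finds the duplicates, inserts them into the archive table, and then deletes them in one run, that would be even better.

I am new to this area, so any help or guidance would be greatly appreciated.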

2 Answers

"Column count doesn't match value count at row 1"

The tables enrollees_duplicates and enrollees have different structures.
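
One way to check that claim is to compare the column counts MySQL reports for the two tables. This is only a sketch: it assumes both tables live in the currently selected database, and it compares counts only, not column order or types.

```sql
-- Count the columns of each table in the current database.
-- If the two counts differ, INSERT ... SELECT * between them cannot work.
SELECT table_name, COUNT(*) AS column_count
FROM information_schema.columns
WHERE table_schema = DATABASE()
  AND table_name IN ('enrollees', 'enrollees_duplicates')
GROUP BY table_name;
```

If both counts match, the mismatch is more likely coming from the SELECT list of the query than from the table definitions.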

Would it be better to use an ON DELETE trigger? (http://dev.mysql.com/doc/refman/5.0/en/create-trigger.html)
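
A rough sketch of what such a trigger might look like. The trigger name is hypothetical, and the column list contains only the columns that appear in the question; the real tables presumably have more, and each one would need to be listed on both sides.

```sql
-- Hypothetical trigger: copy each row into the archive table just before
-- it is deleted from enrollees. Column list is an assumption based on the
-- columns visible in the question.
CREATE TRIGGER enrollees_archive_before_delete
BEFORE DELETE ON enrollees
FOR EACH ROW
    INSERT INTO enrollees_duplicates
        (id, first_name, last_name, address1, city, state, zip, program_instance_id)
    VALUES
        (OLD.id, OLD.first_name, OLD.last_name, OLD.address1, OLD.city, OLD.state,
         OLD.zip, OLD.program_instance_id);
```

Note that a trigger like this fires on every DELETE against enrollees, not only on duplicate cleanup, so it archives rows removed for any other reason as well.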

answered 2013-10-18T20:40:07.230

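The solution to my problem: when my outer select was just "*", it added the subquery's two extra columns (MinId, count) to the result, which made the column counts differ. Selecting only the columns of the "enrollees" table, and not the additional columns from the subquery, corrects the column mismatch.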

INSERT INTO enrollees_duplicates (SELECT enrollees.* 
    FROM enrollees
    INNER JOIN (SELECT first_name, last_name, address1, city, state, zip, program_instance_id, MIN(id) AS MinId, COUNT(id) AS count FROM enrollees GROUP BY first_name, last_name, address1, city, state, zip, program_instance_id) b
    ON enrollees.first_name = b.first_name
        AND enrollees.last_name = b.last_name 
        AND enrollees.address1 = b.address1
        AND enrollees.city = b.city
        AND enrollees.state = b.state
        AND enrollees.zip = b.zip 
        AND count > 1 
        AND enrollees.program_instance_id = b.program_instance_id 
        AND enrollees.id != MinId);
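
For the second step the question asks about (removing the archived rows), here is a minimal sketch of one way to run the archive and the delete together. It assumes both tables use a transactional engine such as InnoDB. The aliases e and min_id and the HAVING clause are my rewording of the join above, not part of the original query.

```sql
START TRANSACTION;

-- Step 1: archive every duplicate except the row with the lowest id.
INSERT INTO enrollees_duplicates
SELECT e.*
FROM enrollees AS e
INNER JOIN (
    SELECT first_name, last_name, address1, city, state, zip, program_instance_id,
           MIN(id) AS min_id
    FROM enrollees
    GROUP BY first_name, last_name, address1, city, state, zip, program_instance_id
    HAVING COUNT(id) > 1
) AS b
    ON  e.first_name = b.first_name
    AND e.last_name = b.last_name
    AND e.address1 = b.address1
    AND e.city = b.city
    AND e.state = b.state
    AND e.zip = b.zip
    AND e.program_instance_id = b.program_instance_id
    AND e.id <> b.min_id;

-- Step 2: delete the same rows from the live table (MySQL multi-table DELETE).
DELETE e
FROM enrollees AS e
INNER JOIN (
    SELECT first_name, last_name, address1, city, state, zip, program_instance_id,
           MIN(id) AS min_id
    FROM enrollees
    GROUP BY first_name, last_name, address1, city, state, zip, program_instance_id
    HAVING COUNT(id) > 1
) AS b
    ON  e.first_name = b.first_name
    AND e.last_name = b.last_name
    AND e.address1 = b.address1
    AND e.city = b.city
    AND e.state = b.state
    AND e.zip = b.zip
    AND e.program_instance_id = b.program_instance_id
    AND e.id <> b.min_id;

COMMIT;
```

The transaction keeps the two statements together, so if the DELETE fails the INSERT can be rolled back. One caveat: rows with NULL in any of the compared columns never match under `=`, so both statements skip them.
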
answered 2013-10-19T03:10:44.427