php - Find duplicate rows in MySQL: simplified example

Question

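Confession: a MySQL newbie needs a simple example for locating duplicate rows in a largish table. I have searched and read many other threads with similar titles, but the examples were so complex that I could not apply them to my basic situation.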

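The MySQL table has only 5 fields but hundreds of rows. I want to find the duplicate rows. I know there is definitely one and would like to know whether there are others.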

Example row (rel_id is the auto-increment primary key field):

'rel_id' => 1
'host' => 17
'host_type' => 'client'
'rep' => 7
'rep_type' => 'cli_mgr'

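My approach was:
1. Read the entire table with a MySQL query
2. Row by row, compare the 4 data fields with those of the previous ("done") rows
3. After comparing the "new" row, append it to the array of "done" rows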

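Here is what I tried. I am sure there must be a simpler solution. You will see that I got stuck trying to append the "new" row to the array of "done" rows: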

$rRels = mysql_query("SELECT * FROM `rels`");
$a = array();
$e = array();
$c1 = 0;
$c2 = 0;
While ($r = mysql_fetch_assoc($rRels)) {
    $i = $r['rel_id'];
    $h = $r['host'];
    $ht = $r['host_type'];
    $r = $r['rep'];
    $rt = $r['rep_type'];

    foreach($a as $row) {
        $xh = $row['host'];
        $xht = $row['host_type'];
        $xr = $row['rel'];
        $xrt = $row['rel_type'];

        if (($h==$xh) && ($ht==$xht) && ($r==$xr) && ($rt==$xrt)) {
            echo 'Found one<br>';
            $e[] = $r;
        }
        $c2++;
    }
    $a = array_merge(array('rel_id'=>$i, 'host'=>$h, 'host_type'=>$ht, 'rep'=>$r, 'rep_type'=>$rt), $a);
    $c1++;
}

echo '<h3>Duplicate Rows:</h3>';
foreach ($e as $row) {
    print_r($row);
    echo '<br>';
}
echo '<br><br>';
echo 'Counter 1: ' . $c1 . '<br>';
echo 'Counter 2: ' . $c2 . '<br>';

score 4 · Accepted Answer

This should do the trick:

SELECT COUNT(*) as cnt, GROUP_CONCAT(rel_id) AS ids
FROM rels
GROUP BY host, host_type, rep, rep_type
HAVING cnt > 1

Any "duplicate" records will have cnt > 1, and GROUP_CONCAT will give you the IDs of the duplicated records.
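For comparison, the same grouping can be sketched in plain PHP, which is closer to the row-by-row idea in the question. This is only an illustration: the function name findDuplicateGroups and the sample rows are made up here, and the rows are assumed to have been fetched already as associative arrays. Rather than comparing every row with all earlier ones, it builds one key from the four data fields and collects the rel_id values under that key.

```php
<?php
// Hypothetical helper: groups rows by their four data fields and
// returns only the groups that contain more than one rel_id.
function findDuplicateGroups(array $rows): array
{
    $groups = [];
    foreach ($rows as $row) {
        // json_encode of the four values gives an unambiguous key
        // (plain concatenation could confuse '1' . '23' with '12' . '3').
        $key = json_encode([$row['host'], $row['host_type'], $row['rep'], $row['rep_type']]);
        $groups[$key][] = $row['rel_id']; // $array[] = ... appends one element
    }
    // Keep only the keys seen more than once, like HAVING cnt > 1.
    $duplicates = array_filter($groups, function ($ids) {
        return count($ids) > 1;
    });
    return array_values($duplicates);
}

// Sample data invented for this sketch.
$rows = [
    ['rel_id' => 1, 'host' => 17, 'host_type' => 'client', 'rep' => 7, 'rep_type' => 'cli_mgr'],
    ['rel_id' => 2, 'host' => 18, 'host_type' => 'client', 'rep' => 7, 'rep_type' => 'cli_mgr'],
    ['rel_id' => 3, 'host' => 17, 'host_type' => 'client', 'rep' => 7, 'rep_type' => 'cli_mgr'],
];

print_r(findDuplicateGroups($rows));
```

For the sample rows this returns a single group holding rel_id 1 and 3. Two things in the question's loop would get in the way of a similar result: `$r = $r['rep']` overwrites the fetched row, and array_merge on an array with string keys replaces those keys instead of adding a new row, whereas `$a[] = ...` appends one.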

score 1 · Accepted Answer

A pure no-PHP solution: make a copy of the old table (named oldTable) without its data:

create table newTable like oldTable;

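Change the structure to prevent duplicates by adding a unique key on the four data columns. rel_id is left out: as the auto-increment primary key it differs on every row, so an index that included it would never reject anything.

alter table newTable add unique index(host,host_type,rep,rep_type);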

Then copy the rows from oldTable with an SQL query:

insert IGNORE into newTable select * from oldTable

newTable now contains only unique data.

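Another option is GROUP BY, which also tells you how many times each combination occurs. rel_id is kept out of the key because it is unique per row:

select concat_ws('_',host,host_type,rep,rep_type) as str, count(*) as cnt
from oldTable
group by str
having cnt > 1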

score 0 · Accepted Answer

You can find all the duplicate rows with this query. Hopefully it will be easy to integrate into your PHP code.

-- This will give you all the duplicates
-- Self join where all the columns have the same values but different primary keys

SELECT * 
FROM   rels t1, rels t2
WHERE  t1.rel_id != t2.rel_id
AND    t1.host = t2.host
AND    t1.host_type = t2.host_type
AND    t1.rep = t2.rep
AND    t1.rep_type = t2.rep_type

score 0 · Accepted Answer

Finding duplicates is easier in SQL than in PHP.

SELECT GROUP_CONCAT(rel_id)
FROM rels
GROUP BY host, host_type, rep, rep_type HAVING COUNT(rel_id)>1;

This shows the groups of rel_id values that point to identical records. The HAVING COUNT(rel_id)>1 clause lets us skip the records that are not duplicated.
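Back in PHP, each result row holds its IDs as one comma-separated string, so a little parsing is needed before the values can be used. A minimal sketch, with a made-up helper name splitDuplicateIds and invented sample strings standing in for the fetched column:

```php
<?php
// Hypothetical helper: turns a GROUP_CONCAT string such as "1,3,9"
// into the lowest rel_id and the list of the remaining rel_ids.
function splitDuplicateIds(string $idList): array
{
    $ids = array_map('intval', explode(',', $idList));
    sort($ids); // GROUP_CONCAT order is not guaranteed without ORDER BY
    $first = array_shift($ids);
    return ['first' => $first, 'others' => $ids];
}

// Invented sample values standing in for fetched result rows.
foreach (['1,3', '12,4,9'] as $idList) {
    $parts = splitDuplicateIds($idList);
    echo $parts['first'] . ' is repeated by ' . implode(', ', $parts['others']) . "\n";
}
```

MySQL truncates GROUP_CONCAT output at group_concat_max_len (1024 bytes by default), which should not matter for a table of a few hundred rows but is worth knowing for larger ones.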
