1

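I have a many-to-many relation R (a two-column junction table where both columns are foreign keys to the same table). I want to find all pairs (x, y) such that both (x, y) and (y, x) are in R. What is the fastest way to do this when the table contains hundreds of thousands of rows? I am using MySQL.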

4

3 Answers

2
SELECT  LEAST(x, y) l, GREATEST(x, y) g
FROM    r
GROUP BY
        l, g
HAVING  COUNT(*) > 1
answered 2012-06-01T18:26:20.590
2

The fastest way is probably something like this:

select (case when x < y then x else y end) as themin,
       (case when x < y then y else x end) as themax
from (select distinct x, y from R) r
group by (case when x < y then x else y end), (case when x < y then y else x end)
having count(*) > 1

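This puts the x and y values of each row in order, so when you group by them, every pair is in canonical order regardless of how it was originally stored. A pair that exists in both directions then lands in a single group with a count of 2.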
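To make the canonical ordering concrete, here is a small sketch. The table definition and the sample rows are assumptions made up for illustration, not taken from the question:

```sql
-- Hypothetical sample data; the real R would also carry foreign keys on both columns.
CREATE TABLE R (x INT NOT NULL, y INT NOT NULL);
INSERT INTO R (x, y) VALUES (1, 2), (2, 1), (3, 4), (5, 6), (6, 5);

-- (1, 2) and (2, 1) both normalize to (1, 2), so that group holds two rows.
-- (3, 4) has no reverse row, so its group holds one row and HAVING drops it.
select (case when x < y then x else y end) as themin,
       (case when x < y then y else x end) as themax
from (select distinct x, y from R) r
group by (case when x < y then x else y end), (case when x < y then y else x end)
having count(*) > 1;
-- With the rows above, this returns the pairs (1, 2) and (5, 6).
```

A self-referencing row such as (7, 7) is not reported by this query, because it forms a group of size one.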

You can dispense with the "select distinct" if you know the pairs are already distinct in the R table.

The alternative is some sort of self join (either explicitly or using IN or NOT IN). You can try different ways, but I think this is probably the fastest.
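As one possible shape for that alternative, here is a sketch using EXISTS in place of IN. It assumes the same R(x, y) table from the question, and it has not been benchmarked against the grouping query:

```sql
-- Keep only rows whose reverse pair also exists in R.
-- The x < y filter reports each mutual pair once, smaller value first.
select distinct r1.x, r1.y
from R r1
where r1.x < r1.y
  and exists (select 1
              from R r2
              where r2.x = r1.y
                and r2.y = r1.x);
```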

answered 2012-06-01T18:29:39.323
0

Do a self join:

SELECT  *
FROM    R T1
        INNER JOIN R T2
            ON T1.X = T2.Y
            AND T1.Y = T2.X

Addendum

If you don't want each pair returned twice (i.e. once as 1, 2 and again as 2, 1), you could use:

SELECT  *
FROM    R T1
        INNER JOIN R T2
            ON T1.X = T2.Y
            AND T1.Y = T2.X
WHERE   T1.X < T1.Y;

However, if you don't want duplicate rows returned, I suspect Quassnoi's solution may perform better, though that will depend on your data and indexes.
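Because the self join depends so heavily on indexing, here is one possible setup to try. The index name is made up, and the sketch assumes R does not already have a primary or unique key on (X, Y), which would serve the same purpose:

```sql
-- Hypothetical index name. A composite index on (X, Y) lets MySQL find each
-- row's reverse pair (T2.X = T1.Y AND T2.Y = T1.X) by index lookup
-- rather than by scanning the table.
CREATE INDEX idx_r_x_y ON R (X, Y);

-- Inspect the plan MySQL actually chooses before trusting either approach.
EXPLAIN
SELECT  T1.X, T1.Y
FROM    R T1
        INNER JOIN R T2
            ON T1.X = T2.Y
            AND T1.Y = T2.X
WHERE   T1.X < T1.Y;
```

Running the same EXPLAIN on the GROUP BY version gives a rough basis for comparing the two on your own data.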

answered 2012-06-01T18:25:49.647