2

Edited for clarity - many apologies for the confusing original example.

I have the following table structure representing married couples:

id | Person | Spouse
______________________
1  | Mary   | John
2  | John   | Mary
3  | Katy   | Bob
4  | Bob    | Katy
5  | Mary   | John
6  | John   | Mary

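In this example, Mary is married to John, Katy is married to Bob, and another Mary is married to another John.

How can I retrieve the married couples?

I have got close with this: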

SELECT 
  p.id id1,
  q.id id2
FROM 
  people p 
  INNER JOIN people q ON
    p.person = q.spouse AND 
    q.person = p.spouse AND 
    p.id < q.id
ORDER BY p.id

However, this returns:

1 | 2 (1st Mary & 1st John)
1 | 6 (1st Mary & 2nd John) *problem*
2 | 5 (1st John & 2nd Mary) *problem*
3 | 4 (Katy & Bob)
5 | 6 (2nd Mary & 2nd John)

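How can I make sure that the 1st Mary and the 1st John are paired only once (i.e. remove the problem rows above)?

Many thanks

Here is the SQL to create the example: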

CREATE TABLE people
    (`id` int, `person` varchar(7), `spouse` varchar(7))
;

INSERT INTO people
    (`id`, `person`, `spouse`)
VALUES
    (1, 'Mary', 'John'),
    (2, 'John', 'Mary'),
    (3, 'Katy', 'Bob'),
    (4, 'Bob', 'Katy'),
    (5, 'Mary', 'John'),
    (6, 'John', 'Mary')
;

SELECT 
  p.id id1,
  q.id id2
FROM 
  people p 
  INNER JOIN people q ON
    p.person = q.spouse AND 
    q.person = p.spouse AND 
    p.id < q.id
ORDER BY p.id
;

5 Answers

1

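In this example, Mary is married to John, Katy is married to Bob, and another Mary is married to Richard.

Nothing in the data structure you have shown distinguishes the two "Marys", because there is no difference between them.

Both are just the text literal Mary. If you want to distinguish different people who may share a name, you need another criterion, and it has to be a unique one (e.g. the id of each person's database record).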
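
To make that concrete, here is a minimal sketch of what a unique criterion could look like. It assumes a hypothetical spouse_id column that stores the id of the partner's row instead of the partner's name; that column is not part of the table in the question. Python's built-in sqlite3 module is used only so the example runs without a MySQL server.

```python
import sqlite3


def find_couples(conn):
    """Return (id1, id2) pairs whose spouse_id values point at each other."""
    query = """
        SELECT p.id AS id1, q.id AS id2
        FROM people AS p
        JOIN people AS q
          ON p.spouse_id = q.id   -- p names q as spouse
         AND q.spouse_id = p.id   -- and q names p back
         AND p.id < q.id          -- report each couple only once
        ORDER BY p.id
    """
    return conn.execute(query).fetchall()


conn = sqlite3.connect(":memory:")
# spouse_id is a hypothetical column: it holds the partner's id, not the name.
conn.execute(
    "CREATE TABLE people (id INTEGER PRIMARY KEY, person TEXT, spouse_id INTEGER)"
)
conn.executemany(
    "INSERT INTO people (id, person, spouse_id) VALUES (?, ?, ?)",
    [
        (1, "Mary", 2), (2, "John", 1),
        (3, "Katy", 4), (4, "Bob", 3),
        (5, "Mary", 6), (6, "John", 5),
    ],
)

couples = find_couples(conn)
print(couples)  # [(1, 2), (3, 4), (5, 6)]
```

With ids instead of names, the two Marys are different rows pointing at different Johns, so the cross-matches between them cannot occur.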

answered 2013-03-16T22:43:59.003
1

I'll give it a try:

SELECT
  p.id AS id1,
  q.id AS id2
FROM
  people AS p
  JOIN people AS q ON
    p.person = q.spouse AND
    q.person = p.spouse AND
    p.id < q.id
  -- x: position of row p among the rows with the same (person, spouse)
  -- the alias is rnk because RANK is a reserved word in MySQL 8.0+
  JOIN (SELECT
          p.id, COUNT(*) AS rnk
        FROM
          people AS p
          INNER JOIN people AS p2 ON
            p.person = p2.person AND
            p.spouse = p2.spouse AND
            p.id >= p2.id
        GROUP BY p.id
       ) AS x ON
    x.id = p.id
  -- y: the same position, computed for row q
  JOIN (SELECT
          p.id, COUNT(*) AS rnk
        FROM
          people AS p
          INNER JOIN people AS p2 ON
            p.person = p2.person AND
            p.spouse = p2.spouse AND
            p.id >= p2.id
        GROUP BY p.id
       ) AS y ON
    y.id = q.id AND
    y.rnk = x.rnk ;

And another one:

SELECT
  p.id AS id1,
  q.id AS id2
FROM
  people AS p 
  JOIN people AS q ON
    p.person = q.spouse AND 
    q.person = p.spouse
  JOIN people AS p2 ON
    p.person = p2.person AND 
    p.spouse = p2.spouse AND 
    p.id >= p2.id
  JOIN people AS q2 ON
    q.person = q2.person AND 
    q.spouse = q2.spouse AND 
    q.id >= q2.id
WHERE 
    p.id < q.id
GROUP BY 
    p.id, q.id
HAVING 
    COUNT(DISTINCT p2.id) = COUNT(DISTINCT q2.id) ;

Both tested at SQL-Fiddle.

If only MySQL had window functions (like almost every other DBMS), this would be much simpler. Tested at a Postgres fiddle:

WITH cte AS
  ( SELECT
        id, person, spouse, 
        ROW_NUMBER() OVER( PARTITION BY person, spouse 
                           ORDER BY id )
           AS rn
    FROM
        people
  ) 
SELECT
    p.id AS id1,
    q.id AS id2 
FROM
  cte AS p
  JOIN cte AS q ON
    p.person = q.spouse AND 
    q.person = p.spouse AND
    p.rn = q.rn AND
    p.id < q.id ;
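
As a quick check of the window-function query, the sketch below runs it against the sample rows from the question. It uses Python's built-in sqlite3 module rather than Postgres, purely so it runs without a server; that substitution is an assumption, and the bundled SQLite needs to be version 3.25 or newer for ROW_NUMBER(). The helper name pair_by_occurrence is made up for illustration.

```python
import sqlite3

# Same query as above, with an ORDER BY added so the output is deterministic.
PAIRING_QUERY = """
WITH cte AS (
    SELECT id, person, spouse,
           ROW_NUMBER() OVER (PARTITION BY person, spouse ORDER BY id) AS rn
    FROM people
)
SELECT p.id AS id1, q.id AS id2
FROM cte AS p
JOIN cte AS q
  ON p.person = q.spouse
 AND q.person = p.spouse
 AND p.rn = q.rn          -- n-th (Mary, John) row meets n-th (John, Mary) row
 AND p.id < q.id
ORDER BY p.id
"""


def pair_by_occurrence(rows):
    """Load (id, person, spouse) rows into a scratch table and pair them."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE people (id INTEGER, person TEXT, spouse TEXT)")
        conn.executemany("INSERT INTO people VALUES (?, ?, ?)", rows)
        return conn.execute(PAIRING_QUERY).fetchall()
    finally:
        conn.close()


sample = [
    (1, "Mary", "John"), (2, "John", "Mary"),
    (3, "Katy", "Bob"), (4, "Bob", "Katy"),
    (5, "Mary", "John"), (6, "John", "Mary"),
]
print(pair_by_occurrence(sample))  # [(1, 2), (3, 4), (5, 6)]
```

Note that this pairs rows purely by their order of appearance within each (person, spouse) group, so it only gives the intended couples if the rows were inserted in matching order.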
answered 2013-03-17T00:55:09.950
0

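Your database constraints are wrong.

Mary, John and the others have no identity.

Some heuristic queries might help, but that is not a reliable solution.

So, please improve your data structure.

answered 2013-03-17T01:55:49.410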
-1

Not very elegant, but it works:

SELECT p.id, q.id
FROM people p
INNER JOIN people q ON
  -- column names follow the table in the question (person, spouse)
  p.person = q.spouse AND
  q.person = p.spouse

It effectively uses the existence of the inverted row as the selector.

answered 2013-03-16T22:54:14.247
-1

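There are many ways to do this, but one of the most important reasons for using a database is that it holds a lot of data - and you will rarely write a query that retrieves a large amount of it. Except in very unusual circumstances, and for homework, results should be filtered by some criterion. So the most appropriate solution depends on what else you later add to the query.

But here are a couple of examples of how to get the unique pairs: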

SELECT a, b, GROUP_CONCAT(id)
FROM (SELECT id
      , IF(person >= spouse, person, spouse) AS a
      , IF(person >= spouse, spouse, person) AS b
      FROM yourtable ) AS pairs
GROUP BY a, b;

SELECT id, person, spouse
FROM yourtable s1
WHERE NOT EXISTS ( SELECT 1
    FROM yourtable s2
    WHERE s2.id > s1.id
    AND s1.person = s2.spouse
    AND s1.spouse = s2.person);

(There are several other solutions.)

answered 2013-03-16T23:41:39.497