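Second revised answer
Since the comments point out that there may be duplicate rows in T2, a more complex solution is needed. I believe this query produces the correct data.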
-- Query 8B
SELECT x.id
  FROM (SELECT d2.id, d2.c, d2.d
          FROM (SELECT DISTINCT idfk AS id, c, d FROM t2 WHERE idfk != 1) AS d2
          JOIN (SELECT id
                  FROM (SELECT DISTINCT idfk AS id, c, d FROM t2 WHERE idfk != 1) AS x
                 GROUP BY id
                HAVING COUNT(*) = (SELECT COUNT(*)
                                     FROM (SELECT DISTINCT idfk AS id, c, d FROM t2 WHERE idfk = 1) AS x
                                    GROUP BY id)
               ) AS j2
            ON j2.id = d2.id
       ) AS x
  JOIN (SELECT DISTINCT idfk AS id, c, d FROM t2 WHERE idfk = 1) AS y
    ON x.c = y.c AND x.d = y.d
 GROUP BY x.id
HAVING COUNT(*) = (SELECT COUNT(*)
                     FROM (SELECT DISTINCT idfk AS id, c, d FROM t2 WHERE idfk = 1) AS x
                    GROUP BY id);
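I doubt this is the simplest formulation, but it is a logical extension of the previous revised answer.
Example run
This is the trace output from the queries, showing the steps in the development. The DBMS is IBM Informix Dynamic Server 11.70.FC2 running on Mac OS X 10.7.4, using SQLCMD v88.00 as the SQL command interpreter (no, not Microsoft's johnny-come-lately; the one I first wrote more than twenty years ago).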
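One possible simplification, offered as a sketch rather than as a tested Informix query: factor the repeated DISTINCT subqueries into common table expressions. The names `ref` and `other` are made up for illustration, the script uses Python's built-in `sqlite3` module instead of Informix, and whether your DBMS accepts the WITH clause is something you would need to check.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t2 (idfk INTEGER NOT NULL, c CHAR(1) NOT NULL, d CHAR(1) NOT NULL);
    -- Same T2 data as the example run, including the duplicate rows
    INSERT INTO t2 VALUES
        (1,'w','x'), (1,'y','z'),
        (2,'w','x'), (2,'w','x'), (2,'y','z'),
        (3,'w','x'), (3,'y','b'), (3,'y','z'),
        (4,'w','x'),
        (5,'w','x'), (5,'y','z'), (5,'w','x'), (5,'y','z');
""")

# ref   = distinct (c, d) pairs for ID = 1 (hypothetical name)
# other = distinct (id, c, d) rows for every other ID (hypothetical name)
QUERY_8B_CTE = """
WITH ref   AS (SELECT DISTINCT c, d FROM t2 WHERE idfk = 1),
     other AS (SELECT DISTINCT idfk AS id, c, d FROM t2 WHERE idfk != 1)
SELECT o.id
  FROM other AS o
  LEFT JOIN ref AS r ON r.c = o.c AND r.d = o.d
 GROUP BY o.id
HAVING COUNT(*)   = (SELECT COUNT(*) FROM ref)   -- same number of distinct rows
   AND COUNT(r.c) = (SELECT COUNT(*) FROM ref)   -- and every one of them matches
 ORDER BY o.id
"""

result = [row[0] for row in conn.execute(QUERY_8B_CTE)]
print(result)
```

The two HAVING conditions play the roles of Query 5B (same distinct row count) and the final count check in Query 8B (every row matches a row for ID = 1).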
+ BEGIN;
+ CREATE TABLE T1
(ID INTEGER NOT NULL PRIMARY KEY, a CHAR(1) NOT NULL, b CHAR(1) NOT NULL);
+ INSERT INTO T1 VALUES(1, 'k', 'l');
+ INSERT INTO T1 VALUES(2, 'k', 'l');
+ INSERT INTO T1 VALUES(3, 'a', 'b');
+ INSERT INTO T1 VALUES(4, 'p', 'q');
+ INSERT INTO T1 VALUES(5, 't', 'v');
+ CREATE TABLE T2
(IDFK INTEGER NOT NULL REFERENCES T1, c CHAR(1) NOT NULL, d CHAR(1) NOT NULL);
+ INSERT INTO T2 VALUES(1, 'w', 'x');
+ INSERT INTO T2 VALUES(1, 'y', 'z');
+ INSERT INTO T2 VALUES(2, 'w', 'x');
+ INSERT INTO T2 VALUES(2, 'w', 'x');
+ INSERT INTO T2 VALUES(2, 'y', 'z');
+ INSERT INTO T2 VALUES(3, 'w', 'x');
+ INSERT INTO T2 VALUES(3, 'y', 'b');
+ INSERT INTO T2 VALUES(3, 'y', 'z');
+ INSERT INTO T2 VALUES(4, 'w', 'x');
+ INSERT INTO T2 VALUES(5, 'w', 'x');
+ INSERT INTO T2 VALUES(5, 'y', 'z');
+ INSERT INTO T2 VALUES(5, 'w', 'x');
+ INSERT INTO T2 VALUES(5, 'y', 'z');
+ SELECT DISTINCT idfk AS id, c, d FROM t2 WHERE idfk != 1;
2|w|x
2|y|z
3|w|x
3|y|b
3|y|z
4|w|x
5|w|x
5|y|z
+ SELECT DISTINCT idfk AS id, c, d FROM t2 WHERE idfk = 1;
1|w|x
1|y|z
+ SELECT id, COUNT(*) FROM (SELECT DISTINCT idfk AS id, c, d FROM t2 WHERE idfk != 1) AS x GROUP BY id;
2|2
5|2
3|3
4|1
+ SELECT id, COUNT(*) FROM (SELECT DISTINCT idfk AS id, c, d FROM t2 WHERE idfk = 1) AS x GROUP BY id;
1|2
+ -- Query 5B - IDs having same count of distinct rows as ID = 1
SELECT id
FROM (SELECT DISTINCT idfk AS id, c, d FROM t2 WHERE idfk != 1) AS x
GROUP BY id
HAVING COUNT(*) = (SELECT COUNT(*)
FROM (SELECT DISTINCT idfk AS id, c, d FROM t2 WHERE idfk = 1) AS x
GROUP BY id);
2
5
+ -- Query 6B
SELECT d2.id, d2.c, d2.d
FROM (SELECT DISTINCT idfk AS id, c, d FROM t2 WHERE idfk != 1) AS d2
JOIN (SELECT id
FROM (SELECT DISTINCT idfk AS id, c, d FROM t2 WHERE idfk != 1) AS x
GROUP BY id
HAVING COUNT(*) = (SELECT COUNT(*)
FROM (SELECT DISTINCT idfk AS id, c, d FROM t2 WHERE idfk = 1) AS x
GROUP BY id)
) AS j2
ON j2.id = d2.id
ORDER BY id;
2|w|x
2|y|z
5|w|x
5|y|z
+ -- Query 7B
SELECT x.id, y.id, x.c, y.c, x.d, y.d
FROM (SELECT d2.id, d2.c, d2.d
FROM (SELECT DISTINCT idfk AS id, c, d FROM t2 WHERE idfk != 1) AS d2
JOIN (SELECT id
FROM (SELECT DISTINCT idfk AS id, c, d FROM t2 WHERE idfk != 1) AS x
GROUP BY id
HAVING COUNT(*) = (SELECT COUNT(*)
FROM (SELECT DISTINCT idfk AS id, c, d FROM t2 WHERE idfk = 1) AS x
GROUP BY id)
) AS j2
ON j2.id = d2.id
) AS x
JOIN (SELECT DISTINCT idfk AS id, c, d FROM t2 WHERE idfk = 1) AS y
ON x.c = y.c AND x.d = y.d
ORDER BY x.id, y.id, x.c, x.d;
2|1|w|w|x|x
2|1|y|y|z|z
5|1|w|w|x|x
5|1|y|y|z|z
+ -- Query 8B
SELECT x.id
FROM (SELECT d2.id, d2.c, d2.d
FROM (SELECT DISTINCT idfk AS id, c, d FROM t2 WHERE idfk != 1) AS d2
JOIN (SELECT id
FROM (SELECT DISTINCT idfk AS id, c, d FROM t2 WHERE idfk != 1) AS x
GROUP BY id
HAVING COUNT(*) = (SELECT COUNT(*)
FROM (SELECT DISTINCT idfk AS id, c, d FROM t2 WHERE idfk = 1) AS x
GROUP BY id)
) AS j2
ON j2.id = d2.id
) AS x
JOIN (SELECT DISTINCT idfk AS id, c, d FROM t2 WHERE idfk = 1) AS y
ON x.c = y.c AND x.d = y.d
GROUP BY x.id
HAVING COUNT(*) = (SELECT COUNT(*)
FROM (SELECT DISTINCT idfk AS id, c, d FROM t2 WHERE idfk = 1) AS x
GROUP BY id);
2
5
+ ROLLBACK;
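If you do not have Informix to hand, the final step of the trace can be replayed elsewhere. The following is a minimal sketch that runs Query 8B unchanged against the same T2 data, assuming Python's built-in `sqlite3` module as a stand-in for Informix and SQLCMD; T1 is omitted because Query 8B never references it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t2 (idfk INTEGER NOT NULL, c CHAR(1) NOT NULL, d CHAR(1) NOT NULL);
    INSERT INTO t2 VALUES
        (1,'w','x'), (1,'y','z'),
        (2,'w','x'), (2,'w','x'), (2,'y','z'),
        (3,'w','x'), (3,'y','b'), (3,'y','z'),
        (4,'w','x'),
        (5,'w','x'), (5,'y','z'), (5,'w','x'), (5,'y','z');
""")

QUERY_8B = """
SELECT x.id
  FROM (SELECT d2.id, d2.c, d2.d
          FROM (SELECT DISTINCT idfk AS id, c, d FROM t2 WHERE idfk != 1) AS d2
          JOIN (SELECT id
                  FROM (SELECT DISTINCT idfk AS id, c, d FROM t2 WHERE idfk != 1) AS x
                 GROUP BY id
                HAVING COUNT(*) = (SELECT COUNT(*)
                                     FROM (SELECT DISTINCT idfk AS id, c, d FROM t2 WHERE idfk = 1) AS x
                                    GROUP BY id)
               ) AS j2
            ON j2.id = d2.id
       ) AS x
  JOIN (SELECT DISTINCT idfk AS id, c, d FROM t2 WHERE idfk = 1) AS y
    ON x.c = y.c AND x.d = y.d
 GROUP BY x.id
HAVING COUNT(*) = (SELECT COUNT(*)
                     FROM (SELECT DISTINCT idfk AS id, c, d FROM t2 WHERE idfk = 1) AS x
                    GROUP BY id)
"""

# The query has no ORDER BY, so sort in Python before comparing with the trace
matching_ids = sorted(row[0] for row in conn.execute(QUERY_8B))
print(matching_ids)
```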
First revised answer
Step 1: IDs with the same number of rows as ID = 1
SELECT idfk AS id -- Query 5
FROM t2
WHERE idfk != 1
GROUP BY idfk
HAVING COUNT(*) = (SELECT COUNT(*) FROM t2 WHERE t2.idfk = 1);
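Query 5 counts raw rows, so it only works when T2 has no duplicate rows. Here is a small sketch of that limitation, assuming Python's built-in `sqlite3` module in place of Informix: the query finds ID 2 on the clean data, then loses it once a single duplicate row is added.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t2 (idfk INTEGER NOT NULL, c CHAR(1) NOT NULL, d CHAR(1) NOT NULL);
    INSERT INTO t2 VALUES
        (1,'w','x'), (1,'y','z'),
        (2,'w','x'), (2,'y','z'),
        (3,'w','x'), (3,'y','b'), (3,'y','z'),
        (4,'w','x');
""")

# Query 5, with an ORDER BY added so the output order is predictable
QUERY_5 = """
SELECT idfk AS id
  FROM t2
 WHERE idfk != 1
 GROUP BY idfk
HAVING COUNT(*) = (SELECT COUNT(*) FROM t2 WHERE t2.idfk = 1)
 ORDER BY idfk
"""

before = [row[0] for row in conn.execute(QUERY_5)]

# One duplicate row for ID 2 changes its raw row count from 2 to 3
conn.execute("INSERT INTO t2 VALUES (2, 'w', 'x')")
after = [row[0] for row in conn.execute(QUERY_5)]

print(before, after)
```

That is the case the DISTINCT subqueries in the second revised answer are there to handle.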
Step 2: The data corresponding to Query 5
SELECT idfk AS id, c, d     -- Query 6
  FROM t2
  JOIN (SELECT idfk AS id
          FROM t2
         WHERE idfk != 1
         GROUP BY idfk
        HAVING COUNT(*) = (SELECT COUNT(*) FROM t2 WHERE t2.idfk = 1)
       ) AS j2
    ON j2.id = t2.idfk
 ORDER BY id;
Step 3: Join the rows from Query 6 with the rows for ID = 1
SELECT x.id, y.id, x.c, y.c, x.d, y.d     -- Query 7
  FROM (SELECT idfk AS id, c, d
          FROM t2
          JOIN (SELECT idfk AS id
                  FROM t2
                 WHERE idfk != 1
                 GROUP BY idfk
                HAVING COUNT(*) = (SELECT COUNT(*) FROM t2 WHERE t2.idfk = 1)
               ) AS j2
            ON j2.id = t2.idfk
       ) AS x
  JOIN (SELECT idfk AS id, c, d
          FROM t2 WHERE idfk = 1
       ) AS y
    ON x.c = y.c AND x.d = y.d
 ORDER BY x.id, y.id, x.c, x.d;
Step 4: IDs from Query 7 where the count is the same as the count for ID = 1
SELECT x.id
  FROM (SELECT idfk AS id, c, d
          FROM t2
          JOIN (SELECT idfk AS id
                  FROM t2
                 WHERE idfk != 1
                 GROUP BY idfk
                HAVING COUNT(*) = (SELECT COUNT(*) FROM t2 WHERE t2.idfk = 1)
               ) AS j2
            ON j2.id = t2.idfk
       ) AS x
  JOIN (SELECT idfk AS id, c, d
          FROM t2 WHERE idfk = 1
       ) AS y
    ON x.c = y.c AND x.d = y.d
 GROUP BY x.id
HAVING COUNT(*) = (SELECT COUNT(*) FROM t2 WHERE t2.idfk = 1);
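Taken together, the four steps look for IDs whose set of (c, d) pairs is exactly the set belonging to ID = 1. As a cross-check outside the database, that can be sketched in plain Python; the function name `ids_matching` and the in-memory `rows` list are assumptions made for illustration, not part of the SQL above.

```python
from collections import defaultdict

# T2 rows as (idfk, c, d), matching the data used in the example run
rows = [
    (1, 'w', 'x'), (1, 'y', 'z'),
    (2, 'w', 'x'), (2, 'y', 'z'),
    (3, 'w', 'x'), (3, 'y', 'b'), (3, 'y', 'z'),
    (4, 'w', 'x'),
]

def ids_matching(rows, target_id):
    """Return the IDs whose set of (c, d) pairs equals that of target_id."""
    pairs_by_id = defaultdict(set)
    for idfk, c, d in rows:
        pairs_by_id[idfk].add((c, d))
    target = pairs_by_id.get(target_id, set())
    return sorted(i for i, pairs in pairs_by_id.items()
                  if i != target_id and pairs == target)

print(ids_matching(rows, 1))
```

Because Python sets discard duplicates, this sketch agrees with the Step 4 query only when T2 has no duplicate rows; the Step 4 query itself counts raw rows.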
Example run
The DBMS is IBM Informix Dynamic Server 11.70.FC2 running on Mac OS X 10.7.4, using SQLCMD v88.00 as the SQL command interpreter (no, not Microsoft's johnny-come-lately; the one I first wrote more than twenty years ago).
+ BEGIN;
+ CREATE TABLE T1
(ID INTEGER NOT NULL PRIMARY KEY, a CHAR(1) NOT NULL, b CHAR(1) NOT NULL);
+ INSERT INTO T1 VALUES(1, 'k', 'l');
+ INSERT INTO T1 VALUES(2, 'k', 'l');
+ INSERT INTO T1 VALUES(3, 'a', 'b');
+ INSERT INTO T1 VALUES(4, 'p', 'q');
+ CREATE TABLE T2
(IDFK INTEGER NOT NULL REFERENCES T1, c CHAR(1) NOT NULL, d CHAR(1) NOT NULL);
+ INSERT INTO T2 VALUES(1, 'w', 'x');
+ INSERT INTO T2 VALUES(1, 'y', 'z');
+ INSERT INTO T2 VALUES(2, 'w', 'x');
+ INSERT INTO T2 VALUES(2, 'y', 'z');
+ INSERT INTO T2 VALUES(3, 'w', 'x');
+ INSERT INTO T2 VALUES(3, 'y', 'b');
+ INSERT INTO T2 VALUES(3, 'y', 'z');
+ INSERT INTO T2 VALUES(4, 'w', 'x');
+ SELECT t1.id AS id, t2.c, t2.d -- Query 1
FROM t1
JOIN t2 ON t1.id = t2.idfk;
1|w|x
1|y|z
2|w|x
2|y|z
3|w|x
3|y|b
3|y|z
4|w|x
+ -- Query 5 - IDs having same count of rows as ID = 1
SELECT idfk AS id
FROM t2
WHERE idfk != 1
GROUP BY idfk
HAVING COUNT(*) = (SELECT COUNT(*) FROM t2 WHERE t2.idfk = 1);
2
+ SELECT idfk AS id, c, d
FROM t2
JOIN (SELECT idfk AS id
FROM t2
WHERE idfk != 1
GROUP BY idfk
HAVING COUNT(*) = (SELECT COUNT(*) FROM t2 WHERE t2.idfk = 1)
) AS j2
ON j2.id = t2.idfk
ORDER BY id;
2|w|x
2|y|z
+ SELECT x.id, y.id, x.c, y.c, x.d, y.d
FROM (SELECT idfk AS id, c, d
FROM t2
JOIN (SELECT idfk AS id
FROM t2
WHERE idfk != 1
GROUP BY idfk
HAVING COUNT(*) = (SELECT COUNT(*) FROM t2 WHERE t2.idfk = 1)
) AS j2
ON j2.id = t2.idfk
) AS x
JOIN (SELECT idfk AS id, c, d
FROM t2 WHERE idfk = 1
) AS y
ON x.c = y.c AND x.d = y.d
ORDER BY x.id, y.id, x.c, x.d;
2|1|w|w|x|x
2|1|y|y|z|z
+ SELECT x.id
FROM (SELECT idfk AS id, c, d
FROM t2
JOIN (SELECT idfk AS id
FROM t2
WHERE idfk != 1
GROUP BY idfk
HAVING COUNT(*) = (SELECT COUNT(*) FROM t2 WHERE t2.idfk = 1)
) AS j2
ON j2.id = t2.idfk
) AS x
JOIN (SELECT idfk AS id, c, d
FROM t2 WHERE idfk = 1
) AS y
ON x.c = y.c AND x.d = y.d
GROUP BY x.id
HAVING COUNT(*) = (SELECT COUNT(*) FROM t2 WHERE t2.idfk = 1);
2
+ ROLLBACK;
Original answer
This has at least elicited adequate clarification of the question.
As I understand it, if you have a subquery such as:
SELECT t1.id AS id, t2.c, t2.d -- Query 1
FROM t1
JOIN t2 ON t1.id = t2.idfk
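then you are looking for pairs of rows in the result set where the values of c and d are the same but the id values differ. So we write the main query on that basis: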
SELECT j1.id, j2.id     -- Query 2
  FROM (SELECT t1.id AS id, t2.c, t2.d
          FROM t1
          JOIN t2 ON t1.id = t2.idfk
       ) AS j1
  JOIN (SELECT t1.id AS id, t2.c, t2.d
          FROM t1
          JOIN t2 ON t1.id = t2.idfk
       ) AS j2
    ON j1.c = j2.c AND j1.d = j2.d AND j1.id != j2.id
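You can ensure that you do not get both '1, 2' and '2, 1' by changing the != condition to < or >.
If you want the rows that match a specific ID value in T1, you can specify it in a WHERE clause: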
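To see the effect of replacing != with <, here is a small sketch that runs Query 2 with both operators. It assumes Python's built-in `sqlite3` module in place of Informix, and the helper name `pairs` is made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t1 (id INTEGER NOT NULL PRIMARY KEY, a CHAR(1) NOT NULL, b CHAR(1) NOT NULL);
    CREATE TABLE t2 (idfk INTEGER NOT NULL, c CHAR(1) NOT NULL, d CHAR(1) NOT NULL);
    INSERT INTO t1 VALUES (1,'k','l'), (2,'k','l'), (3,'a','b');
    INSERT INTO t2 VALUES (1,'w','x'), (1,'y','z'),
                          (2,'w','x'), (2,'y','z'),
                          (3,'a','z'), (3,'y','b');
""")

# Query 2 with the comparison between the two IDs left as a placeholder
QUERY_2 = """
SELECT j1.id, j2.id
  FROM (SELECT t1.id AS id, t2.c, t2.d
          FROM t1
          JOIN t2 ON t1.id = t2.idfk
       ) AS j1
  JOIN (SELECT t1.id AS id, t2.c, t2.d
          FROM t1
          JOIN t2 ON t1.id = t2.idfk
       ) AS j2
    ON j1.c = j2.c AND j1.d = j2.d AND j1.id {op} j2.id
"""

def pairs(op):
    """Run Query 2 with the given comparison operator; return sorted ID pairs."""
    return sorted(conn.execute(QUERY_2.format(op=op)).fetchall())

print(pairs("!="))  # each pair appears in both orders
print(pairs("<"))   # only the ordering with the smaller ID first
```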
SELECT j2.id     -- Query 3
  FROM (SELECT t1.id AS id, t2.c, t2.d
          FROM t1
          JOIN t2 ON t1.id = t2.idfk
       ) AS j1
  JOIN (SELECT t1.id AS id, t2.c, t2.d
          FROM t1
          JOIN t2 ON t1.id = t2.idfk
       ) AS j2
    ON j1.c = j2.c AND j1.d = j2.d AND j1.id != j2.id
 WHERE j1.id = 1;     -- 1 is the ID for which matches are sought
If you wish, you can push the condition down into the subqueries (though a good optimizer would probably do that for you):
SELECT j2.id     -- Query 4
  FROM (SELECT t1.id AS id, t2.c, t2.d
          FROM t1
          JOIN t2 ON t1.id = t2.idfk AND t1.id = 1
       ) AS j1
  JOIN (SELECT t1.id AS id, t2.c, t2.d
          FROM t1
          JOIN t2 ON t1.id = t2.idfk AND t1.id != 1
       ) AS j2
    ON j1.c = j2.c AND j1.d = j2.d
 WHERE j1.id = 1;     -- 1 is the ID for which matches are sought
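The third condition in the main ON clause (j1.id != j2.id) is redundant here and has been dropped because, by construction, the ID values in the j1 subquery are all 1 and the ID values in the j2 subquery are all 'not 1'.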
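I fixed the t2.id vs t2.idfk problem in the SQL, and I have run the 4 queries above. Each produces the answer I expect. For example, Query 4 has two rows in its result set because there are two pairs of rows such that { 1, a, b } and { 2, a, b } are both present in T2. If you want the 2 to appear only once, even though there are many matching rows, then you need to apply DISTINCT to the SELECT.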
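A minimal sketch of the DISTINCT variant follows, again assuming Python's built-in `sqlite3` module rather than Informix. The only change to Query 4 is the optional DISTINCT keyword, supplied through a placeholder.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t1 (id INTEGER NOT NULL PRIMARY KEY, a CHAR(1) NOT NULL, b CHAR(1) NOT NULL);
    CREATE TABLE t2 (idfk INTEGER NOT NULL, c CHAR(1) NOT NULL, d CHAR(1) NOT NULL);
    INSERT INTO t1 VALUES (1,'k','l'), (2,'k','l'), (3,'a','b');
    INSERT INTO t2 VALUES (1,'w','x'), (1,'y','z'),
                          (2,'w','x'), (2,'y','z'),
                          (3,'a','z'), (3,'y','b');
""")

# Query 4; {distinct} is either empty or the keyword DISTINCT
QUERY_4 = """
SELECT {distinct} j2.id
  FROM (SELECT t1.id AS id, t2.c, t2.d
          FROM t1
          JOIN t2 ON t1.id = t2.idfk AND t1.id = 1
       ) AS j1
  JOIN (SELECT t1.id AS id, t2.c, t2.d
          FROM t1
          JOIN t2 ON t1.id = t2.idfk AND t1.id != 1
       ) AS j2
    ON j1.c = j2.c AND j1.d = j2.d
 WHERE j1.id = 1
"""

plain = [row[0] for row in conn.execute(QUERY_4.format(distinct=""))]
unique = [row[0] for row in conn.execute(QUERY_4.format(distinct="DISTINCT"))]
print(plain, unique)
```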
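In a comment, you say:
Unfortunately, it still returns a result even if one of the attributes doesn't match. How do I match every attribute in T2?
This needs an extended data set to demonstrate. When I add: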
INSERT INTO T1 VALUES(3, 'a', 'b');
INSERT INTO T2 VALUES(3, 'a', 'z');
INSERT INTO T2 VALUES(3, 'y', 'b');
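the value 3 appears only in the result from Query 1, which is the only place it should appear.
Please show the incorrect behaviour you are seeing, along with sample data. I tested the queries above using the SQL below, with the query results interleaved. The DBMS is IBM Informix Dynamic Server 11.70.FC2 running on Mac OS X 10.7.4, using SQLCMD v88.00 as the SQL command interpreter.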
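That claim can be checked with a short script. This is a sketch that assumes Python's built-in `sqlite3` module in place of Informix; it loads the extended data, then compares the IDs returned by Query 1 with those returned by Query 3.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t1 (id INTEGER NOT NULL PRIMARY KEY, a CHAR(1) NOT NULL, b CHAR(1) NOT NULL);
    CREATE TABLE t2 (idfk INTEGER NOT NULL, c CHAR(1) NOT NULL, d CHAR(1) NOT NULL);
    INSERT INTO t1 VALUES (1,'k','l'), (2,'k','l');
    INSERT INTO t2 VALUES (1,'w','x'), (1,'y','z'), (2,'w','x'), (2,'y','z');
    -- The extra rows for ID 3
    INSERT INTO t1 VALUES (3,'a','b');
    INSERT INTO t2 VALUES (3,'a','z');
    INSERT INTO t2 VALUES (3,'y','b');
""")

QUERY_1 = """
SELECT t1.id AS id, t2.c, t2.d
  FROM t1
  JOIN t2 ON t1.id = t2.idfk
"""

# Query 3 reuses Query 1 as both of its subqueries
QUERY_3 = f"""
SELECT j2.id
  FROM ({QUERY_1}) AS j1
  JOIN ({QUERY_1}) AS j2
    ON j1.c = j2.c AND j1.d = j2.d AND j1.id != j2.id
 WHERE j1.id = 1
"""

ids_in_query_1 = sorted({row[0] for row in conn.execute(QUERY_1)})
ids_in_query_3 = sorted(row[0] for row in conn.execute(QUERY_3))
print(ids_in_query_1, ids_in_query_3)
```

Neither of the rows for ID 3 shares both c and d with a row for ID 1, so the join drops them.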
+ BEGIN;
+ CREATE TEMP TABLE T1
(ID INTEGER NOT NULL PRIMARY KEY, A CHAR(1) NOT NULL, B CHAR(1) NOT NULL);
+ INSERT INTO T1 VALUES(1, 'k', 'l');
+ INSERT INTO T1 VALUES(2, 'k', 'l');
+ INSERT INTO T1 VALUES(3, 'a', 'b');
+ CREATE TEMP TABLE T2
(IDFK INTEGER NOT NULL, C CHAR(1) NOT NULL, D CHAR(1) NOT NULL);
+ INSERT INTO T2 VALUES(1, 'w', 'x');
+ INSERT INTO T2 VALUES(1, 'y', 'z');
+ INSERT INTO T2 VALUES(2, 'w', 'x');
+ INSERT INTO T2 VALUES(2, 'y', 'z');
+ INSERT INTO T2 VALUES(3, 'a', 'z');
+ INSERT INTO T2 VALUES(3, 'y', 'b');
+ SELECT t1.id AS id, t2.c, t2.d -- Query 1
FROM t1
JOIN t2 ON t1.id = t2.idfk;
1|w|x
1|y|z
2|w|x
2|y|z
3|a|z
3|y|b
+ SELECT j1.id, j2.id -- Query 2
FROM (SELECT t1.id AS id, t2.c, t2.d
FROM t1
JOIN t2 ON t1.id = t2.idfk
) AS j1
JOIN (SELECT t1.id AS id, t2.c, t2.d
FROM t1
JOIN t2 ON t1.id = t2.idfk
) AS j2
ON j1.c = j2.c AND j1.d = j2.d AND j1.id != j2.id;
1|2
1|2
2|1
2|1
+ SELECT j2.id -- Query 3
FROM (SELECT t1.id AS id, t2.c, t2.d
FROM t1
JOIN t2 ON t1.id = t2.idfk
) AS j1
JOIN (SELECT t1.id AS id, t2.c, t2.d
FROM t1
JOIN t2 ON t1.id = t2.idfk
) AS j2
ON j1.c = j2.c AND j1.d = j2.d AND j1.id != j2.id
WHERE j1.id = 1;
2
2
+ SELECT j2.id -- Query 4
FROM (SELECT t1.id AS id, t2.c, t2.d
FROM t1
JOIN t2 ON t1.id = t2.idfk AND t1.id = 1
) AS j1
JOIN (SELECT t1.id AS id, t2.c, t2.d
FROM t1
JOIN t2 ON t1.id = t2.idfk AND t1.id != 1
) AS j2
ON j1.c = j2.c AND j1.d = j2.d
WHERE j1.id = 1;
2
2
+ ROLLBACK;