sql - How to compare groups of tuples in SQL

Question

How do I compare groups of tuples in SQL? Consider the following example:

TABLE T1
--------
GROUP     VALUE
-----     -----
A         FOO
A         BAR
X         HHH
X         ZOO

TABLE T2
--------
GROUP     VALUE
-----     -----
B         ZOO
C         FOO
C         BAR

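I want to write a SQL query that compares the groups of values in the two tables and reports the differences. In the example above, the group ((A,FOO),(A,BAR)) in T1 is the same as the group ((C,FOO),(C,BAR)) in T2, even though the group names differ. What matters is that the contents of the groups are the same. The query should then report the groups that differ, which include the (B,ZOO) tuple: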

RESULT
------
GROUP     VALUE
-----     -----
B         ZOO
X         HHH
X         ZOO

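Although group X in T1 contains ZOO, which has a matching value in T2, (B,ZOO), it is still not a match, because group X also contains (X,HHH), which is not part of the (B,ZOO) group in T2.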

score 1 · Accepted Answer

Something like this:

create table t1 (group_id varchar2(20), value varchar2(20));
create table t2 (group_id varchar2(20), value varchar2(20));

insert into t1 values ('A','FOO');
insert into t1 values ('A','BAR');
insert into t1 values ('X','HHH');
insert into t1 values ('X','ZOO');
insert into t2 values ('C','FOO');
insert into t2 values ('C','BAR');
insert into t2 values ('B','ZOO');


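-- Pair up groups whose min, max and distinct count agree,
-- then compare the collected values of each pair.
select t1.group_id t1_group,t2.group_id t2_group, 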
      --t1.all_val, t2.all_val, 
       case when t1.all_val = t2.all_val then 'match' else 'no match' end coll_match
from 
  (select 'T1' tab_id, group_id, collect(value) all_val, 
          min(value) min_val, max(value) max_val, count(distinct value) cnt_val 
  from t1 group by group_id) t1
full outer join
  (select 'T2' tab_id, group_id, collect(value) all_val, 
          min(value) min_val, max(value) max_val, count(distinct value) cnt_val 
  from t2 group by group_id) t2
on t1.min_val = t2.min_val and t1.max_val = t2.max_val and t1.cnt_val = t2.cnt_val
/

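I've done a preliminary elimination based on the minimum value, the maximum value, and the number of distinct values in each group, which helps with large data sets. If your data sets are small enough, you may not need it.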

That tells you the matches. You just need to push it one extra step to find the groups that don't have any match:

-- T1 groups that match no group in T2
select t1_group
from
(
  select t1.group_id t1_group,t2.group_id t2_group, 
        --t1.all_val, t2.all_val, 
         case when t1.all_val = t2.all_val then 'match' end coll_match
  from 
    (select 'T1' tab_id, group_id, collect(value) all_val
    from t1 group by group_id) t1
  cross join
    (select 'T2' tab_id, group_id, collect(value) all_val
    from t2 group by group_id) t2
)
group by t1_group
having min(coll_match) is null
/

-- T2 groups that match no group in T1
select t2_group
from
(
  select t1.group_id t1_group,t2.group_id t2_group, 
        --t1.all_val, t2.all_val, 
         case when t1.all_val = t2.all_val then 'match' end coll_match
  from 
    (select 'T1' tab_id, group_id, collect(value) all_val
    from t1 group by group_id) t1
  cross join
    (select 'T2' tab_id, group_id, collect(value) all_val
    from t2 group by group_id) t2
)
group by t2_group
having min(coll_match) is null
/
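The two queries above return only group names. As a rough sketch of how to get the rows in the shape shown in the question, one option is to build a sorted "signature" string per group and compare those instead. This is not from the answer itself: it assumes Oracle 11gR2 or later for LISTAGG, that the `|` separator never occurs inside a value, and that each group's concatenated values fit within the VARCHAR2 length limit. The names `s1`, `s2` and `sig` are made up for illustration.

```sql
-- Sketch only: one signature per group, e.g. 'BAR|FOO' for groups A and C.
-- Assumes '|' never appears in a value and each signature fits in a VARCHAR2.
with s1 as (
  select group_id,
         listagg(value, '|') within group (order by value) as sig
  from t1
  group by group_id
),
s2 as (
  select group_id,
         listagg(value, '|') within group (order by value) as sig
  from t2
  group by group_id
)
-- rows of T1 groups whose signature appears nowhere in T2
select t1.group_id, t1.value
from t1
join s1 on s1.group_id = t1.group_id
where not exists (select 1 from s2 where s2.sig = s1.sig)
union all
-- rows of T2 groups whose signature appears nowhere in T1
select t2.group_id, t2.value
from t2
join s2 on s2.group_id = t2.group_id
where not exists (select 1 from s1 where s1.sig = s2.sig)
order by 1, 2
/
```

With the sample rows inserted above, groups A and C share the signature `BAR|FOO`, so only the rows of groups B and X should come back. Duplicate values inside a group are kept in the signature, so groups are compared as multisets, much like the COLLECT comparison.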

score 0 · Accepted Answer

The difference between T1 and T2 (the two tables) could be found like this:

SELECT
   T1.GROUPNAME,
   T1.VALUE
FROM 
   T1
LEFT JOIN T2
ON T2.Value = T1.Value
WHERE T2.GROUPNAME IS NULL

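For example, T1 contains:

Foo 100
Bar 200
ZZZ 333

T2 contains:

Foo 100
Bar 200

The result of this query is ZZZ 333, which is the only record that does not match between the two tables. You could even change the group names in T2 to:

XYZ 100
ZXZ 200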

The result would still be ZZZ 333. That is what you asked for. If you want the opposite direction as well, you can UNION it onto this query, or use a RIGHT JOIN.
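A minimal sketch of the UNION variant mentioned here, covering both directions in one statement. It keeps the column names used in this answer (GROUPNAME, VALUE), which differ from the GROUP column shown in the question, and it assumes GROUPNAME is never NULL, since the IS NULL test is what detects a missing join partner.

```sql
-- Sketch only: values present in one table but absent from the other.
-- GROUPNAME / VALUE follow this answer's naming; adjust to your schema.
SELECT T1.GROUPNAME, T1.VALUE
FROM T1
LEFT JOIN T2 ON T2.VALUE = T1.VALUE
WHERE T2.GROUPNAME IS NULL      -- no row in T2 carries this value
UNION ALL
SELECT T2.GROUPNAME, T2.VALUE
FROM T2
LEFT JOIN T1 ON T1.VALUE = T2.VALUE
WHERE T1.GROUPNAME IS NULL      -- no row in T1 carries this value
```

Note that this compares individual values rather than whole groups. On the question's sample data it would return only (X,HHH), because ZOO occurs in both tables.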

Jon
