How to compare groups of tuples in SQL. Consider the following example:

TABLE T1
--------
GROUP     VALUE
-----     -----
A         FOO
A         BAR
X         HHH
X         ZOO

TABLE T2
--------
GROUP     VALUE
-----     -----
B         ZOO
C         FOO
C         BAR

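I want to write a SQL query that compares the groups of values in the two tables and reports the differences. In the example shown, the group ((A,FOO),(A,BAR)) in table T1 is the same as the group ((C,FOO),(C,BAR)) in T2, even though the group names differ. What matters is that the contents of the groups are the same. Finally, the query would report that there is a difference: the (B,ZOO) tuple.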

RESULT
------
GROUP     VALUE
-----     -----
B         ZOO
X         HHH
X         ZOO

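Although group X in T1, which contains ZOO, has a matching value in T2, namely (B,ZOO), it is still not a match, because the group also contains the value (X,HHH), which is not part of the (B,ZOO) group in T2.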

2 Answers

Something like this:

create table t1 (group_id varchar2(20), value varchar2(20));
create table t2 (group_id varchar2(20), value varchar2(20));

insert into t1 values ('A','FOO');
insert into t1 values ('A','BAR');
insert into t1 values ('X','HHH');
insert into t1 values ('X','ZOO');
insert into t2 values ('C','FOO');
insert into t2 values ('C','BAR');
insert into t2 values ('B','ZOO');


select t1.group_id t1_group,t2.group_id t2_group, 
      --t1.all_val, t2.all_val, 
       case when t1.all_val = t2.all_val then 'match' else 'no match' end coll_match
from 
  (select 'T1' tab_id, group_id, collect(value) all_val, 
          min(value) min_val, max(value) max_val, count(distinct value) cnt_val 
  from t1 group by group_id) t1
full outer join
  (select 'T2' tab_id, group_id, collect(value) all_val, 
          min(value) min_val, max(value) max_val, count(distinct value) cnt_val 
  from t2 group by group_id) t2
on t1.min_val = t2.min_val and t1.max_val = t2.max_val and t1.cnt_val = t2.cnt_val
/

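I've done a preliminary elimination based on the minimum value, the maximum value, and the number of distinct values in each group, which helps when dealing with large datasets. If the datasets are small enough, you may not need them.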
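The pre-filter idea can also be illustrated outside the database. The following is a minimal Python sketch, not part of the query above; the helper names `signatures` and `candidate_pairs` are hypothetical. It computes the (min, max, distinct count) signature of each group and keeps only the pairs whose signatures agree, which are the only pairs that would still need a full comparison. Unlike the full outer join, it does not list groups that have no candidate at all.

```python
from collections import defaultdict

def signatures(rows):
    """Map each group to its (min value, max value, distinct count) signature."""
    groups = defaultdict(set)
    for grp, val in rows:
        groups[grp].add(val)
    return {g: (min(v), max(v), len(v)) for g, v in groups.items()}

def candidate_pairs(t1_rows, t2_rows):
    """Return (t1_group, t2_group) pairs whose cheap signatures agree."""
    s1, s2 = signatures(t1_rows), signatures(t2_rows)
    return sorted(
        (g1, g2)
        for g1, sig1 in s1.items()
        for g2, sig2 in s2.items()
        if sig1 == sig2
    )

# Sample data from the question
T1 = [("A", "FOO"), ("A", "BAR"), ("X", "HHH"), ("X", "ZOO")]
T2 = [("B", "ZOO"), ("C", "FOO"), ("C", "BAR")]

print(candidate_pairs(T1, T2))
```

On the sample data only the pair (A, C) survives the pre-filter, so X and B are ruled out without comparing their full contents.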

That tells you the matches. You just need to push it one extra step to find the groups that don't have any match:

select t1_group
from
(
  select t1.group_id t1_group,t2.group_id t2_group, 
        --t1.all_val, t2.all_val, 
         case when t1.all_val = t2.all_val then 'match' end coll_match
  from 
    (select 'T1' tab_id, group_id, collect(value) all_val
    from t1 group by group_id) t1
  cross join
    (select 'T2' tab_id, group_id, collect(value) all_val
    from t2 group by group_id) t2
)
group by t1_group
having min(coll_match) is null
/

select t2_group
from
(
  select t1.group_id t1_group,t2.group_id t2_group, 
        --t1.all_val, t2.all_val, 
         case when t1.all_val = t2.all_val then 'match' end coll_match
  from 
    (select 'T1' tab_id, group_id, collect(value) all_val
    from t1 group by group_id) t1
  cross join
    (select 'T2' tab_id, group_id, collect(value) all_val
    from t2 group by group_id) t2
)
group by t2_group
having min(coll_match) is null
/
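
Since COLLECT is Oracle-specific, here is a rough, portable cross-check of the same idea, run through Python's built-in `sqlite3` module. It is a sketch under assumptions, not a drop-in replacement: the column names `grp` and `val` and the function `unmatched_groups` are made up for illustration, and groups are compared as sets, so duplicate values inside a group are ignored. Two groups are treated as matching when they have the same number of distinct values and every value joins to the other group.

```python
import sqlite3

QUERY = """
WITH s1 AS (SELECT DISTINCT grp, val FROM t1),
     s2 AS (SELECT DISTINCT grp, val FROM t2),
     c1 AS (SELECT grp, COUNT(*) AS n FROM s1 GROUP BY grp),
     c2 AS (SELECT grp, COUNT(*) AS n FROM s2 GROUP BY grp),
     matched AS (
         -- pairs of groups with equal size whose values all join
         SELECT s1.grp AS g1, s2.grp AS g2
         FROM s1
         JOIN s2 ON s2.val = s1.val
         JOIN c1 ON c1.grp = s1.grp
         JOIN c2 ON c2.grp = s2.grp
         WHERE c1.n = c2.n
         GROUP BY s1.grp, s2.grp, c1.n
         HAVING COUNT(*) = c1.n
     )
SELECT grp, val FROM t1 WHERE grp NOT IN (SELECT g1 FROM matched)
UNION ALL
SELECT grp, val FROM t2 WHERE grp NOT IN (SELECT g2 FROM matched)
ORDER BY 1, 2
"""

def unmatched_groups(t1_rows, t2_rows):
    """Return the rows of every group that has no identical group in the other table."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE t1 (grp TEXT, val TEXT)")
        conn.execute("CREATE TABLE t2 (grp TEXT, val TEXT)")
        conn.executemany("INSERT INTO t1 VALUES (?, ?)", t1_rows)
        conn.executemany("INSERT INTO t2 VALUES (?, ?)", t2_rows)
        return conn.execute(QUERY).fetchall()
    finally:
        conn.close()

# Sample data from the question
T1 = [("A", "FOO"), ("A", "BAR"), ("X", "HHH"), ("X", "ZOO")]
T2 = [("B", "ZOO"), ("C", "FOO"), ("C", "BAR")]

for row in unmatched_groups(T1, T2):
    print(row)
```

On the question's data this prints (B, ZOO), (X, HHH), and (X, ZOO), which is the RESULT table the question asks for.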
answered 2009-12-03T22:12:58.467

The difference between T1 and T2 (the two tables) could be found like this:

SELECT
   T1.GROUPNAME,
   T1.VALUE
FROM 
   T1
LEFT JOIN T2
ON T2.Value = T1.Value
WHERE T2.GROUPNAME IS NULL

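For example, T1 contains:

Foo 100 Bar 200 ZZZ 333

T2 contains: Foo 100 Bar 200

The result of the query is ZZZ 333, which is the only record that does not match between the two tables. You can even change the group names in T2 to: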

XYZ 100 ZXZ 200

The result is still ZZZ 333. That is what you asked for. If you want the opposite direction as well, you can UNION it onto this query, or use a RIGHT join.
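
To try this out, the left anti-join can be run with Python's built-in `sqlite3` module. This is only a sketch: the column names `grp` and `val` (GROUP is a reserved word) and the function `values_missing_from_t2` are assumptions made for illustration.

```python
import sqlite3

def values_missing_from_t2(t1_rows, t2_rows):
    """Return the T1 rows whose value appears in no T2 row (left anti-join)."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE t1 (grp TEXT, val TEXT)")
        conn.execute("CREATE TABLE t2 (grp TEXT, val TEXT)")
        conn.executemany("INSERT INTO t1 VALUES (?, ?)", t1_rows)
        conn.executemany("INSERT INTO t2 VALUES (?, ?)", t2_rows)
        return conn.execute(
            """
            SELECT t1.grp, t1.val
            FROM t1
            LEFT JOIN t2 ON t2.val = t1.val
            WHERE t2.grp IS NULL
            ORDER BY t1.grp, t1.val
            """
        ).fetchall()
    finally:
        conn.close()

# The example from this answer
print(values_missing_from_t2(
    [("Foo", "100"), ("Bar", "200"), ("ZZZ", "333")],
    [("XYZ", "100"), ("ZXZ", "200")],
))

# The tables from the question
print(values_missing_from_t2(
    [("A", "FOO"), ("A", "BAR"), ("X", "HHH"), ("X", "ZOO")],
    [("B", "ZOO"), ("C", "FOO"), ("C", "BAR")],
))
```

The first call returns the ZZZ 333 row. On the question's tables the second call returns only (X, HHH), because the join compares individual values rather than whole groups.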

Jon

answered 2009-12-03T17:30:26.183