I have a table like this:
ID, ItemsID
1 2
1 3
1 4
2 3
2 4
2 2
I want to delete duplicate groups such as ID=2, because in my case 2,3,4 is the same tuple as 3,4,2.
How can I do this in SQL?
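Sorry, I didn't notice the Oracle tag in time. I'll leave the MySQL solution here for reference anyway. Apparently some Oracle versions have something similar to GROUP_CONCAT().
It may not be the most elegant solution, but it gets the job done: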
DELETE FROM t WHERE ID IN (
    SELECT ID
    FROM (SELECT ID, GROUP_CONCAT(ItemsID ORDER BY ItemsID) AS tuple FROM t GROUP BY ID) AS tuples
    WHERE EXISTS (
        SELECT TRUE
        FROM (SELECT ID, GROUP_CONCAT(ItemsID ORDER BY ItemsID) AS tuple FROM t GROUP BY ID) tuples2
        WHERE tuples2.tuple = tuples.tuple
          AND tuples2.ID < tuples.ID
    )
);
You may need to increase group_concat_max_len (1024 by default) if an ID has many items.
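For Oracle, which the question is tagged with, the same idea could be sketched with LISTAGG. This is only a sketch, assuming Oracle 11g Release 2 or later and the same table t(ID, ItemsID); the names item_list and keep_id are made up for illustration. The concatenated string is limited by the maximum VARCHAR2 length, so very large groups would need a different approach.

```sql
-- Sketch only: assumes Oracle 11gR2+ and a table t(ID, ItemsID).
-- item_list and keep_id are illustrative names, not from the original answer.
DELETE FROM t
WHERE ID IN (
    SELECT ID
    FROM (
        -- For every distinct item list, remember the smallest ID that has it.
        SELECT ID,
               MIN(ID) OVER (PARTITION BY item_list) AS keep_id
        FROM (
            -- One row per ID with its items as a sorted, comma-separated string.
            SELECT ID,
                   LISTAGG(ItemsID, ',') WITHIN GROUP (ORDER BY ItemsID) AS item_list
            FROM t
            GROUP BY ID
        )
    )
    -- Every ID that is not the smallest one for its item list is a duplicate.
    WHERE ID <> keep_id
);
```

Using MIN(ID) per item list instead of a self-join keeps the lowest ID of each group, matching what the MySQL statement does.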
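Another approach, for Oracle 10gR1 or later. In this example, IDs 1, 2 and 6 have the same set of ItemsID values, ID 3 has a different set, and IDs 4 and 5 share yet another. So we are going to delete IDs 2, 6 and 5, since they are duplicates. Each ID's group of ItemsID values is represented as a nested table, and the multiset except operator is used to determine whether two groups contain the same elements: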
-- set-up
SQL> create table tb_table(
2 id number,
3 itemsid number);
Table created
SQL> insert into tb_table(id, itemsid)
2 select 1, 2 from dual union all
3 select 1, 3 from dual union all
4 select 1, 4 from dual union all
5 select 2, 4 from dual union all
6 select 2, 3 from dual union all
7 select 2, 2 from dual union all
8 select 3, 2 from dual union all
9 select 3, 3 from dual union all
10 select 3, 6 from dual union all
11 select 3, 4 from dual union all
12 select 4, 1 from dual union all
13 select 4, 2 from dual union all
14 select 4, 3 from dual union all
15 select 5, 1 from dual union all
16 select 5, 2 from dual union all
17 select 5, 3 from dual union all
18 select 6, 2 from dual union all
19 select 6, 4 from dual union all
20 select 6, 3 from dual;
19 rows inserted
SQL> commit;
Commit complete
SQL> create or replace type t_numbers as table of number;
2 /
Type created
-- contents of the table
SQL> select *
2 from tb_table;
ID ITEMSID
---------- ----------
1 2
1 3
1 4
2 4
2 3
2 2
3 2
3 3
3 6
3 4
4 1
4 2
4 3
5 1
5 2
5 3
6 2
6 4
6 3
19 rows selected
SQL> delete from tb_table
2 where id in (with DataGroups as(
3 select id
4 , grp
5 , (select count(*) from table(grp)) cnt
6 from (select id
7 , cast(collect(itemsid) as t_numbers) grp
8 from tb_table
9 group by id
10 )
11 )
12 select distinct id2
13 from ( select dg1.id as id1
14 , dg2.id as id2
15 , (dg1.grp multiset except dg2.grp) res
16 , dg1.cnt
17 from DataGroups Dg1
18 cross join DataGroups Dg2
19 where dg1.cnt = dg2.cnt
20 order by dg1.id
21 ) t
22 where res is empty
23 and id2 > id1
24 )
25 ;
9 rows deleted
SQL> select *
2 from tb_table;
ID ITEMSID
---------- ----------
1 2
1 3
1 4
3 2
3 3
3 6
3 4
4 1
4 2
4 3
10 rows selected
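To see why the statement above compares the element counts as well as the result of multiset except, the check can be tried in isolation on literal collections. This is a sketch that assumes the t_numbers type created in the set-up; the column alias verdict is made up for illustration.

```sql
-- Sketch only: relies on the t_numbers nested table type from the set-up.

-- Same elements in a different order: nothing should be left over.
select case when res is empty then 'nothing left over'
            else 'elements left over'
       end as verdict
  from (select t_numbers(2, 3, 4) multiset except t_numbers(3, 4, 2) as res
          from dual);

-- A proper subset also leaves nothing over, although the groups differ.
-- This is the case the cnt comparison in the delete statement rules out.
select case when res is empty then 'nothing left over'
            else 'elements left over'
       end as verdict
  from (select t_numbers(2, 3) multiset except t_numbers(2, 3, 4) as res
          from dual);
```

An empty difference only shows that the first group is contained in the second, so the two groups are treated as identical only when their counts match too.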
This is the best I could come up with, but somehow I feel there must be a simpler solution:
delete from items
where id in (
  select id
  from (
    with counts as (
      select id,
             count(*) as cnt
      from items
      group by id
    )
    select c1.id, row_number() over (order by c1.id) as rn
    from counts c1
      join counts c2
        on c1.id <> c2.id and c1.cnt = c2.cnt
       and not exists (select i1.itemsid
                       from items i1
                       where i1.id = c1.id
                       minus
                       select i2.itemsid
                       from items i2
                       where i2.id = c2.id)
  ) t
  where rn <> 1
);
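It works for any number of itemsid values.
Combining rn <> 1 with the ascending sort in the window definition keeps the smallest id in the table (1 in your case). If you want to keep the highest ID value instead, change the sort order to over (order by c1.id desc). Note that rn is numbered across the whole join result rather than per group of duplicates, so the statement assumes a single pair of duplicate ids, as in your sample data.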
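If the table can hold several groups of duplicates, or more than two ids sharing the same items, a per-pair comparison may be easier to reason about than a global row number. The following is a sketch of that variation, not the statement above; it assumes the same items(id, itemsid) table and that each (id, itemsid) pair occurs only once.

```sql
-- Sketch only: assumes items(id, itemsid) with no repeated (id, itemsid) pair.
delete from items
where id in (
  with counts as (
    select id,
           count(*) as cnt
    from items
    group by id
  )
  select c1.id
  from counts c1
    join counts c2
      on c2.id < c1.id       -- a smaller id exists ...
     and c2.cnt = c1.cnt     -- ... with the same number of items ...
  where not exists (         -- ... and no item of c1 is missing from c2
          select i1.itemsid
          from items i1
          where i1.id = c1.id
          minus
          select i2.itemsid
          from items i2
          where i2.id = c2.id
        )
);
```

An id is removed whenever some smaller id has exactly the same items, so the smallest id of every duplicate group stays. To keep the highest id of each group instead, the comparison could be flipped to c2.id > c1.id.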