9

I have a table with 3 columns (plus a name column), as shown below:

one   |   two    |  three  |   name
------------------------------------
 A1       B1          C1        xyz
 A1       B1          C1        pqr      -> should be deleted
 A1       B1          C1        lmn      -> should be deleted
 A2       B2          C2        abc
 A2       B2          C2        def      -> should be deleted
 A3       B3          C3        ghi
------------------------------------ 

The table has no primary key column. I have no control over the table, so I cannot add one.

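As shown, I want to delete the rows in which the combination of columns one, two, and three is repeated. So if A1B1C1 appears three times (as in the example above), two of those rows should be deleted and only one kept.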

How can I do this with a single query in DB2?

It has to be a single query because I will run it from a Java program.


7 Answers

23

(Assuming you are using DB2 for Linux/Unix/Windows; other platforms may differ slightly.)

DELETE FROM
    (SELECT ROWNUMBER() OVER (PARTITION BY ONE, TWO, THREE) AS RN
     FROM SESSION.TEST) AS A
WHERE RN > 1;

This should get you what you are looking for.

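The query uses the OLAP function ROWNUMBER() to assign a number to each row within each ONE, TWO, THREE combination. DB2 is then able to match the rows referenced by the fullselect (A) as the rows that the DELETE statement should remove from the table. To be usable as the target of a DELETE, the fullselect has to satisfy the rules for a deletable view (see "deletable view" under the Notes section).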
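To see what the fullselect produces before anything is deleted, the same expression can be run as a plain SELECT. This is only a preview sketch against the SESSION.TEST table used in this answer; the extra columns and the final ORDER BY are added here for readability and are not part of the original statement.

```sql
-- Preview only: nothing is deleted.
-- Rows that show RN > 1 are the ones the DELETE would remove.
SELECT ONE, TWO, THREE, NAME,
       ROWNUMBER() OVER (PARTITION BY ONE, TWO, THREE) AS RN
FROM SESSION.TEST
ORDER BY ONE, TWO, THREE, RN;
```

Because the OVER clause has no ORDER BY, which row in each group gets RN = 1 is up to DB2, so the preview may not match the order in which the rows were inserted.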

Here is some proof (tested on LUW 9.7):

DECLARE GLOBAL TEMPORARY TABLE SESSION.TEST (
    one CHAR(2),
    two CHAR(2),
    three CHAR(2),
    name CHAR(3)
) ON COMMIT PRESERVE ROWS;

INSERT INTO SESSION.TEST VALUES 
    ('A1', 'B1', 'C1', 'xyz'),
    ('A1', 'B1', 'C1', 'pqr'),
    ('A1', 'B1', 'C1', 'lmn'),
    ('A2', 'B2', 'C2', 'abc'),
    ('A2', 'B2', 'C2', 'def'),
    ('A3', 'B3', 'C3', 'ghi');

DELETE FROM
    (SELECT ROWNUMBER() OVER (PARTITION BY ONE, TWO, THREE) AS RN
     FROM SESSION.TEST) AS A
WHERE RN > 1;

SELECT * FROM SESSION.TEST;
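The statement above leaves one row per combination, but it does not say which one. If the surviving row matters, an ORDER BY can be added inside the OVER clause. The following is an untested sketch, and ordering by NAME is only an illustrative assumption; any column or expression that ranks the rows would do.

```sql
-- Keep the row with the lowest NAME in each (ONE, TWO, THREE) group
-- and delete the rest. Ordering by NAME is an illustrative choice.
DELETE FROM
    (SELECT ROWNUMBER() OVER (PARTITION BY ONE, TWO, THREE
                              ORDER BY NAME) AS RN
     FROM SESSION.TEST) AS A
WHERE RN > 1;
```

On the sample data this should keep lmn, abc, and ghi, since those sort first within their groups.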

Edit, March 2, 2017:

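In response to Ahmed Anwar's question: if you need to capture what was deleted, you can also combine the delete with a "data change statement". In this example, you could do something like the following, which would give you the "rn", one, two, and three columns of the deleted rows: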

SELECT * FROM OLD TABLE (
    DELETE FROM
        (SELECT 
             ROWNUMBER() OVER (PARTITION BY ONE, TWO, THREE) AS RN
            ,ONE
            ,TWO
            ,THREE
         FROM SESSION.TEST) AS A
    WHERE RN > 1
) OLD;
answered 2012-04-10T13:03:11.197
2
DELETE FROM the_table tt
WHERE EXISTS ( SELECT *
    FROM the_table ex
    WHERE ex.one = tt.one
    AND ex.two = tt.two
    AND ex.three = tt.three
    AND ex.zname < tt.zname -- tie-breaker...
    );

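Note: your SQL dialect may vary. Note 2: "name" is a reserved word on some platforms, so it is best avoided (the query above uses zname instead).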
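One caveat with the tie-breaker: it can only separate rows whose zname differs. Rows that are identical in all four columns are never "less than" each other, so if they share the smallest zname of their group they all survive. A quick check, sketched here against the same placeholder the_table, lists such fully identical rows.

```sql
-- Lists rows that are duplicated in every column, including zname.
-- The zname tie-breaker cannot tell these rows apart.
SELECT one, two, three, zname, COUNT(*) AS copies
FROM the_table
GROUP BY one, two, three, zname
HAVING COUNT(*) > 1;
```

If this returns nothing, the DELETE above is enough; otherwise a row identifier such as a row number is needed to break the tie.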

answered 2012-04-10T11:30:30.973
1

A variant of @a_horse_with_no_name's answer for DB2 for iSeries, without the GROUP BY clause and the IN clause. It does work.

-- Keeps the row with the highest relative record number in each group.
DELETE from the_table a 
where rrn(a) < (
select max(rrn(b)) from the_table b 
where a.one = b.one and a.two = b.two and a.three = b.three
)
answered 2016-04-26T22:16:39.200
0

This is a variant of levenlevi's answer that does not require a primary key on the table (I cannot test the syntax right now):

DELETE FROM the_table
WHERE  rid_bit(the_table) NOT IN (SELECT MAX(rid_bit(the_table))
                                  FROM the_table
                                  GROUP BY one,two,three)

I don't think rid_bit() is supported on iSeries, but rrn() serves the same purpose there.
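If rid_bit() is indeed unavailable on iSeries, the same statement could presumably be written with rrn() instead. This is an untested sketch; the correlation names a and b are added here only so that RRN() has a table designator to refer to.

```sql
-- Untested iSeries sketch: keep the row with the highest relative
-- record number in each (one, two, three) group, delete the others.
DELETE FROM the_table a
WHERE RRN(a) NOT IN (SELECT MAX(RRN(b))
                     FROM the_table b
                     GROUP BY b.one, b.two, b.three)
```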

answered 2012-04-10T13:11:43.807
0
Please take a backup of the table before deleting the data.

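-- Keeps the row with the smallest name per (one, two, three) group.
-- Assumes name values are unique across the table.
Delete from the_table
where name not in (select min(name) from the_table
group by one,two,three)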

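Before deleting, you can list the combinations that occur more than once (DELETE itself does not accept GROUP BY):

     SELECT one, two, three, count(*) from the_table Group by one,two,three Having count(*) > 1; 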
answered 2012-04-10T11:10:48.320
0
DELETE  FROM Table_Name
WHERE   Table_Name_ID NOT IN ( SELECT  MAX(Table_Name_ID)
                                    FROM    Table_Name
                                    GROUP BY one ,
                                             two, 
                                             three )

one, two, and three are your duplicated columns, and Table_Name_ID is the PK.

answered 2012-04-10T11:11:36.283
0

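For anyone else on a very old version of DB2 SQL: a combination of these posts helped me identify and delete duplicate data in 2 batches that had been posted twice.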

SELECT   * FROM     LIBRARY.TABLE a
WHERE    a.batch in (115131, 115287)
AND      EXISTS ( SELECT 1 from LIBRARY.TABLE d 
    WHERE d.batch in (115131, 115287)
     AND a.one = d.one AND a.two = d.two AND a.three = d.three 
    GROUP BY d.one, d.two, d.three 
    HAVING count(*) <> 1 )

    AND RRN(a) > (SELECT MIN(RRN(b)) FROM LIBRARY.TABLE b 
        WHERE b.batch in (115131, 115287)
        AND a.one = b.one AND a.two = b.two AND a.three = b.three );
answered 2018-01-31T20:40:40.490