
I want to delete random data from my table and cut 90% of the rows (the table is very large and I only need a sample). The table looks like this:

ID    |Trans_No    |Doctor_ID |Trans_Type                    |PM |Cost
12340 |10.853329   |          |ADMINISTRASI                  |   |0.00
12341 |10.853329   |1004      |JASA MEDIS                    |   |25000.00
12342 |10.853329   |          |OBAT RESEP FARMASI NO : 177   |F  |2000.00
12343 |10.836033   |          |ADMINISTRASI                  |   |0.00
12344 |10.836033   |1001      |JASA MEDIS                    |   |25000.00
12345 |10.836033   |          |OBAT RESEP FARMASI NO : 317   |F  |0.00
12346 |10.836032   |          |ADMINISTRASI                  |   |0.00
12347 |10.836032   |1004      |JASA MEDIS                    |   |25000.00
12348 |10.836032   |          |PEMERIKSAAN RADIOLOGI NO 092.1|R  |15000.00
12349 |10.836034   |1064      |JASA MEDIS                    |   |25000.00
12350 |10.836034   |          |PEMERIKSAAN RADIOLOGI NO 093.1|R  |20000.00

I thought this query would work:

DELETE FROM my_table WHERE RAND() <= 0.9

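But as you can see, some rows share the same trans_no. If one row with a given trans_no is deleted, every other row with that trans_no should be deleted too. Is there a query that can do this?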


2 Answers


You should first select the trans_no values that meet the condition and then delete every row that has one of them, like this:

DELETE FROM my_table
WHERE trans_no IN (
    SELECT trans_no
    FROM (SELECT DISTINCT trans_no FROM my_table) x
    WHERE rand() <= 0.9
)
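
As a rough check of the idea, here is a minimal sketch that runs the same statement against an in-memory SQLite database from Python. Several things here are assumptions and not part of the original question: SQLite stands in for MySQL, rand() is registered by hand because SQLite has no RAND(), and the rows are made-up sample data. The point is only to show that each trans_no is either removed completely or kept completely.

```python
import random
import sqlite3
from collections import Counter


def build_table(conn, n_trans=200, seed=0):
    """Fill my_table with made-up rows: 1 to 3 rows per trans_no."""
    rng = random.Random(seed)
    conn.execute("CREATE TABLE my_table (id INTEGER PRIMARY KEY, trans_no TEXT)")
    rows = []
    for t in range(n_trans):
        for _ in range(rng.randint(1, 3)):
            rows.append((f"10.{800000 + t}",))
    conn.executemany("INSERT INTO my_table (trans_no) VALUES (?)", rows)


def rows_per_trans(conn):
    """Map each trans_no to the number of rows that currently carry it."""
    return Counter(t for (t,) in conn.execute("SELECT trans_no FROM my_table"))


def delete_whole_transactions(conn, fraction=0.9):
    """Draw one random number per distinct trans_no, then delete by trans_no."""
    conn.execute(
        """
        DELETE FROM my_table
        WHERE trans_no IN (
            SELECT trans_no
            FROM (SELECT DISTINCT trans_no FROM my_table) x
            WHERE rand() <= ?
        )
        """,
        (fraction,),
    )


random.seed(42)
conn = sqlite3.connect(":memory:")
# Assumption: emulate MySQL's RAND() with a user-defined SQLite function.
conn.create_function("rand", 0, random.random)

build_table(conn)
before = rows_per_trans(conn)
delete_whole_transactions(conn)
after = rows_per_trans(conn)

# A transaction is "partial" if it survived with fewer rows than it started with.
partial = [t for t, n in after.items() if n != before[t]]
print(len(before), "transactions before,", len(after), "after,",
      len(partial), "partially deleted")
```

Because the random draw happens once per distinct trans_no, the share of rows removed is only approximately 90% and depends on how many rows each transaction has.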
answered 2013-01-02T05:37:16.130

You have to use an inner query (subquery) for this.

First, you have to find the trans_no values to delete:

    SELECT 
            trans_no
    FROM 
           (SELECT DISTINCT trans_no FROM my_table) AS derived_table
    WHERE rand() <= 0.9

Then you need a separate delete query like this:

DELETE FROM my_table WHERE trans_no IN (Put the above query here)

Finally, the complete query:

DELETE FROM my_table WHERE  trans_no IN (
        SELECT 
                trans_no
        FROM 
               (SELECT DISTINCT trans_no FROM my_table) as derived_table
        WHERE rand() <= 0.9
)
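
If you would rather keep the two steps separate, one possible variant is to store the chosen trans_no values first and delete afterwards, so the list can be inspected or counted before any row is removed. The sketch below is an illustration under stated assumptions, not a tested MySQL recipe: it uses Python with an in-memory SQLite database as a stand-in, registers rand() by hand, fills my_table with made-up rows, and uses a hypothetical temporary table named to_delete.

```python
import random
import sqlite3

random.seed(7)
conn = sqlite3.connect(":memory:")
# Assumption: emulate MySQL's RAND() with a user-defined SQLite function.
conn.create_function("rand", 0, random.random)

conn.execute("CREATE TABLE my_table (id INTEGER PRIMARY KEY, trans_no TEXT)")
conn.executemany(
    "INSERT INTO my_table (trans_no) VALUES (?)",
    [(f"T{n // 3}",) for n in range(300)],  # made-up: 100 transactions, 3 rows each
)

# Step 1: pick the trans_no values once and store them (to_delete is hypothetical).
conn.execute(
    """
    CREATE TEMP TABLE to_delete AS
    SELECT trans_no
    FROM (SELECT DISTINCT trans_no FROM my_table) AS derived_table
    WHERE rand() <= 0.9
    """
)
chosen = {t for (t,) in conn.execute("SELECT trans_no FROM to_delete")}

# Step 2: delete every row whose trans_no was picked in step 1.
conn.execute(
    "DELETE FROM my_table WHERE trans_no IN (SELECT trans_no FROM to_delete)"
)

remaining = [t for (t,) in conn.execute("SELECT trans_no FROM my_table")]
print(len(chosen), "transactions deleted,", len(set(remaining)), "kept")
```

Since the selection is stored before the delete runs, the random choice is made exactly once, and the surviving transactions keep all of their rows.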
answered 2013-01-02T05:49:37.607