
I have a query that takes several hours to run, and sometimes it never finishes at all. The query is:

SELECT id, trim(regexp_substr(str, '[^,]+', 1, LEVEL)) str
FROM (SELECT id , MULTILIST01 as str from PAGE_TWO where MULTILIST01 is not null )
WHERE trim(regexp_substr(str, '[^,]+', 1, LEVEL)) is not null
CONNECT BY instr(str, ',', 1, LEVEL -1) > 0;

The inner query

SELECT id , MULTILIST01 as str from PAGE_TWO where MULTILIST01 is not null

returns a result set like this:

ID       MULTILIST01 
295285  ,3434925,3434442,3436781,
212117  ,3434925,3434442,3436781,
212120  ,3434925,3434442,3436781,
6031650 ,3436781,
.
.
.

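In the outer query I am trying to turn each comma-separated value into a row of its own. When I run the outer query, it takes hours to execute. I have tried to optimize it, without success.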

Any idea how to optimize it?

Oracle version information

Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production
PL/SQL Release 12.1.0.2.0 - Production
CORE    12.1.0.2.0  Production
TNS for 64-bit Windows: Version 12.1.0.2.0 - Production
NLSRTL Version 12.1.0.2.0 - Production

Explain plan information

Plan hash value: 4097679000

------------------------------------------------------------------------------------------
| Id  | Operation                     | Name     | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT              |          |  1429 | 11432 |   840   (2)| 00:00:01 |
|*  1 |  FILTER                       |          |       |       |            |          |
|*  2 |   CONNECT BY WITHOUT FILTERING|          |       |       |            |          |
|*  3 |    TABLE ACCESS FULL          | PAGE_TWO |  1429 | 11432 |   840   (2)| 00:00:01 |
------------------------------------------------------------------------------------------

Query Block Name / Object Alias (identified by operation id):
-------------------------------------------------------------

   1 - SEL$F5BB74E1
   3 - SEL$F5BB74E1 / PAGE_TWO@SEL$2

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter(TRIM( REGEXP_SUBSTR ("MULTILIST01",'[^,]+',1,LEVEL)) IS NOT NULL)
   2 - filter(INSTR("MULTILIST01",',',1,LEVEL-1)>0)
   3 - filter("MULTILIST01" IS NOT NULL)

Column Projection Information (identified by operation id):
-----------------------------------------------------------

   1 - "ID"[NUMBER,22], "MULTILIST01"[VARCHAR2,1020], LEVEL[4]
   2 - "ID"[NUMBER,22], "MULTILIST01"[VARCHAR2,1020], LEVEL[4]
   3 - "ID"[NUMBER,22], "MULTILIST01"[VARCHAR2,1020]

The table has 225 columns, and its only index is on the primary key columns (ID, CLASS).

This table belongs to Agile PLM.


1 Answer


Your approach works only for a table with a single row.

SELECT id, trim(regexp_substr(str, '[^,]+', 1, LEVEL)) str
FROM (SELECT id , MULTILIST01 as str from PAGE_TWO where MULTILIST01 is not null and rownum <= 1 )
WHERE trim(regexp_substr(str, '[^,]+', 1, LEVEL)) is not null
CONNECT BY instr(str, ',', 1, LEVEL -1) > 0
order by 1,2;

        ID STR                     
---------- -------------------------
    295285 3434442                   
    295285 3434925                   
    295285 3436781  

With a two-row table you already get (probably) more rows than you expected:

SELECT id, trim(regexp_substr(str, '[^,]+', 1, LEVEL)) str
FROM (SELECT id , MULTILIST01 as str from PAGE_TWO where MULTILIST01 is not null and rownum <= 2 )
WHERE trim(regexp_substr(str, '[^,]+', 1, LEVEL)) is not null
CONNECT BY instr(str, ',', 1, LEVEL -1) > 0
order by 1,2;   

        ID STR                     
---------- -------------------------
    212117 3434442                   
    212117 3434442                   
    212117 3434925                   
    212117 3436781                   
    212117 3436781                   
    212117 3436781                   
    212117 3436781                   
    295285 3434442                   
    295285 3434442                   
    295285 3434925                   
    295285 3436781                   
    295285 3436781                   
    295285 3436781                   
    295285 3436781        
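The duplicates arise because the CONNECT BY condition contains no PRIOR reference, so at every level each row is connected to every row of the source. In the two-row example this gives 2, 4 and 8 rows for levels 1 to 3, which matches the 1, 2 and 4 copies per ID in the output above. A commonly used alternative, which is not the approach taken in this answer, is to tie each level to its own parent row. The sketch below assumes that ID alone identifies a row of the inner query; since the primary key is (ID, CLASS), CLASS may have to be exposed and added to the PRIOR condition.

```sql
-- Sketch only: assumes ID is unique within the inner query.
SELECT id, TRIM(REGEXP_SUBSTR(str, '[^,]+', 1, LEVEL)) str
FROM (SELECT id, MULTILIST01 AS str FROM PAGE_TWO WHERE MULTILIST01 IS NOT NULL)
WHERE TRIM(REGEXP_SUBSTR(str, '[^,]+', 1, LEVEL)) IS NOT NULL
CONNECT BY INSTR(str, ',', 1, LEVEL - 1) > 0
       AND PRIOR id = id                  -- each level stays on the same source row
       AND PRIOR SYS_GUID() IS NOT NULL   -- makes every visited row distinct, so no CONNECT BY loop error (ORA-01436)
ORDER BY 1, 2;
```

Whether this performs better than the join-based rewrite depends on the data, so it is worth comparing both on a small sample first.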

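This reformulation of the query gives you what you (probably) want. Use a subquery that provides the substring indexes (1 .. N). You have to define the maximum number of substrings to split out. Join this subquery with your table, which effectively multiplies each row N times.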

with substr_idx as (
select  rownum colnum from dual connect by level <= 3 /*  max  number of substrings */)   
SELECT id, trim(regexp_substr(str, '[^,]+', 1, colnum)) str
FROM (SELECT id , MULTILIST01 as str from PAGE_TWO where MULTILIST01 is not null), substr_idx
WHERE trim(regexp_substr(str, '[^,]+', 1, colnum)) is not null
order by 1,2;  

        ID STR                     
---------- -------------------------
    212117 3434442                   
    212117 3434925                   
    212117 3436781                   
    212120 3434442                   
    212120 3434925                   
    212120 3436781                   
    295285 3434442                   
    295285 3434925                   
    295285 3436781                   
   6031650 3436781 
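The limit of 3 in the query above is hard-coded. One way to choose it, sketched here as an assumption rather than as part of the original answer, is to count the non-empty tokens per row with REGEXP_COUNT (available in Oracle 11g and later) and take the maximum.

```sql
-- Largest number of comma-separated values found in any row.
SELECT MAX(REGEXP_COUNT(MULTILIST01, '[^,]+')) AS max_substrings
FROM PAGE_TWO
WHERE MULTILIST01 IS NOT NULL;
```

The resulting number can then replace the literal in the `connect by level <= 3` row generator.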

If you replace the regular expression with a substr / instr extraction, you can expect a further (minor) performance gain.
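A minimal sketch of such a SUBSTR / INSTR variant follows. It assumes that every MULTILIST01 value starts and ends with a comma, as in the sample data shown in the question, so that the n-th value always sits between the n-th and the (n+1)-th comma. The aliases p and i are introduced here for illustration only.

```sql
WITH substr_idx AS (
  SELECT LEVEL AS colnum FROM dual CONNECT BY LEVEL <= 3  -- max number of substrings
)
SELECT p.id,
       TRIM(SUBSTR(p.str,
                   INSTR(p.str, ',', 1, i.colnum) + 1,           -- first character after the n-th comma
                   INSTR(p.str, ',', 1, i.colnum + 1)
                     - INSTR(p.str, ',', 1, i.colnum) - 1)) AS str  -- length up to the next comma
FROM (SELECT id, MULTILIST01 AS str FROM PAGE_TWO WHERE MULTILIST01 IS NOT NULL) p
CROSS JOIN substr_idx i
WHERE INSTR(p.str, ',', 1, i.colnum + 1) > 0   -- keep only indexes that have a closing comma
ORDER BY 1, 2;
```

If a value can contain empty entries (two adjacent commas), this sketch returns a NULL for them, unlike the regular expression version, which skips them; an additional IS NOT NULL filter would be needed in that case.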

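One moral of this story: if you cannot get a result with big data, try with small data (see the rownum <= 2 limit above) and check whether the result is what you expect.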

Answered 2016-05-17T10:18:45.073