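I have heard that ORs are bad and that having multiple ORs can significantly hurt performance. But what about row-independent ORs? Take a look at this example: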

SELECT
  *
FROM
  some_table t
WHERE
  (
    some_function('CONTEXT') = 'context of selecting by id'
    AND t.id = TO_NUMBER(another_function('ID'))
  )
  OR (
    some_function('CONTEXT') = 'context of filtering by name'
    AND t.name LIKE '%' || another_function('NAME') || '%'
  )
  OR (
    some_function('CONTEXT') = 'context of taking actual rows'
    AND TO_DATE(another_function('ACTUAL_DATE'), '...')
        BETWEEN t.start_date AND t.end_date
  )
  ...

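Here some_function('CONTEXT') returns the same value regardless of the row (it does not take any row-dependent data, such as column values, as arguments, and it does not change any internal state that affects its result while the query executes). It could also be just a package variable like some_package.context.
As I see it, the optimizer should evaluate some_function('CONTEXT') first and only then decide which OR branch to take.
But what actually happens? How can I make sure that a query like this will not cause a performance leak?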

P.S. The version is 11.2.

2 Answers

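You will need to either use the undocumented hint use_concat(or_predicates(1)) or use UNION ALL. The optimizer has problems with these types of predicates regardless of the functions.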

Expected Plan

You want a plan that looks like this:

------------------------------------------------------
| Id  | Operation                     | Name         |
------------------------------------------------------
|   0 | SELECT STATEMENT              |              |
|   1 |  CONCATENATION                |              |
|*  2 |   FILTER                      |              |
|*  3 |    TABLE ACCESS FULL          | SOME_TABLE   |
|*  4 |   FILTER                      |              |
|*  5 |    TABLE ACCESS FULL          | SOME_TABLE   |
|*  6 |   FILTER                      |              |
|*  7 |    TABLE ACCESS BY INDEX ROWID| SOME_TABLE   |
|*  8 |     INDEX UNIQUE SCAN         | SYS_C0010268 |
------------------------------------------------------

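These FILTER operations are very different from the typical filter entries in the Predicate Information section of an explain plan. A FILTER in the Operation column evaluates its condition at run time and decides which part of the execution plan to use. Depending on the values the functions return, the plan uses either a full table scan (for the non-selective predicates on name or date) or a unique index scan (for the very selective predicate on id).

That is exactly what you want for this query. If a query has only a few ANDs and ORs, you may get the FILTER operations by default.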
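A quick way to look at a FILTER operation in isolation is to explain a query with a single branch. This is only a sketch against the sample schema and functions defined further down in this answer. The literal 'no such context' is made up for illustration, and whether the optimizer produces a FILTER here may depend on the version and statistics.

```sql
--Single-branch query: one row-independent condition plus one selective predicate.
--'no such context' is a hypothetical value that some_function never returns.
explain plan for
SELECT *
FROM some_table t
WHERE some_function('CONTEXT') = 'no such context'
  AND t.id = TO_NUMBER(another_function('ID'));

--Look for a FILTER line in the Operation column above the table access.
select * from table(dbms_xplan.display);
```

If a FILTER appears, its predicate should contain only the function comparison, while the id predicate stays on the index or table access beneath it.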

Actual Plan

But in practice, with a complex predicate, the plan looks like this:

----------------------------------------
| Id  | Operation         | Name       |
----------------------------------------
|   0 | SELECT STATEMENT  |            |
|*  1 |  TABLE ACCESS FULL| SOME_TABLE |
----------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("SOME_FUNCTION"('CONTEXT')='context of filtering by name' 
              AND "T"."NAME" LIKE '%'||"ANOTHER_FUNCTION"('NAME')||'%' OR 
              "SOME_FUNCTION"('CONTEXT')='context of taking actual rows' AND 
              "T"."START_DATE"<=TO_DATE("ANOTHER_FUNCTION"('ACTUAL_DATE'),'...') AND 
              "T"."END_DATE">=TO_DATE("ANOTHER_FUNCTION"('ACTUAL_DATE'),'...') OR 
              "SOME_FUNCTION"('CONTEXT')='context of selecting by id' AND 
              "T"."ID"=TO_NUMBER("ANOTHER_FUNCTION"('ID')))

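Full table scans are not always bad. But they are awful for selecting a single primary key value.

Sample Schema

Create a table with 1 million sample rows. Some columns are highly selective and some are very unselective. They all have histograms, so the optimizer has plenty of useful information to work with.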

drop table some_table purge;

create table some_table
(
    id          number primary key,
    name        varchar2(100),
    start_date  date,
    end_date    date
);

begin
    for i in 1 .. 10 loop
        insert into some_table
        select 
            level+(i*100000),
            'Name '||mod(level, 5),
            date '2000-01-01' + mod(level, 10000),
            date '2010-01-01' + mod(level, 10000)
        from dual
        connect by level <= 100000;
    end loop;
end;
/
begin
    dbms_stats.gather_table_stats(user, 'SOME_TABLE'
        ,method_opt => 'for all columns size 254');
end;
/

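Sample Functions

These functions are very static, and the optimizer ought to know that. This example uses some_function in a way that will never match anything. That is the best-case scenario: it should be easy for Oracle to see that this query will not return anything.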

--Static functions.
create or replace function some_function(p_context in varchar2) return varchar2 is
begin
    return p_context;
end;
/
--Btw, returning stringly-typed data is almost always a horrible idea.
--(Although if you're dealing with sys_context you may not have a choice.)
create or replace function another_function(p_type in varchar2) return varchar2 is
begin
    if p_type = 'ID' then
        return '1';
    elsif p_type = 'NAME' then
        return 'Name 1';
    elsif p_type = 'ACTUAL_DATE' then
        return '2000-01-01';
    else
        --Unknown type: return NULL instead of falling off the end (ORA-06503).
        return null;
    end if;
end;
/

Default - Bad Plan Without FILTER Operations

The default plan is bad. The query should run in almost 0 seconds, but it has to perform a full table scan.

explain plan for
SELECT * FROM some_table t
WHERE
  (
    some_function('CONTEXT') = 'context of selecting by id'
    AND t.id = TO_NUMBER(another_function('ID'))
  )
  OR (
    some_function('CONTEXT') = 'context of filtering by name'
    AND t.name LIKE '%' || another_function('NAME') || '%'
  )
  OR (
    some_function('CONTEXT') = 'context of taking actual rows'
    AND TO_DATE(another_function('ACTUAL_DATE'), '...')
        BETWEEN t.start_date AND t.end_date
  );

select * from table(dbms_xplan.display);

Plan hash value: 3038250352

--------------------------------------------------------------------------------
| Id  | Operation         | Name       | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |            |   525 | 14700 |  1504  (17)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| SOME_TABLE |   525 | 14700 |  1504  (17)| 00:00:01 |
--------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("SOME_FUNCTION"('CONTEXT')='context of filtering by name' 
              AND "T"."NAME" LIKE '%'||"ANOTHER_FUNCTION"('NAME')||'%' OR 
              "SOME_FUNCTION"('CONTEXT')='context of taking actual rows' AND 
              "T"."START_DATE"<=TO_DATE("ANOTHER_FUNCTION"('ACTUAL_DATE'),'...') AND 
              "T"."END_DATE">=TO_DATE("ANOTHER_FUNCTION"('ACTUAL_DATE'),'...') OR 
              "SOME_FUNCTION"('CONTEXT')='context of selecting by id' AND 
              "T"."ID"=TO_NUMBER("ANOTHER_FUNCTION"('ID')))

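use_concat(or_predicates(1)) - Good Plan With FILTERs

The USE_CONCAT hint transforms the query into separate UNION ALL steps. Each predicate is then simple and gets its own FILTER operation. Unfortunately, USE_CONCAT has some weird limitations. Sometimes it only works when indexes are used (see My Oracle Support document 259741.1). Sometimes it does not work at all, the workarounds do not work either, and it is still not fixed in 12c (document 14545269.8).

Adding or_predicates(1) makes it work, but it is completely undocumented.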

explain plan for
SELECT --+ use_concat(or_predicates(1))
  *
FROM some_table t
WHERE
  (
    some_function('CONTEXT') = 'context of selecting by id'
    AND t.id = TO_NUMBER(another_function('ID'))
  )
  OR (
    some_function('CONTEXT') = 'context of filtering by name'
    AND t.name LIKE '%' || another_function('NAME') || '%'
  )
  OR (
    some_function('CONTEXT') = 'context of taking actual rows'
    AND TO_DATE(another_function('ACTUAL_DATE'), '...')
        BETWEEN t.start_date AND t.end_date
  );

select * from table(dbms_xplan.display);

Plan hash value: 1618041905

----------------------------------------------------------------------------------------------
| Id  | Operation                     | Name         | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT              |              | 52500 |  1435K|  2721   (8)| 00:00:01 |
|   1 |  CONCATENATION                |              |       |       |            |          |
|*  2 |   FILTER                      |              |       |       |            |          |
|*  3 |    TABLE ACCESS FULL          | SOME_TABLE   |  2500 | 70000 |  1362   (8)| 00:00:01 |
|*  4 |   FILTER                      |              |       |       |            |          |
|*  5 |    TABLE ACCESS FULL          | SOME_TABLE   | 49999 |  1367K|  1356   (7)| 00:00:01 |
|*  6 |   FILTER                      |              |       |       |            |          |
|*  7 |    TABLE ACCESS BY INDEX ROWID| SOME_TABLE   |     1 |    28 |     3   (0)| 00:00:01 |
|*  8 |     INDEX UNIQUE SCAN         | SYS_C0010269 |     1 |       |     2   (0)| 00:00:01 |
----------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter("SOME_FUNCTION"('CONTEXT')='context of taking actual rows')
   3 - filter("T"."START_DATE"<=TO_DATE("ANOTHER_FUNCTION"('ACTUAL_DATE'),'...') AND 
              "T"."END_DATE">=TO_DATE("ANOTHER_FUNCTION"('ACTUAL_DATE'),'...'))
   4 - filter("SOME_FUNCTION"('CONTEXT')='context of filtering by name')
   5 - filter("T"."NAME" LIKE '%'||"ANOTHER_FUNCTION"('NAME')||'%' AND 
              (LNNVL("SOME_FUNCTION"('CONTEXT')='context of taking actual rows') OR 
              LNNVL("T"."START_DATE"<=TO_DATE("ANOTHER_FUNCTION"('ACTUAL_DATE'),'...')) OR 
              LNNVL("T"."END_DATE">=TO_DATE("ANOTHER_FUNCTION"('ACTUAL_DATE'),'...'))))
   6 - filter("SOME_FUNCTION"('CONTEXT')='context of selecting by id')
   7 - filter((LNNVL("SOME_FUNCTION"('CONTEXT')='context of filtering by name') OR 
              LNNVL("T"."NAME" LIKE '%'||"ANOTHER_FUNCTION"('NAME')||'%')) AND 
              (LNNVL("SOME_FUNCTION"('CONTEXT')='context of taking actual rows') OR 
              LNNVL("T"."START_DATE"<=TO_DATE("ANOTHER_FUNCTION"('ACTUAL_DATE'),'...')) OR 
              LNNVL("T"."END_DATE">=TO_DATE("ANOTHER_FUNCTION"('ACTUAL_DATE'),'...'))))
   8 - access("T"."ID"=TO_NUMBER("ANOTHER_FUNCTION"('ID')))

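UNION ALL - Good Plan With FILTERs

Manually expanding the query is probably a safer approach. But depending on how complex the query is, it can get really ugly.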

explain plan for
SELECT * FROM some_table t
WHERE some_function('CONTEXT') = 'context of selecting by id' AND t.id = TO_NUMBER(another_function('ID'))
union all
SELECT * FROM some_table t
WHERE some_function('CONTEXT') = 'context of filtering by name' AND t.name LIKE '%' || another_function('NAME') || '%'
union all
SELECT * FROM some_table t
WHERE some_function('CONTEXT') = 'context of taking actual rows' AND TO_DATE(another_function('ACTUAL_DATE'), '...') BETWEEN t.start_date AND t.end_date;

select * from table(dbms_xplan.display);

(Plan not shown - it's basically the same as the `USE_CONCAT` version.)
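One caveat with the manual rewrite: UNION ALL is only equivalent to the OR version when a row cannot satisfy more than one branch. That holds here because the context values are mutually exclusive. If the branches could overlap, a later branch can exclude rows already returned by an earlier one with LNNVL, which is what the CONCATENATION plan above does internally. The sketch below shows the idea for two branches only; how the extra conditions affect the chosen plan has not been tested here.

```sql
explain plan for
SELECT * FROM some_table t
WHERE some_function('CONTEXT') = 'context of selecting by id'
  AND t.id = TO_NUMBER(another_function('ID'))
union all
SELECT * FROM some_table t
WHERE some_function('CONTEXT') = 'context of filtering by name'
  AND t.name LIKE '%' || another_function('NAME') || '%'
  --Skip rows the first branch already returned.
  --LNNVL(cond) is true when cond is false or unknown (NULL).
  AND (   LNNVL(some_function('CONTEXT') = 'context of selecting by id')
       OR LNNVL(t.id = TO_NUMBER(another_function('ID'))));

select * from table(dbms_xplan.display);
```

Each additional branch needs one such group per earlier branch, which is a large part of why the manual version gets ugly quickly.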

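CASE - Bad Plan Without FILTERs

Rewriting the predicate as a single CASE is a good idea, but it does not seem to work here. Then again, that may just be a problem with my specific example.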

explain plan for
SELECT *
FROM some_table t
WHERE
    case
    when some_function('CONTEXT') = 'context of selecting by id'
        AND t.id = TO_NUMBER(another_function('ID')) then 1
    when some_function('CONTEXT') = 'context of filtering by name'
        AND t.name LIKE '%' || another_function('NAME') || '%' then 1
    when some_function('CONTEXT') = 'context of taking actual rows'
        AND TO_DATE(another_function('ACTUAL_DATE'), '...') BETWEEN t.start_date AND t.end_date then 1
    else 0 end
    = 1;

select * from table(dbms_xplan.display);

(Plan not shown - it's basically the same as the default version with the full table scan.)
answered 2013-08-23T05:38:22.250

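You are correct: that is what the optimizer SHOULD do. But in my experience, that is not what it does.

Oddly enough, you can still get the behavior you want in this case if you convert your predicate into a CASE expression, like this: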

case
    when some_function('CONTEXT') = 'context of selecting by id'
    AND t.id = TO_NUMBER(another_function('ID'))
    then 1 -- satisfied

    when some_function('CONTEXT') = 'context of filtering by name'
    AND t.name LIKE '%' || another_function('NAME') || '%'
    then 1 -- satisfied

    when some_function('CONTEXT') = 'context of taking actual rows'
    AND TO_DATE(another_function('ACTUAL_DATE'), '...')
        BETWEEN t.start_date AND t.end_date
    then 1 -- satisfied

    ...

    else 0 -- unsatisfied

end = 1  -- rows from candidate set are only in the result set when
         -- they are "satisfied"

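Oracle will then usually resolve this as a filter operation rather than a union, which avoids the "usual" performance problems people often run into with logical ORs.

As a bonus, this approach usually also works when "some_function(...)" is not row-static!

answered 2013-08-22T18:56:29.130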