在下面的示例中,Oracle 优化器的估计行错误了两个数量级。如何改进估计的行数?
对于 10 个字母 A 到 J 中的每一个,表A
中的行的数字为 1 到 1,000。
表C
有 100 个表的副本A
。
因此,表A
的基数为 10K,表 C 的基数为 1M。
对 table 中的数字的给定单值谓词A
将产生 table 中行的 1/1000 A
(对于 table 相同C
)。
table 中字母的给定单值谓词A
将产生 table 中行的 1/10 A
(与table 相同C
)。
设置脚本。
drop table C;
drop table A;
create table A
( num NUMBER
, val VARCHAR2(3 byte)
, pad CHAR(40 byte)
)
;
insert /*+ append enable_parallel_dml parallel (auto) */
into A (num, val, pad)
select mod(level-1, 1000) +1
, chr(mod(ceil(level/1000) - 1, 10) + ascii('A'))
, ' '
from dual
connect by level <= 10*1000
;
create table C
( id NUMBER
, num NUMBER
, val VARCHAR2(3 byte)
, pad CHAR(40 byte)
)
;
insert /*+ append enable_parallel_dml parallel (auto) */
into C (id, num, val, pad)
with
"D1" as
( select /*+ materialize */ null from dual connect by level <= 100 --320
)
, "D" as
( select /*+ materialize */
level rn
, mod(level-1, 1000) + 1 num
, chr(mod(ceil(level/1000) - 1, 10) + ascii('A')) val
, ' ' pad
from dual
connect by level <= 10*1000
order by 1 offset 0 rows
)
select rownum id
, num num
, val val
, pad pad
from "D1", "D"
;
commit;
exec dbms_stats.gather_table_stats(OwnName => null, TabName => 'A', cascade => true);
exec dbms_stats.gather_table_stats(OwnName => null, TabName => 'C', cascade => true);
考虑以下查询的解释计划。
select *
from A
join C
on A.num = C.num
and A.val = C.val
where A.num = 1
and A.val = 'A'
;
---------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 100 | 9900 | 2209 (1)| 00:00:01 |
|* 1 | HASH JOIN | | 100 | 9900 | 2209 (1)| 00:00:01 |
|* 2 | TABLE ACCESS FULL| A | 1 | 47 | 23 (0)| 00:00:01 |
|* 3 | TABLE ACCESS FULL| C | 100 | 5200 | 2185 (1)| 00:00:01 |
---------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - access("A"."NUM"="C"."NUM" AND "A"."VAL"="C"."VAL")
2 - filter("A"."NUM"=1 AND "A"."VAL"='A')
3 - filter("C"."NUM"=1 AND "C"."VAL"='A')
每个步骤的行基数对我来说很有意义。
ID=2 --> (1/1,000) * (1/10) * 10,000 = 1
ID=3 --> (1/1,000) * (1/10) * 1,000,000 = 100
ID=1 --> 100 是正确的。ID=2 和 ID=3 中的谓词是相同的,ID=2 的每一行在 ID=3 的行源中只有一个匹配项。
现在考虑下面稍微修改的查询的解释计划。
select *
from A
join C
on A.num = C.num
and A.val = C.val
where A.num in(1,2)
and A.val = 'A'
;
---------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 2 | 198 | 2209 (1)| 00:00:01 |
|* 1 | HASH JOIN | | 2 | 198 | 2209 (1)| 00:00:01 |
|* 2 | TABLE ACCESS FULL| A | 2 | 94 | 23 (0)| 00:00:01 |
|* 3 | TABLE ACCESS FULL| C | 200 | 10400 | 2185 (1)| 00:00:01 |
---------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - access("A"."NUM"="C"."NUM" AND "A"."VAL"="C"."VAL")
2 - filter("A"."VAL"='A' AND ("A"."NUM"=1 OR "A"."NUM"=2))
3 - filter("C"."VAL"='A' AND ("C"."NUM"=1 OR "C"."NUM"=2))
每个步骤 ID=2 和 ID=3 的行基数对我来说是有意义的,但现在 ID=1 错误了两个数量级。
ID=2 --> (1/1,000)(1/10) * 10,000 = 1
ID=3 --> (1/1,000)(1/10) * 1,000,000 = 100
ID=1 --> 优化器的估计与实际相差两个数量级。
添加唯一和外部约束以及扩展统计信息并没有提高估计的行数。
create unique index IU_A on A (num, val);
alter table A add constraint UK_A unique (num, val) rely using index IU_A enable validate;
alter table C add constraint R_C foreign key (num, val) references A (num, val) rely enable validate;
create index IR_C on C (num, val);
select dbms_stats.create_extended_stats(null,'A','(num, val)') from dual;
select dbms_stats.create_extended_stats(null,'C','(num, val)') from dual;
exec dbms_stats.gather_table_stats(OwnName => null, TabName => 'A', cascade => true);
exec dbms_stats.gather_table_stats(OwnName => null, TabName => 'C', cascade => true);
---------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 2 | 198 | 10 (0)| 00:00:01 |
| 1 | NESTED LOOPS | | | | | |
| 2 | NESTED LOOPS | | 2 | 198 | 10 (0)| 00:00:01 |
| 3 | INLIST ITERATOR | | | | | |
| 4 | TABLE ACCESS BY INDEX ROWID| A | 2 | 94 | 5 (0)| 00:00:01 |
|* 5 | INDEX UNIQUE SCAN | IU_A | 2 | | 3 (0)| 00:00:01 |
|* 6 | INDEX RANGE SCAN | IR_C | 1 | | 2 (0)| 00:00:01 |
| 7 | TABLE ACCESS BY INDEX ROWID | C | 1 | 52 | 3 (0)| 00:00:01 |
---------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
5 - access(("A"."NUM"=1 OR "A"."NUM"=2) AND "A"."VAL"='A')
6 - access("A"."NUM"="C"."NUM" AND "C"."VAL"='A')
filter("C"."NUM"=1 OR "C"."NUM"=2)
我需要做什么才能使估计的行更好地匹配现实?
使用 Oracle 企业版 19c。
提前致谢。
编辑 在确保使用了最新的 optimizer_features_enable 并修改了其中一个谓词之后,我们仍然有一个解释计划,其估计的行数减少了两个数量级。
ID=6 估计应该有 100 行。似乎它应用了谓词因子两次。一次用于访问,再次用于过滤器。
select /*+ optimizer_features_enable('19.1.0') */
*
from A
join C
on A.num = C.num
and A.val = C.val
where A.num in(1,2)
and A.val in('A','B')
;
-----------------------------------------------------------------------------------------------
| id | Operation | name | rows | Bytes | cost (%CPU)| time |
-----------------------------------------------------------------------------------------------
| 0 | select statement | | 4 | 396 | 16 (0)| 00:00:01 |
| 1 | nested LOOPS | | 4 | 396 | 16 (0)| 00:00:01 |
| 2 | nested LOOPS | | 4 | 396 | 16 (0)| 00:00:01 |
| 3 | INLIST ITERATOR | | | | | |
| 4 | table access by index ROWID BATCHED| A | 4 | 188 | 7 (0)| 00:00:01 |
|* 5 | index range scan | IU_A | 4 | | 3 (0)| 00:00:01 |
|* 6 | index range scan | IR_C | 1 | | 2 (0)| 00:00:01 |
| 7 | table access by index ROWID | C | 1 | 52 | 3 (0)| 00:00:01 |
-----------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
5 - access("A"."NUM"=1 or "A"."NUM"=2)
filter("A"."VAL"='A' or "A"."VAL"='B')
6 - access("A"."NUM"="C"."NUM" and "A"."VAL"="C"."VAL")
filter(("C"."NUM"=1 or "C"."NUM"=2) and ("C"."VAL"='A' or "C"."VAL"='B'))