您所看到的是 IN () bieng 受绑定变量窥视的结果。当您有直方图时,您编写一个查询,例如“where a = 'a'”,oracle 将使用直方图来猜测将返回多少行(与 inlist 运算符相同的想法,它对每个项目进行迭代并聚合行)。如果没有直方图,它将以行/不同值的形式进行猜测。在子查询中,oracle 不会这样做(在大多数情况下......它有一个独特的情况)。
例如:
SQL> create table test
2 (val1 number, val2 varchar2(20), val3 number);
Table created.
Elapsed: 00:00:00.02
SQL>
SQL> insert into test select 1, 'aaaaaaaaaa', mod(rownum, 5) from dual connect by level <= 100;
100 rows created.
Elapsed: 00:00:00.01
SQL> insert into test select 2, 'aaaaaaaaaa', mod(rownum, 5) from dual connect by level <= 1000;
1000 rows created.
Elapsed: 00:00:00.02
SQL> insert into test select 3, 'aaaaaaaaaa', mod(rownum, 5) from dual connect by level <= 100;
100 rows created.
Elapsed: 00:00:00.00
SQL> insert into test select 4, 'aaaaaaaaaa', mod(rownum, 5) from dual connect by level <= 100000;
100000 rows created.
所以我有一张有 101200 行的表。对于 VAL1 ,100 是“1”,1000 是“2”,100 是“3”,100k 是“4”。
现在如果收集直方图(在这种情况下我们确实想要它们)
SQL> exec dbms_stats.gather_table_stats(user , 'test', degree=>4, method_opt=>'for all indexed columns size 4', estimate_percent=>100);
SQL> exec dbms_stats.gather_table_stats(user , 'lookup', degree=>4, method_opt =>'for all indexed columns size 3', estimate_percent=>100);
我们看到以下内容:
SQL> explain plan for select * from test where val1 in (1, 2, 3) ;
Explained.
SQL> @explain ""
Plan hash value: 3165434153
--------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1200 | 19200 | 23 (0)| 00:00:01 |
| 1 | INLIST ITERATOR | | | | | |
| 2 | TABLE ACCESS BY INDEX ROWID| TEST | 1200 | 19200 | 23 (0)| 00:00:01 |
|* 3 | INDEX RANGE SCAN | TEST1 | 1200 | | 4 (0)| 00:00:01 |
--------------------------------------------------------------------------------------
对比
SQL> explain plan for select * from test where val1 in (select id from lookup where str = 'A') ;
Explained.
SQL> @explain ""
Plan hash value: 441162525
----------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
----------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 25300 | 518K| 106 (3)| 00:00:02 |
| 1 | NESTED LOOPS | | 25300 | 518K| 106 (3)| 00:00:02 |
| 2 | TABLE ACCESS BY INDEX ROWID| LOOKUP | 1 | 5 | 1 (0)| 00:00:01 |
|* 3 | INDEX UNIQUE SCAN | LOOKUP1 | 1 | | 0 (0)| 00:00:01 |
|* 4 | TABLE ACCESS FULL | TEST | 25300 | 395K| 105 (3)| 00:00:02 |
----------------------------------------------------------------------------------------
查找表在哪里
SQL> select * From lookup;
ID STR
---------- ----------
1 A
2 B
3 C
4 D
(str 是唯一索引并具有直方图)。
注意 inlist 的基数为 1200 和一个好的计划,但在子查询中却非常不准确?Oracle 没有计算连接条件的直方图,而是说“看,我不知道 id 会是什么,所以我猜总行数(100k+1000+100+100)/不同值(4)=25300 并使用那。因此,它选择了全表扫描。
这一切都很好,但如何解决呢?如果您知道此子查询将匹配少量行(我们会)。那么你必须提示外部查询以尝试让它使用索引。喜欢:
SQL> explain plan for select /*+ index(t) */ * from test t where val1 in (select id from lookup where str = 'A') ;
Explained.
SQL> @explain
Plan hash value: 702117913
----------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
----------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 25300 | 518K| 456 (1)| 00:00:06 |
| 1 | NESTED LOOPS | | 25300 | 518K| 456 (1)| 00:00:06 |
| 2 | TABLE ACCESS BY INDEX ROWID| LOOKUP | 1 | 5 | 1 (0)| 00:00:01 |
|* 3 | INDEX UNIQUE SCAN | LOOKUP1 | 1 | | 0 (0)| 00:00:01 |
| 4 | TABLE ACCESS BY INDEX ROWID| TEST | 25300 | 395K| 455 (1)| 00:00:06 |
|* 5 | INDEX RANGE SCAN | TEST1 | 25300 | | 61 (2)| 00:00:01 |
----------------------------------------------------------------------------------------
另一件事是在我的特殊情况下。由于 val1=4 是表的大部分,假设我有我的标准查询:
select * from test t where val1 in (select id from lookup where str = :B1);
对于可能的:B1
输入。如果我知道传入的有效值是 A、B 和 C(即不是映射到 id=4 的 D)。我可以添加这个技巧:
SQL> explain plan for select * from test t where val1 in (select id from lookup where str = :b1 and id in (1, 2, 3)) ;
Explained.
SQL> @explain ""
Plan hash value: 771376936
--------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 250 | 5250 | 24 (5)| 00:00:01 |
|* 1 | HASH JOIN | | 250 | 5250 | 24 (5)| 00:00:01 |
|* 2 | VIEW | index$_join$_002 | 1 | 5 | 1 (100)| 00:00:01 |
|* 3 | HASH JOIN | | | | | |
|* 4 | INDEX RANGE SCAN | LOOKUP1 | 1 | 5 | 0 (0)| 00:00:01 |
| 5 | INLIST ITERATOR | | | | | |
|* 6 | INDEX UNIQUE SCAN | SYS_C002917051 | 1 | 5 | 0 (0)| 00:00:01 |
| 7 | INLIST ITERATOR | | | | | |
| 8 | TABLE ACCESS BY INDEX ROWID| TEST | 1200 | 19200 | 23 (0)| 00:00:01 |
|* 9 | INDEX RANGE SCAN | TEST1 | 1200 | | 4 (0)| 00:00:01 |
--------------------------------------------------------------------------------------------------
现在注意 oracle 有一张合理的卡片(它把 1,2,3 推到 TEST 表上并得到 1200..不是 100% 准确,因为我只过滤了它们中的任何一个,但我告诉 orce 肯定不是 4!