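Your statement looks fine to me.
In any optimization task, don't think in patterns. Don't think, "(NOT) EXISTS is bad and slow, (NOT) IN is super cool and fast."
Instead, ask how much work the database does at each step, and how you can measure it.
A simple example: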
-- NOT IN:
23:59:41 HR@sandbox> alter system flush buffer_cache;
System altered.
Elapsed: 00:00:00.03
23:59:43 HR@sandbox> set autotrace traceonly explain statistics
23:59:49 HR@sandbox> select country_id from countries where country_id not in (select country_id from locations);
11 rows selected.
Elapsed: 00:00:00.02
Execution Plan
----------------------------------------------------------
Plan hash value: 1748518851
------------------------------------------------------------------------------------------
| Id  | Operation              | Name            | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT       |                 |     1 |     6 |     4   (0)| 00:00:01 |
|*  1 |  FILTER                |                 |       |       |            |          |
|   2 |   NESTED LOOPS ANTI SNA|                 |    11 |    66 |     4  (75)| 00:00:01 |
|   3 |    INDEX FULL SCAN     | COUNTRY_C_ID_PK |    25 |    75 |     1   (0)| 00:00:01 |
|*  4 |    INDEX RANGE SCAN    | LOC_COUNTRY_IX  |    13 |    39 |     0   (0)| 00:00:01 |
|*  5 |   TABLE ACCESS FULL    | LOCATIONS       |     1 |     3 |     3   (0)| 00:00:01 |
------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter( NOT EXISTS (SELECT 0 FROM "LOCATIONS" "LOCATIONS" WHERE
"COUNTRY_ID" IS NULL))
4 - access("COUNTRY_ID"="COUNTRY_ID")
5 - filter("COUNTRY_ID" IS NULL)
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
11 consistent gets
8 physical reads
0 redo size
446 bytes sent via SQL*Net to client
363 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
11 rows processed
-- NOT EXISTS:
23:59:57 HR@sandbox> alter system flush buffer_cache;
System altered.
Elapsed: 00:00:00.17
00:00:02 HR@sandbox> select country_id from countries c where not exists (select 1 from locations l where l.country_id = c.country_id );
11 rows selected.
Elapsed: 00:00:00.30
Execution Plan
----------------------------------------------------------
Plan hash value: 840074837
-------------------------------------------------------------------------------------
| Id  | Operation         | Name            | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |                 |    11 |    66 |     1   (0)| 00:00:01 |
|   1 |  NESTED LOOPS ANTI|                 |    11 |    66 |     1   (0)| 00:00:01 |
|   2 |   INDEX FULL SCAN | COUNTRY_C_ID_PK |    25 |    75 |     1   (0)| 00:00:01 |
|*  3 |   INDEX RANGE SCAN| LOC_COUNTRY_IX  |    13 |    39 |     0   (0)| 00:00:01 |
-------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
3 - access("L"."COUNTRY_ID"="C"."COUNTRY_ID")
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
5 consistent gets
2 physical reads
0 redo size
446 bytes sent via SQL*Net to client
363 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
11 rows processed
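In this example, NOT IN reads about twice as many database blocks (11 consistent gets versus 5) and applies a more complicated filter. Ask yourself: why would you choose it over NOT EXISTS?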
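The extra FILTER and the "COUNTRY_ID" IS NULL predicates in the NOT IN plan are most likely there because NOT IN has to account for NULLs in the subquery, which NOT EXISTS does not. Here is a minimal sketch of that difference. It uses Python's sqlite3 with made-up toy tables rather than Oracle and the HR schema, so it only illustrates the standard SQL semantics and says nothing about Oracle's plans or timings:

```python
import sqlite3

# Toy stand-ins for COUNTRIES and LOCATIONS (hypothetical data, not the HR schema).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE countries (country_id TEXT PRIMARY KEY);
    CREATE TABLE locations (location_id INTEGER PRIMARY KEY, country_id TEXT);
    INSERT INTO countries VALUES ('AR'), ('BE'), ('US');
    -- One location has no country: this NULL is what NOT IN must care about.
    INSERT INTO locations VALUES (1, 'US'), (2, NULL);
""")

# NOT IN: comparing against a set that contains NULL never evaluates to TRUE,
# so no row passes the filter.
not_in = conn.execute("""
    SELECT country_id FROM countries
    WHERE country_id NOT IN (SELECT country_id FROM locations)
    ORDER BY country_id
""").fetchall()

# NOT EXISTS: the correlated equality simply never matches the NULL row,
# so countries without a location are returned.
not_exists = conn.execute("""
    SELECT country_id FROM countries c
    WHERE NOT EXISTS (SELECT 1 FROM locations l WHERE l.country_id = c.country_id)
    ORDER BY c.country_id
""").fetchall()

print("NOT IN     :", not_in)
print("NOT EXISTS :", not_exists)
```

So the two forms are only interchangeable when the subquery column cannot be NULL. When it can, the database has to do the additional NULL check for NOT IN, and that is the kind of extra work to look for in the plan and in the statistics.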