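I'm hoping someone can explain the performance of joining multiple tables versus using MINUS to eliminate records. I looked through some other Stack Overflow questions but didn't find what I was looking for.

I believe these two queries produce the same output, and I keep hearing "use joins, use joins!", especially in Stack Overflow posts, on the expectation that joins will be faster...

This is the first query I ran. I expected it to be much slower, but it takes only a few minutes to run...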

select some_id
  from table1
MINUS
select some_id
  from table2
 where table2.value = 'some_value'
MINUS
select some_id
  from table3
 where table3.value = 'some_value'
 group by some_id

This is the second query, which I expected to be faster, but it has now been running for more than 3 hours (with no end in sight?)

select some_id
  from table1
       join table2 on table1.id=table2.id
       join table3 on table1.id=table3.id
 where table2.value = 'some_value'
    or table3.value = 'some_value'
 group by some_id

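I should note that all 3 tables have more than 1 million records, and up to 15 million each.

Edit

Sorry - I meant to let you know that I'd like answers to this question to avoid NOT EXISTS, because I'm really curious about these two scenarios.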

2 Answers

Try this version:

select some_id
from table1 t1
where not exists (select 1 from table2 t2 where t1.id = t2.id and t2.value = 'some_value') or
      not exists (select 1 from table3 t3 where t1.id = t3.id and t3.value = 'some_value')

For best performance, you want indexes on table2(id, value) and table3(id, value).
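
The two composite indexes mentioned here could be created along the following lines. This is only a sketch: the index names are made up for illustration, the table and column names are the question's placeholders, and the column order (id first, then value) simply follows the suggestion above.

```sql
-- Hypothetical index names; table2/table3, id and value are the
-- placeholder names used in the question.
create index table2_id_value_ix on table2 (id, value);
create index table3_id_value_ix on table3 (id, value);
```

Because each subquery references only id and value, an index covering both columns can typically satisfy the NOT EXISTS probe without visiting the table itself.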

answered 2015-04-16T20:09:22.237

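First, make sure you have appropriate indexes.

Look at the plan. If it uses full table scans, go ahead and create indexes; otherwise the query will take a long time.

If you have PL/SQL Developer, paste the query into a SQL window and press F5; it will show you the explain plan.

Alternatively, you can do it like this: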

SCOTT@research 17-APR-15>       EXPLAIN PLAN FOR
  2        select empno
  3            from emp
  4          MINUS
  5          select empno
  6            from empp
  7           where empp.empno = '7839'
  8          MINUS
  9          select empno
 10            from emppp
 11           where emppp.empno = '7902'
 12           group by empno
 13           ;

Explained.

SCOTT@research 17-APR-15> SET LINESIZE 130
SCOTT@research 17-APR-15> SET PAGESIZE 0
SCOTT@research 17-APR-15> SELECT *
  2  FROM   TABLE(DBMS_XPLAN.DISPLAY);
Plan hash value: 4222598102

---------------------------------------------------------------------------------
| Id  | Operation              | Name   | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------
|   0 | SELECT STATEMENT       |        |    14 |    82 |    10  (90)| 00:00:01 |
|   1 |  MINUS                 |        |       |       |            |          |
|   2 |   MINUS                |        |       |       |            |          |
|   3 |    SORT UNIQUE NOSORT  |        |    14 |    56 |     2  (50)| 00:00:01 |
|   4 |     INDEX FULL SCAN    | PK_EMP |    14 |    56 |     1   (0)| 00:00:01 |
|   5 |    SORT UNIQUE NOSORT  |        |     1 |    13 |     4  (25)| 00:00:01 |
|*  6 |     TABLE ACCESS FULL  | EMPP   |     1 |    13 |     3   (0)| 00:00:01 |
|   7 |   SORT UNIQUE NOSORT   |        |     1 |    13 |     4  (25)| 00:00:01 |
|   8 |    SORT GROUP BY NOSORT|        |     1 |    13 |     4  (25)| 00:00:01 |
|*  9 |     TABLE ACCESS FULL  | EMPPP  |     1 |    13 |     3   (0)| 00:00:01 |
---------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   6 - filter("EMPP"."EMPNO"=7839)
   9 - filter("EMPPP"."EMPNO"=7902)

Note
-----
   - dynamic sampling used for this statement (level=2)

26 rows selected.


Execution Plan
----------------------------------------------------------
Plan hash value: 2137789089

---------------------------------------------------------------------------------------------
| Id  | Operation                         | Name    | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                  |         |  8168 | 16336 |    29   (0)| 00:00:01 |
|   1 |  COLLECTION ITERATOR PICKLER FETCH| DISPLAY |  8168 | 16336 |    29   (0)| 00:00:01 |
---------------------------------------------------------------------------------------------

Or, if you want to use autotrace, run:

set autotrace on explain

This is what it looks like:

SCOTT@research 17-APR-15> select empno
  2    from emp
  3  MINUS
  4  select empno
  5    from empp
  6   where empp.empno = '7839'
  7  MINUS
  8  select empno
  9    from emppp
 10   where emppp.empno = '7902'
 11   group by empno
 12   ;

     EMPNO
----------
       234
      7499
      7521
      7566
      7654
      7698
      7782
      7788
      7844
      7876
      7900
      7934

12 rows selected.


Execution Plan
----------------------------------------------------------
Plan hash value: 4222598102

---------------------------------------------------------------------------------
| Id  | Operation              | Name   | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------
|   0 | SELECT STATEMENT       |        |    14 |    82 |    10  (90)| 00:00:01 |
|   1 |  MINUS                 |        |       |       |            |          |
|   2 |   MINUS                |        |       |       |            |          |
|   3 |    SORT UNIQUE NOSORT  |        |    14 |    56 |     2  (50)| 00:00:01 |
|   4 |     INDEX FULL SCAN    | PK_EMP |    14 |    56 |     1   (0)| 00:00:01 |
|   5 |    SORT UNIQUE NOSORT  |        |     1 |    13 |     4  (25)| 00:00:01 |
|*  6 |     TABLE ACCESS FULL  | EMPP   |     1 |    13 |     3   (0)| 00:00:01 |
|   7 |   SORT UNIQUE NOSORT   |        |     1 |    13 |     4  (25)| 00:00:01 |
|   8 |    SORT GROUP BY NOSORT|        |     1 |    13 |     4  (25)| 00:00:01 |
|*  9 |     TABLE ACCESS FULL  | EMPPP  |     1 |    13 |     3   (0)| 00:00:01 |
---------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   6 - filter("EMPP"."EMPNO"=7839)
   9 - filter("EMPPP"."EMPNO"=7902)

Note
-----
   - dynamic sampling used for this statement (level=2)

SCOTT@research 17-APR-15>



SCOTT@research 17-APR-15> select emp.empno
  2    from emp
  3         join empp on emp.empno=empp.empno
  4         join emppp on emp.empno=emppp.empno
  5   where empp.empno = '7839'
  6      or emppp.empno = '7902'
  7   group by emp.empno
  8  ;

     EMPNO
----------
      7839
      7902


Execution Plan
----------------------------------------------------------
Plan hash value: 1435156579

-------------------------------------------------------------------------------
| Id  | Operation            | Name   | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |        |     1 |    30 |     8  (25)| 00:00:01 |
|   1 |  HASH GROUP BY       |        |     1 |    30 |     8  (25)| 00:00:01 |
|*  2 |   HASH JOIN          |        |     1 |    30 |     7  (15)| 00:00:01 |
|   3 |    NESTED LOOPS      |        |     6 |   102 |     3   (0)| 00:00:01 |
|   4 |     TABLE ACCESS FULL| EMPPP  |     6 |    78 |     3   (0)| 00:00:01 |
|*  5 |     INDEX UNIQUE SCAN| PK_EMP |     1 |     4 |     0   (0)| 00:00:01 |
|   6 |    TABLE ACCESS FULL | EMPP   |    10 |   130 |     3   (0)| 00:00:01 |
-------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("EMP"."EMPNO"="EMPP"."EMPNO")
       filter("EMPP"."EMPNO"=7839 OR "EMPPP"."EMPNO"=7902)
   5 - access("EMP"."EMPNO"="EMPPP"."EMPNO")

Note
-----
   - dynamic sampling used for this statement (level=2)
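
Comparing the two result sets in this transcript, the MINUS query returns 12 rows while the inner-join query returns only 7839 and 7902, so the two forms are not interchangeable: the inner join keeps the matching rows, whereas MINUS removes them. A join-based form closer to the MINUS semantics is an anti-join. The following is a minimal sketch using the emp, empp and emppp tables from the session above; it assumes emp.empno is never NULL (MINUS treats NULLs as equal, a join does not) and has not been run against this schema.

```sql
-- Sketch only: anti-join written with outer joins.
-- Each filter sits in the ON clause so that non-matching emp rows are kept
-- with NULLs on the right-hand side, then selected by the IS NULL tests.
select distinct emp.empno
  from emp
       left join empp  on empp.empno  = emp.empno and empp.empno  = 7839
       left join emppp on emppp.empno = emp.empno and emppp.empno = 7902
 where empp.empno  is null
   and emppp.empno is null
;
```

Under those assumptions it should return the same employee numbers as the MINUS query, and its plan can be compared with the ones above in the same way.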
answered 2015-04-17T02:09:59.737