Is the time complexity of Oracle's MAX function O(1), O(log n), or O(n) with respect to the number of rows in the table?

2 Answers

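If you have a B-tree index on the column, then finding the maximum value is O(log(n)), because the answer is the last (or first) entry of the index. The values are stored in the leaf nodes of the B-tree, which has a height of O(log(n)).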

Without an index it is O(n), because all rows must be read to determine the maximum value.
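
To make the two cases concrete, here is a minimal Python sketch of the idea, not of Oracle's internals. The tree layout, the `fanout` of 100, and the function names `full_scan_max`, `build_index`, and `index_max` are all assumptions made up for this illustration.

```python
import random


def full_scan_max(rows):
    """Return (max, rows_examined) by looking at every row: O(n)."""
    best, examined = None, 0
    for value in rows:
        examined += 1
        if best is None or value > best:
            best = value
    return best, examined


def build_index(keys, fanout):
    """Build a static tree over the sorted keys; return (root, height).

    Leaves are lists of keys; every other node is a list of child nodes.
    Assumes `keys` is non-empty and `fanout` is at least 2.
    """
    ordered = sorted(keys)
    level = [ordered[i:i + fanout] for i in range(0, len(ordered), fanout)]
    height = 1
    while len(level) > 1:
        level = [level[i:i + fanout] for i in range(0, len(level), fanout)]
        height += 1
    return level[0], height


def index_max(root, height):
    """Return (max, nodes_visited) by following the last child down: O(log n)."""
    node, visited = root, 1
    for _ in range(height - 1):
        node = node[-1]
        visited += 1
    return node[-1], visited


rows = list(range(1, 100_001))
random.Random(0).shuffle(rows)      # the table itself has no useful row order
root, height = build_index(rows, fanout=100)

print(full_scan_max(rows))          # examines every row
print(index_max(root, height))      # visits one node per tree level
```

In this model the scan examines all 100,000 rows, while the index lookup visits 3 nodes, one per level of the tree.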


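Note: Big-O notation ignores constants, but in the real world these constants cannot be ignored. The difference between reading from disk and reading from memory is several orders of magnitude. Accessing the first value of an index will likely be done mostly in RAM, whereas a full table scan of a huge table will have to read mostly from disk.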

answered 2012-06-28T17:01:53.060

In reality, it is hard to say without specifying the query, the table definition, and the query plan.

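If your table has no index on the column you are computing the MAX of, Oracle will have to do a full table scan. That is O(n), since it has to scan every block in the table. You can see this by looking at the query plan.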

We'll generate a table with 100,000 rows and use a CHAR(1000) column to make sure the rows are reasonably large.

SQL> create table foo( col1 number, col2 char(1000) );

Table created.

SQL> insert into foo
  2    select level, lpad('a',1000)
  3      from dual
  4   connect by level <= 100000;

100000 rows created.

Now we can look at the plan for the basic MAX operation. It does a full table scan (an O(n) operation).

SQL> set autotrace on;
SQL> select max(col1)
  2    from foo;

 MAX(COL1)
----------
    100000


Execution Plan
----------------------------------------------------------
Plan hash value: 1342139204

---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |     1 |    13 |  4127   (1)| 00:00:50 |
|   1 |  SORT AGGREGATE    |      |     1 |    13 |            |          |
|   2 |   TABLE ACCESS FULL| FOO  |   106K|  1350K|  4127   (1)| 00:00:50 |
---------------------------------------------------------------------------

Note
-----
   - dynamic sampling used for this statement (level=2)


Statistics
----------------------------------------------------------
         29  recursive calls
          1  db block gets
      14686  consistent gets
          0  physical reads
        176  redo size
        527  bytes sent via SQL*Net to client
        523  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed

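If you create an index on the column you are computing the MAX of, Oracle can do a MIN/MAX scan on the index. If that is the plan the optimizer chooses, it is an O(log n) operation. Of course, as a practical matter, this is functionally an O(1) operation, because the height of an index is realistically never going to exceed 4 or 5; the constant terms are going to dominate here.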

SQL> create index idx_foo_col1
  2      on foo( col1 );

Index created.

SQL> select max(col1)
  2    from foo;

 MAX(COL1)
----------
    100000


Execution Plan
----------------------------------------------------------
Plan hash value: 817909383

-------------------------------------------------------------------------------------------
| Id  | Operation                  | Name         | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT           |              |     1 |    13 |     2   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE            |              |     1 |    13 |            |          |
|   2 |   INDEX FULL SCAN (MIN/MAX)| IDX_FOO_COL1 |     1 |    13 |     2   (0)| 00:00:01 |
-------------------------------------------------------------------------------------------

Note
-----
   - dynamic sampling used for this statement (level=2)

Statistics
----------------------------------------------------------
          5  recursive calls
          0  db block gets
         83  consistent gets
          1  physical reads
          0  redo size
        527  bytes sent via SQL*Net to client
        523  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed
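
The claim that index height stays tiny can be checked with a little arithmetic. The sketch below is an illustration only: the fanout of 200 entries per index block is an assumed round number, not a measured Oracle value, and `btree_height` is a hypothetical helper.

```python
def btree_height(n_keys, fanout):
    """Smallest number of levels whose leaves can hold n_keys entries."""
    height, capacity = 1, fanout
    while capacity < n_keys:
        capacity *= fanout
        height += 1
    return height


FANOUT = 200  # assumed entries per index block, not an Oracle figure

for n in (100_000, 10**6, 10**9, 10**11):
    print(f"{n:>15,} rows -> height {btree_height(n, FANOUT)}")
```

Under this assumption, the 100,000-row table needs 3 levels and even 10^11 rows need only 5, which is why the logarithmic factor behaves like a small constant in practice.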

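But then things get harder. MIN and MAX each have the same O(log n) behavior on their own. If you have both MIN and MAX in the same query, however, you are suddenly back to an O(n) operation. Oracle (as of 11.2) has not implemented an option to grab both the first block and the last block of an index.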

SQL> ed
Wrote file afiedt.buf

  1  select min(col1), max(col1)
  2*   from foo
SQL> /

 MIN(COL1)  MAX(COL1)
---------- ----------
         1     100000


Execution Plan
----------------------------------------------------------
Plan hash value: 1342139204

---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |     1 |    13 |  4127   (1)| 00:00:50 |
|   1 |  SORT AGGREGATE    |      |     1 |    13 |            |          |
|   2 |   TABLE ACCESS FULL| FOO  |   106K|  1350K|  4127   (1)| 00:00:50 |
---------------------------------------------------------------------------

Note
-----
   - dynamic sampling used for this statement (level=2)


Statistics
----------------------------------------------------------
          4  recursive calls
          0  db block gets
      14542  consistent gets
          0  physical reads
          0  redo size
        601  bytes sent via SQL*Net to client
        523  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed

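Of course, a later version of Oracle may implement this optimization, and the combined query would then go back to being an O(log n) operation. You can also rewrite the query yourself to get a different query plan that is O(log n) again.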

SQL> ed
Wrote file afiedt.buf

  1  select (select min(col1) from foo) min,
  2         (select max(col1) from foo) max
  3*   from dual
SQL>
SQL> /

       MIN        MAX
---------- ----------
         1     100000


Execution Plan
----------------------------------------------------------
Plan hash value: 3561244922

-------------------------------------------------------------------------------------------
| Id  | Operation                  | Name         | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT           |              |     1 |       |     2   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE            |              |     1 |    13 |            |          |
|   2 |   INDEX FULL SCAN (MIN/MAX)| IDX_FOO_COL1 |     1 |    13 |     2   (0)| 00:00:01 |
|   3 |  SORT AGGREGATE            |              |     1 |    13 |            |          |
|   4 |   INDEX FULL SCAN (MIN/MAX)| IDX_FOO_COL1 |     1 |    13 |     2   (0)| 00:00:01 |
|   5 |  FAST DUAL                 |              |     1 |       |     2   (0)| 00:00:01 |
-------------------------------------------------------------------------------------------


Note
-----
   - dynamic sampling used for this statement (level=2)


Statistics
----------------------------------------------------------
          7  recursive calls
          0  db block gets
        166  consistent gets
          0  physical reads
          0  redo size
        589  bytes sent via SQL*Net to client
        523  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed
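
The rewrite helps because each scalar subquery can be answered by its own walk down one edge of the index. A toy Python model of that idea follows; it is not how Oracle stores or reads an index, and the names `build_index`, `probe`, `min_max_by_scan`, and `min_max_by_probes`, as well as the fanout of 100, are assumptions made for the example.

```python
def build_index(keys, fanout):
    """Static tree over sorted keys; leaves hold keys, other nodes hold children.

    Assumes `keys` is non-empty and `fanout` is at least 2.
    """
    ordered = sorted(keys)
    level = [ordered[i:i + fanout] for i in range(0, len(ordered), fanout)]
    height = 1
    while len(level) > 1:
        level = [level[i:i + fanout] for i in range(0, len(level), fanout)]
        height += 1
    return level[0], height


def probe(root, height, child):
    """Follow one edge of the tree: child=0 is leftmost, child=-1 is rightmost."""
    node, visited = root, 1
    for _ in range(height - 1):
        node = node[child]
        visited += 1
    return node[child], visited


def min_max_by_scan(rows):
    """One pass that computes both aggregates and examines every row."""
    lo = hi = rows[0]
    examined = 0
    for value in rows:
        examined += 1
        lo, hi = min(lo, value), max(hi, value)
    return (lo, hi), examined


def min_max_by_probes(root, height):
    """Two independent index probes, like the two scalar subqueries."""
    lo, left_cost = probe(root, height, 0)
    hi, right_cost = probe(root, height, -1)
    return (lo, hi), left_cost + right_cost


rows = list(range(100_000, 0, -1))
root, height = build_index(rows, fanout=100)

print(min_max_by_scan(rows))            # cost grows with the row count
print(min_max_by_probes(root, height))  # cost is twice the tree height
```

Both functions return the same (min, max) pair; only the amount of work differs, which mirrors the gap between 14542 and 166 consistent gets in the two plans above.
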
answered 2012-06-28T17:20:49.727