2

I have to run many selects against an Oracle external table.

I have 10 cursors that look a lot like this (ext_temp is the external table):

CURSOR F_CURSOR (day IN varchar,code Number,orig Number)
    IS
    select NVL(sum(table_4.f),0) 
     from ext_temp table_4
    where
      --couple of conditions here, irrelevant for the question at hand.
      AND TO_CHAR(table_4.day,'YYYYMMDD') = day
      AND table_4.CODE = code
      AND table_4.ORIG = orig;

The external table has about 22659 registers (rows).

The main loop of my script looks like this:

   for each register in some_query: --22659 registers
       open F_cursor(register.day,register.code,register.orig);
       --open 9 more cursors

       fetch F_cursor into some_var;  
       --fetch 9 more cursors, with the same structure

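The queries are taking a long time. And I know from here that I can't have any indexes or DML on an external table.

So, is there a way to make it run faster? I could rewrite my plsql script, but I don't think I have the time left.

Update: I missed an important detail.

I am not the owner or the DBA of the database. That guy doesn't want any extra information in his database (about 3 GB of data), and external tables are the only thing we could get from him. He won't let us create temporary tables. I won't pretend to question his reasons, but external tables are not the solution for this. So, we're stuck with them.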

4 Answers

4

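Make them Oracle tables.

External tables are a replacement for SQL*Loader, not something to query on a daily basis.

Whenever the underlying file changes, just run an import script that loads the contents of the external table into an Oracle table.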
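
A minimal sketch of such an import script, assuming you are allowed to create a regular table. The name ext_temp_copy is hypothetical, and the table is assumed to exist already with the same columns as ext_temp:

```sql
-- Hypothetical refresh script. ext_temp_copy is an assumed name for a regular
-- heap table with the same columns as the external table ext_temp.
TRUNCATE TABLE ext_temp_copy;

-- Direct-path insert: the flat file is read once through the external table.
INSERT /*+ APPEND */ INTO ext_temp_copy
SELECT * FROM ext_temp;

COMMIT;

-- Optional: refresh optimizer statistics after the reload.
BEGIN
  DBMS_STATS.GATHER_TABLE_STATS(ownname => USER, tabname => 'EXT_TEMP_COPY');
END;
/
```

After the reload, the cursors would select from ext_temp_copy instead of ext_temp, and ordinary indexes become possible.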

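Here's what your namesake thinks about it (stolen from here):

You are using external tables instead of sqlldr.

With external tables you can

  • merge a flat file with an existing table in one statement.
  • sort a flat file on the way into a table you want compressed nicely.
  • do a parallel direct path load, with no need to split up the input file, write umpteen scripts and so on
  • run sqlldr, in effect, from a stored procedure or trigger (insert is not sqlldr)
  • do multi-table inserts
  • flow the data through a pipelined plsql function for cleansing/transformation

and so on. They are instead of sqlldr: a way to get data into the database without having to use sqlldr in the first place.

You would not normally query them day to day in an operational system; you use them to load data.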

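Update:

You will never get decent performance with a 3 GB table, since Oracle has to full-scan 3 GB for every query, and it will be a first-class, disk-reading, spindle-moving full scan, not the cheap cached imitation you see in the plan but barely notice in the actual execution time.

Try to persuade the guy to create a temporary table for you, which you can use to work with the data, and load it from the external table at the start of each session.

This is not the best solution, since it requires keeping a separate copy of the table in the temporary tablespace for each session, but it is much better performance-wise.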
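
A rough sketch of that setup, assuming the DBA agrees to create the objects. The names ext_temp_gtt and ext_temp_gtt_ix are hypothetical, and the indexed columns are a guess based on the cursor predicates in the question:

```sql
-- One-time DDL, run by the DBA. ext_temp_gtt is a hypothetical name.
-- WHERE 1 = 0 copies the column definitions without copying any rows.
CREATE GLOBAL TEMPORARY TABLE ext_temp_gtt
  ON COMMIT PRESERVE ROWS
  AS SELECT * FROM ext_temp WHERE 1 = 0;

-- Assumed index; create it while no session holds rows in the table.
CREATE INDEX ext_temp_gtt_ix ON ext_temp_gtt (code, orig);

-- At the start of each session: one full read of the external table.
INSERT INTO ext_temp_gtt
SELECT * FROM ext_temp;

COMMIT;
```

With ON COMMIT PRESERVE ROWS the rows stay visible to the session that loaded them until it ends, so all ten cursors could read the indexed copy instead of rescanning the flat file.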

answered 2009-10-20T17:47:48.080
3

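If you have to work around limitations that make no sense but can't be changed, that's really hard...

You are best off reading through the external table once and then building the data you need in your code, in an index-like data structure (basically an array with one element for each register you are looking for).

So your cursor would look like this: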

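-- Parameters are prefixed with p_ because, inside the query, a bare day or orig
-- would resolve to the column of the same name rather than to the parameter.
CURSOR F_CURSOR (p_day IN varchar, p_orig IN Number)
    IS
    select NVL(sum(table_4.f),0) value, table_4.CODE register
     from ext_temp table_4
    where
      --couple of conditions here, irrelevant for the question at hand.
      AND TO_CHAR(table_4.day,'YYYYMMDD') = p_day
      -- AND table_4.CODE = code -- don't use this condition!
      AND table_4.ORIG = p_orig
    -- one result row per CODE, so the sum needs a GROUP BY
    group by table_4.CODE;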

Your loop over the registers then becomes a cursor loop:

open F_cursor(register.day,register.orig);
LOOP
    fetch F_cursor into some_var;  -- some_var is declared as F_cursor%ROWTYPE
    EXIT WHEN F_cursor%NOTFOUND;
    result (some_var.register) := some_var.value;  -- result is an array indexed by CODE
END LOOP;
close F_cursor;

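So, instead of looping through the external table once for every register, you loop through it only once for all registers.

This can be extended to the ten cursors you mentioned.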
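
As an illustration of the index-like structure, here is a self-contained sketch that reads the external table once and caches one sum per (day, code, orig) combination. The key format, the names sum_tab, f_sums, l_key and l_value, and the sample lookup values are all made up for this example. It assumes f, code and orig are numeric and day is a DATE, as the question's cursor suggests:

```sql
DECLARE
  -- Associative array keyed by a string built from day, code and orig.
  TYPE sum_tab IS TABLE OF NUMBER INDEX BY VARCHAR2(100);
  f_sums  sum_tab;
  l_key   VARCHAR2(100);
  l_value NUMBER;
BEGIN
  -- Single pass over the external table: one aggregated row per combination.
  FOR r IN (SELECT TO_CHAR(t.day, 'YYYYMMDD') AS day_key,
                   t.code,
                   t.orig,
                   NVL(SUM(t.f), 0) AS f_sum
              FROM ext_temp t
             -- the question's other (elided) conditions would go in a WHERE here
             GROUP BY TO_CHAR(t.day, 'YYYYMMDD'), t.code, t.orig)
  LOOP
    f_sums(r.day_key || '|' || r.code || '|' || r.orig) := r.f_sum;
  END LOOP;

  -- Inside the main loop, a lookup then replaces OPEN/FETCH.
  l_key := '20091020' || '|' || 1 || '|' || 2;  -- made-up sample values
  IF f_sums.EXISTS(l_key) THEN
    l_value := f_sums(l_key);
  ELSE
    l_value := 0;  -- same result the NVL(SUM(...), 0) cursor returns for no rows
  END IF;
  DBMS_OUTPUT.PUT_LINE(l_value);
END;
/
```

If the ten cursors differ only in their extra conditions, each one could become another SUM(CASE WHEN ... THEN t.f ELSE 0 END) column in the same query, with one array per column. That is an assumption about cursors the question does not show.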

answered 2009-10-20T20:29:55.257
0

You could write the external table data into a temporary table (indexed, if necessary) and then run your multiple queries against it.

create table your_temp_table as select * from ext_temp;
create index your_desired_index on your_temp_table(indexed_field);

Then use your_temp_table directly for all your queries.

answered 2009-10-20T17:47:46.677
0

While totally agreeing with Quassnoi's suggestion that external tables do not appear to be the proper solution here, as well as DCookie's analogy that you're being bound and tossed overboard and asked to swim, there may at least be a way to structure your program so that the external table is only read once. My belief from your description is that all 10 cursors are reading from the external table, meaning that you are forcing Oracle to scan the external table 10 times.

Assuming this inference is correct, the simplest answer is likely to make the external table the driving cursor, similar to what IronGoofy suggested. Depending on what some_query in the code snippet below is doing,

for each register in some_query

and assuming that the fact that the query returns the same number of rows that are in the external table is not a coincidence, the simplest option would be to do something like

FOR register in (select * from ext_temp)
LOOP
  -- Figure out if the row should have been part of cursor 1
  IF( <<set of conditions>> ) 
  THEN
    <<do something>>
  -- Figure out if the row should have been part of cursor 2
  ELSIF( ... )
  ...
  END IF;
END LOOP;

or

FOR register in (select * 
                   from ext_temp a, 
                        (<<some query>>) b 
                  where a.column_name = b.column_name )
LOOP
  -- Figure out if the row should have been part of cursor 1
  IF( <<set of conditions>> ) 
  THEN
    <<do something>>
  -- Figure out if the row should have been part of cursor 2
  ELSIF( ... )
  ...
  END IF;
END LOOP;

It should be more efficient to take things a step further and move logic out of the cursors (and IF statements) and into the driving cursor. Using the simpler of the code snippets above (you could, of course, join some_query to these examples):

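-- condition1 and condition2 stand for the filter conditions of one original cursor.
-- The grouping columns are assumed from the parameters of F_CURSOR in the question.
FOR register in (select table_4.day, table_4.code, table_4.orig,
                        NVL(sum( (case when condition1 and condition2
                                       then table_4.f
                                       else 0
                                       end) ),
                             0) f_cursor_sum
                  from ext_temp table_4
                 group by table_4.day, table_4.code, table_4.orig)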
LOOP
  <<do something>>
END LOOP;

If, even after doing this, you still find that you are doing some row-by-row processing, you could go one step further and do a BULK COLLECT from the driving cursor into a locally declared collection and operate on that collection. You almost certainly don't want to fetch 3 GB of data into a local collection (though crushing the PGA might lead the DBA to conclude that temporary tables aren't such a bad thing, it's not something I would advise), but fetching a few hundred rows at a time using the LIMIT clause should make things a bit more efficient.
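
A minimal sketch of the BULK COLLECT ... LIMIT pattern described above. The names c_ext, ext_rows and l_rows are hypothetical, the batch size of 500 is an arbitrary choice, and the per-row work is left as a placeholder:

```sql
DECLARE
  CURSOR c_ext IS
    SELECT * FROM ext_temp;

  -- Local collection holding one batch of rows at a time.
  TYPE ext_rows IS TABLE OF c_ext%ROWTYPE;
  l_rows ext_rows;
BEGIN
  OPEN c_ext;
  LOOP
    -- Fetch at most 500 rows per round trip to keep PGA use bounded.
    FETCH c_ext BULK COLLECT INTO l_rows LIMIT 500;
    EXIT WHEN l_rows.COUNT = 0;

    FOR i IN 1 .. l_rows.COUNT LOOP
      NULL;  -- placeholder: per-row processing of l_rows(i) goes here
    END LOOP;
  END LOOP;
  CLOSE c_ext;
END;
/
```

The loop exits on l_rows.COUNT = 0 rather than on c_ext%NOTFOUND, because %NOTFOUND is already true when the final fetch returns a partial batch, and exiting on it before processing would skip those rows.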

answered 2009-10-20T21:36:03.973