oracle11g - Returning a TABLE-like structure in a SELECT

Question

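I am working with SQR to write a report. I cannot change the structure of the database, and I cannot use PL/SQL for this task.

Because the report may be run from remote locations, I don't want to make multiple calls to the database from SQR. My goal is to return everything in one SQL statement, containing only the records I need to report on, to improve run time over slow connections.

I have it working now, but I am concerned about database performance.

The transactions table has the following fields that are relevant here: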

account_num number(10) -- the account number
seq_num number(10) -- not a real sequence, it is unique to account_num
check_num number(10) -- the number on the check
postdate date

The primary key is (account_num, seq_num).

Sample data looks like this:

account_num    seq_num  check_num   postdate
----------- ---------- ---------- ----------
          1         11        200 2014-07-13
          1         16        201 2014-07-14
          1         23        205 2014-07-15
          2         52        282 2014-07-13
          2         66        284 2014-07-14
          2         72        231 2014-07-15
          3         11        201 2014-07-13
          3         12        202 2014-07-14
          3         15        203 2014-07-15

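Note: There are many other types of transactions in the table, but we filter on a list of transaction types. That is not very important to this question, so I left it out. Transaction volume appears to average about 750,000 per month (for all transactions, not just checks), of which about 10,000 check transactions are reported on average.

The selection criteria is to return all check transactions that occurred between two dates (inclusive - usually the first and last day of a month) for any account where the difference between two consecutive check numbers, when sorted by check number, is greater than X (we will use 10 in this case).

Using the sample data above, the results look like this: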

account_num    seq_num  check_num   postdate
----------- ---------- ---------- ----------
          2         52        282 2014-07-13
          2         66        284 2014-07-14
          2         72        231 2014-07-15

Since the difference between check_num 282 and 231 is greater than 10, all checks for account_num 2 are returned.

I built the following SQL to return the results above:
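To make the rule concrete, here is a minimal sketch that computes the gap to the next check number for every row of the sample data. The `sample_transactions` name is only an illustrative stand-in for the real table, built inline so the query can be run on its own.

```sql
-- Illustrative only: sample_transactions is a hypothetical inline copy of the sample data.
with sample_transactions as (
  select 1 as account_num, 11 as seq_num, 200 as check_num, date '2014-07-13' as postdate from dual union all
  select 1, 16, 201, date '2014-07-14' from dual union all
  select 1, 23, 205, date '2014-07-15' from dual union all
  select 2, 52, 282, date '2014-07-13' from dual union all
  select 2, 66, 284, date '2014-07-14' from dual union all
  select 2, 72, 231, date '2014-07-15' from dual union all
  select 3, 11, 201, date '2014-07-13' from dual union all
  select 3, 12, 202, date '2014-07-14' from dual union all
  select 3, 15, 203, date '2014-07-15' from dual
)
select
  account_num,
  seq_num,
  check_num,
  -- difference to the next-highest check number within the same account;
  -- null for the highest check number of each account
  lead(check_num) over (partition by account_num order by check_num)
    - check_num as check_gap
from
  sample_transactions
order by
  account_num, check_num;
```

Sorted this way, account 2 has check numbers 231, 282, 284, so the row for check 231 carries a gap of 51. The largest gaps for accounts 1 and 3 are 4 and 1, which is why only account 2 qualifies.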

select
  t1.*
from
  transactions t1
join (
  select
    t3.account_num,
    t3.min_postdate,
    t3.max_postdate,
    max(t3.check_diff)
  from (
    select distinct
      t4.account_num,
      lead(t4.check_num, 1, t4.check_num) over (partition by t4.account_num order by t4.check_num) - t4.check_num as check_diff,
      min(t4.postdate) over (partition by t4.account_num) min_postdate,
      max(t4.postdate) over (partition by t4.account_num) max_postdate
    from
      transactions t4
    where
      t4.postdate between trunc(sysdate,'mm') and last_day(trunc(sysdate))) t3
  group by
    t3.account_num,
    t3.min_postdate,
    t3.max_postdate
  having max(t3.check_diff) > 10) t2
    on t1.account_num = t2.account_num
    and t1.postdate between t2.min_postdate and t2.max_postdate
;

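I want to return the seq_num of all the checks from t4, so that I end up using the primary key on t1. I tried using LISTAGG, which does put the numbers together nicely.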

listagg(t4.seq_num,',') within group (order by seq_num) over (partition by account_num) seq_nums

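But this is where I got stuck... working with the comma-separated string. I can get it to work using INSTR, but it can't use the primary key and the performance is terrible.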

instr(',' || t2.seq_nums || ',', ',' || t1.seq_num || ',') > 0

I tried joining to it:

join (
  select
    t2.account_num,
    regexp_substr(t2.seq_nums,'[^,]+{1}',1,level) seq_num
  from
    dual
  connect by
    level <= length(regexp_replace(t2.seq_nums,'[^,]*')) + 1) t5
  on t1.account_num = t5.account_num
  and t1.seq_num = t5.seq_num

But I should have known better (ORA-00904) - t2 is never visible inside the select of the join.

Does anyone have any clever ideas?
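For reference, one idiom that is commonly used in 11g to split a delimited list per row is a left-correlated TABLE(CAST(MULTISET(...))) row generator, which is allowed to see a row source listed earlier in the same FROM clause. The sketch below is an assumption about how that could look here, not something I have run against the real table: the inline `t2` holds a single hypothetical row, and `sys.odcinumberlist` is used as the collection type.

```sql
-- Sketch under assumptions: t2 is a hypothetical one-row stand-in for the LISTAGG result.
with t2 as (
  select 2 as account_num, '52,66,72' as seq_nums from dual
)
select
  t2.account_num,
  -- pick the n-th comma-separated element, where n comes from the row generator
  to_number(regexp_substr(t2.seq_nums, '[^,]+', 1, lvl.column_value)) as seq_num
from
  t2,
  -- t2 must appear first so the collection expression can reference it
  table(cast(multiset(
    select level
    from dual
    connect by level <= regexp_count(t2.seq_nums, ',') + 1
  ) as sys.odcinumberlist)) lvl;
```

The intent is one output row per list element. Even if this works, the LISTAGG result is a VARCHAR2 and in 11g is limited to 4000 bytes, so an account with many checks could fail before the split is ever reached.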

score 2 · Accepted Answer

I would avoid the join entirely by using subqueries and more analytic functions:

select
  account_num, seq_num, check_num, postdate
from
  (
    -- carry the largest gap found for the account onto every one of its rows
    select account_num,
      seq_num,
      check_num,
      postdate,
      max(check_gap) over (partition by account_num) as max_check_gap
    from
      (
        -- gap from each check to the next-highest check number in the same account
        -- (null for the account's highest check number)
        select account_num,
          seq_num,
          check_num,
          postdate,
          lead(check_num) over (partition by account_num order by check_num)
            - check_num as check_gap
        from
          transactions
        where postdate between trunc(sysdate,'mm') and last_day(trunc(sysdate))
      )
  )
where
  max_check_gap > 10
order by account_num, check_num;

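SQL Fiddle with your original query, an intermediate attempt that misread the 10-check gap rule, and this version. All of them give the same result for this data.

This doesn't address the specific question you asked, but hopefully it addresses your underlying performance concern in a different way.

If you do want to stick with the join - which hits the table multiple times and is therefore less efficient - you can use collect. This is a rough approach, and the table access could probably be improved: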
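A side note on the date filter used in the queries above: last_day(trunc(sysdate)) evaluates to midnight at the start of the last day of the month, so if postdate carries a time component, rows posted later on that day fall outside the BETWEEN range. Whether that matters depends on how postdate is populated, which the question does not say. A small sketch to inspect the boundaries, with a half-open alternative shown in the comments:

```sql
-- Compare the upper bound used by BETWEEN with an exclusive first-of-next-month bound.
select
  to_char(trunc(sysdate, 'mm'), 'YYYY-MM-DD HH24:MI:SS')                as range_start,
  to_char(last_day(trunc(sysdate)), 'YYYY-MM-DD HH24:MI:SS')            as between_upper_bound,
  to_char(add_months(trunc(sysdate, 'mm'), 1), 'YYYY-MM-DD HH24:MI:SS') as exclusive_upper_bound
from
  dual;

-- A half-open predicate that covers the whole last day, assuming postdate may include a time:
--   where postdate >= trunc(sysdate, 'mm')
--     and postdate <  add_months(trunc(sysdate, 'mm'), 1)
```

If postdate is always stored at midnight, both forms select the same rows and the original filter is fine as written.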

select
  t1.*
from
  transactions t1
join (
  select
    t3.account_num,
    collect(t3.seq_num) as seq_nums,
    t3.min_postdate,
    t3.max_postdate,
    max(t3.check_diff)
  from (
    select distinct
      t4.account_num,
      t4.seq_num,
      lead(t4.check_num, 1, t4.check_num) over (partition by t4.account_num order by t4.check_num) - t4.check_num as check_diff,
      min(t4.postdate) over (partition by t4.account_num) min_postdate,
      max(t4.postdate) over (partition by t4.account_num) max_postdate
    from
      transactions t4
    where
      t4.postdate between trunc(sysdate,'mm') and last_day(trunc(sysdate))) t3
  group by
    t3.account_num,
    t3.min_postdate,
    t3.max_postdate
  having max(t3.check_diff) > 10) t2
    on t1.account_num = t2.account_num
    and t1.seq_num in (select * from table(t2.seq_nums))
;

SQL Fiddle.
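If the goal of the join is mainly to look rows up in t1 by the primary key, another option worth trying is a multi-column IN against the analytic subquery, which avoids building a collection or a delimited string. This is only a sketch reusing the table and column names from the question. It still reads the table twice, and whether the optimizer actually uses the primary key index depends on the execution plan.

```sql
-- Sketch: look up qualifying rows by (account_num, seq_num), the primary key from the question.
select
  t1.*
from
  transactions t1
where
  (t1.account_num, t1.seq_num) in (
    select account_num, seq_num
    from (
      select account_num,
        seq_num,
        -- largest consecutive check-number gap for the account
        max(check_gap) over (partition by account_num) as max_check_gap
      from (
        select account_num,
          seq_num,
          lead(check_num) over (partition by account_num order by check_num)
            - check_num as check_gap
        from transactions
        where postdate between trunc(sysdate,'mm') and last_day(trunc(sysdate))
      )
    )
    where max_check_gap > 10
  );
```

Comparing the plan for this against the pure analytic version would show whether the extra table access costs anything noticeable at the volumes described.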
