
How can I replace the following self-join using analytic functions?

SELECT
  t1.col1 col1,
  t1.col2 col2,
  SUM(
    extract(hour FROM (t1.times_stamp - t2.times_stamp)) * 3600
    + extract(minute FROM (t1.times_stamp - t2.times_stamp)) * 60
    + extract(second FROM (t1.times_stamp - t2.times_stamp))
  ) div,
  COUNT(*) tot_count
FROM tab1 t1,
     tab1 t2
WHERE t2.col1 = t1.col1
  AND t2.col2 = t1.col2
  AND t2.col3 = t1.sequence_num
  AND t2.times_stamp < t1.times_stamp
  AND t2.col4 = 3
  AND t1.col4 = 4
  AND t2.col5 NOT IN (103, 123)
  AND t1.col5 != 549
GROUP BY t1.col1, t1.col2

2 Answers


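I'm fairly sure you won't be able to replace the self-join with analytics, because you are using an inter-row operation (t1.time_stamp - t2.time_stamp). Analytics can only access the values of the current row and the values of aggregate functions over a subset of rows (the windowing clause).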
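
To illustrate what "an aggregate function over a subset of rows" looks like, here is a minimal sketch against the question's tab1. The alias rows_so_far is hypothetical, and the column name times_stamp is taken from the question's query rather than from this answer.

```sql
-- Illustrative only: a running count per (col1, col2) group.
-- PARTITION BY picks the group, ORDER BY sorts it, and the ROWS clause
-- (the windowing clause) limits the aggregate to the rows up to the current one.
SELECT col1,
       col2,
       times_stamp,
       COUNT(*) OVER (
         PARTITION BY col1, col2
         ORDER BY times_stamp
         ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
       ) AS rows_so_far
FROM tab1;
```

Each output row carries its own column values plus one aggregate computed over its window.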

For further analysis of the limitations of analytics, see this article by Tom Kyte.

answered 2010-03-19T10:40:39.840

It looks like you could almost eliminate the self-join to t2 by replacing

t1.time_stamp - t2.time_stamp

with something like

t1.time_stamp - lag(t1.time_stamp) over (partition by col1, col2 order by time_stamp)

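The different filters on col4 and col5 for t1 and t2 are what prevent you from doing this.
Analytic functions are applied after the main query's where / group by, so you would need a single filter on t1 to be able to use lag / lead to refer to the preceding or following row in the sequence.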
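
As a rough sketch of that ordering, assuming the question's tab1 and its times_stamp column: the inner where decides which rows lag can see, so any condition that depends on the previous row has to move to an outer query. The shared filter col4 IN (3, 4) and the aliases prev_ts and prev_col4 are assumptions made for illustration. This is not equivalent to the original self-join, which pairs rows through col3 = sequence_num rather than by adjacency.

```sql
-- Sketch only: one shared filter in the inner query, per-row checks outside.
SELECT col1, col2, times_stamp, prev_ts
FROM (
  SELECT col1,
         col2,
         col4,
         times_stamp,
         -- lag() only ever sees rows that survived the inner WHERE
         LAG(times_stamp) OVER (PARTITION BY col1, col2 ORDER BY times_stamp) AS prev_ts,
         LAG(col4)        OVER (PARTITION BY col1, col2 ORDER BY times_stamp) AS prev_col4
  FROM tab1
  WHERE col4 IN (3, 4)          -- hypothetical single filter
)
WHERE col4 = 4                  -- the current row plays the role of t1
  AND prev_col4 = 3;            -- the preceding row plays the role of t2
```

The col5 conditions from the question are left out here; they would need the same treatment, which is exactly the difficulty described above.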

Also, you would need to push the sum / group by to an outer query so that the aggregation happens after the analytic function:

select col1, col2, sum(timestamp_diff) from (
  select col1, col2,
         time_stamp - lag(time_stamp) over (partition by col1, col2 order by time_stamp) as timestamp_diff
  from tab1
  where ....
) group by col1, col2
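
A slightly fuller sketch of the same shape follows. It assumes times_stamp (the question's spelling) is a TIMESTAMP column, in which case the subtraction yields an INTERVAL DAY TO SECOND that Oracle's SUM does not accept directly, so the difference is converted to seconds with the question's EXTRACT arithmetic before summing. The filter col4 IN (3, 4) and the alias ts_diff are placeholders, not the question's actual conditions.

```sql
-- Sketch only: analytic function in the inner query, aggregation outside.
SELECT col1,
       col2,
       SUM(
         EXTRACT(HOUR   FROM ts_diff) * 3600
         + EXTRACT(MINUTE FROM ts_diff) * 60
         + EXTRACT(SECOND FROM ts_diff)
       ) AS div,                 -- like the question's expression, ignores the day component
       COUNT(ts_diff) AS tot_count  -- skips the first row of each group, where ts_diff is NULL
FROM (
  SELECT col1,
         col2,
         times_stamp
           - LAG(times_stamp) OVER (PARTITION BY col1, col2 ORDER BY times_stamp) AS ts_diff
  FROM tab1
  WHERE col4 IN (3, 4)           -- hypothetical single filter
)
GROUP BY col1, col2;
```

If the column were a DATE instead, the subtraction would return a number of days and could be summed directly.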
answered 2014-12-23T17:49:26.230