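I have two tables. The first contains activations, the second contains deactivations.
I have to associate each deactivation with exactly one activation, using the following rules:
- The activation must precede the deactivation, but by no more than 92 days.
- An activation that is already associated with a deactivation cannot be associated again.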
So, with some data:
-- a: activations, b: deactivations
create table a (id1 integer, date1 date);
create table b (id2 integer, date2 date);
insert into a values (1, '1-Feb-2013');
insert into a values (2, '2-Feb-2013');
insert into a values (3, '3-Feb-2013');
insert into a values (4, '1-Mar-2013');
insert into a values (5, '2-Mar-2013');
insert into a values (6, '1-May-2013');
insert into a values (7, '19-May-2013');
insert into b values (1, '1-May-2013');
insert into b values (2, '1-May-2013');
insert into b values (3, '15-May-2013');
insert into b values (4, '16-May-2013');
insert into b values (5, '17-May-2013');
insert into b values (6, '18-May-2013');
Desired output:
id1  date1       id2  date2       act_ord  deact_ord
1    2013-02-01  1    2013-05-01  1        1
2    2013-02-02  2    2013-05-01  2        2
4    2013-03-01  3    2013-05-15  4        3
5    2013-03-02  4    2013-05-16  5        4
6    2013-05-01  5    2013-05-17  6        5
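The desired output is consistent with a greedy reading of the rules: take the deactivations in (date2, id2) order and give each one the earliest activation that is still unused and falls inside its window. The anonymous PL/SQL block below is a minimal sketch of that reading, assuming it is the intended one. It is meant as a reference for checking results on a small sample, not as the fast solution asked for. The names `t_acts`, `acts` and `i` are illustrative, and the window is the same `date2 - 91` used in the queries in this post.

```sql
-- Sketch only: greedy matching for a single client, under the assumption
-- that each deactivation takes the earliest unused activation in its window.
declare
  type t_acts is table of a%rowtype;   -- illustrative collection type
  acts t_acts;
  i    pls_integer := 1;               -- next activation not yet used or skipped
begin
  -- Load the activations once, oldest first.
  select * bulk collect into acts from a order by date1, id1;

  for d in (select id2, date2 from b order by date2, id2) loop
    -- Skip activations that are too old for this deactivation. Because the
    -- deactivations are processed in date order, they are too old for every
    -- later deactivation as well.
    while i <= acts.count and acts(i).date1 < d.date2 - 91 loop
      i := i + 1;
    end loop;

    -- The activation at position i is now the earliest unused one inside the
    -- window, provided it does not come after the deactivation.
    if i <= acts.count and acts(i).date1 <= d.date2 then
      dbms_output.put_line(
        acts(i).id1 || ' ' || to_char(acts(i).date1, 'YYYY-MM-DD') || ' ' ||
        d.id2       || ' ' || to_char(d.date2, 'YYYY-MM-DD'));
      i := i + 1;                      -- this activation is consumed
    end if;
  end loop;
end;
/
```

With server output enabled (for example `set serveroutput on` in SQL*Plus), this prints the pairs (1, 1), (2, 2), (4, 3), (5, 4) and (6, 5) for the sample data. Activation 3 is skipped because it is more than 91 days older than deactivation 3, and deactivation 6 stays unmatched because activation 7 comes after it.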
The query that generates the candidate pairs would be:
select id1, date1, id2, date2
from a
join b
on a.date1 >= b.date2 - 91
and b.date2 >= a.date1;
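To get a feel for how quickly the candidates multiply, the same join can be aggregated per deactivation. This is only a sketch for sizing the problem; the alias `candidate_cnt` is made up for the example.

```sql
-- Sketch: number of candidate activations for each deactivation.
select b.id2,
       b.date2,
       count(*) as candidate_cnt   -- illustrative alias
from a
join b
  on a.date1 >= b.date2 - 91
 and b.date2 >= a.date1
group by b.id2, b.date2
order by b.date2, b.id2;
```

On the sample data this gives 6 candidates for each of deactivations 1 and 2 and 3 for each of the other four, so 24 candidate pairs for only 6 deactivations.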
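I managed to write a correct query using connect by, but it is too slow. (I have millions of clients, each with thousands of device activations and deactivations. This example is for a single client.)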
with chrn as
(
  -- candidate pairs, ranked by activation order and by deactivation order
  select id1, date1, id2, date2,
         dense_rank() over (order by date1, id1) as act_ord,
         dense_rank() over (order by date2, id2) as deact_ord
  from a
  join b
    on a.date1 >= b.date2 - 91
   and b.date2 >= a.date1
)
select *
from (
  -- at each level keep the pair with the smallest act_ord + deact_ord
  select s.*,
         row_number() over (partition by lvl order by act_ord + deact_ord) as rnk
  from (
    -- walk the candidates: both ranks must increase, one of them by exactly 1
    select a1.*, level lvl
    from chrn a1
    connect by prior deact_ord < deact_ord
           and prior act_ord < act_ord
           and (prior deact_ord = deact_ord - 1 or prior act_ord = act_ord - 1)
    start with deact_ord = 1 and act_ord = 1
  ) s
)
where rnk = 1;
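I would like to find a faster solution for this, perhaps one that uses only analytic functions. The recursive query is too slow because the number of candidates and paths is too large, or because I have not managed to reduce the number of candidates and paths.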