SQL is the ideal tool for joining these tables, because it offers the most flexibility in how data can be joined.
Using DomPazz's test data:
data taxes;
    informat state $8.
             city  $12.
             good  $12.
             tax   best.;
    input state $ city $ good $ tax;
    datalines;
all all all 0.07
all all chicken 0.04
all jackson all 0.01
arizona all meat 0.02
arizona phoenix meat 0.04
arizona tucson meat 0.03
hawaii all all 0.08
hawaii all chicken 0.11
nevada reno cigar 0.11
nevada vegas cigar 0.13
;;;
run;
data to_look_up;
    informat lu_state $8.
             lu_city  $12.
             lu_good  $12.;
    input lu_state $ lu_city $ lu_good $;
    datalines;
nevada reno cigar
nevada reno chicken
hawaii honalulu chicken
texas dallas steak
;;;
run;
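The query below joins each row in the to_look_up table to the rows in the taxes table where the state matches or the state in the taxes table is 'all', and the city matches or the city in the taxes table is 'all', and the good matches or the good in the taxes table is 'all'.
This can result in more than one row from the taxes table matching a single row in the to_look_up table. We can still pick the best match by prioritising the matches: an exact state match ranks ahead of a state equal to 'all', and likewise for city and good.
The GROUP BY clause is important here. It should list the combination of variables that uniquely identifies a row in the to_look_up table. With that in place, we can select the best match for each row in the to_look_up table and eliminate all the other matches.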
proc sql;
    create table taxes_applied as
    select *
           /* Prioritise state, city and good matches. */
         , case when to_look_up.lu_state eq taxes.state then 2
                when 'all' eq taxes.state then 1
           end as match_state
         , case when to_look_up.lu_city eq taxes.city then 2
                when 'all' eq taxes.city then 1
           end as match_city
         , case when to_look_up.lu_good eq taxes.good then 2
                when 'all' eq taxes.good then 1
           end as match_good
    from to_look_up
    /* Join taxes table on matching state, city and good or matching 'all' rows. */
    left join
         taxes
      on  ( to_look_up.lu_state eq taxes.state
         or 'all' eq taxes.state
          )
      and ( to_look_up.lu_city eq taxes.city
         or 'all' eq taxes.city
          )
      and ( to_look_up.lu_good eq taxes.good
         or 'all' eq taxes.good
          )
    /* Process for each row in to_look_up table. */
    group by to_look_up.lu_state
           , to_look_up.lu_city
           , to_look_up.lu_good
    /* Select best match. */
    having match_state eq max (match_state)
       and match_city eq max (match_city)
       and match_good eq max (match_good)
    order by to_look_up.lu_state
           , to_look_up.lu_city
           , to_look_up.lu_good
           , match_state
           , match_city
           , match_good
    ;
quit;
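One thing to watch with the query above: the three maxima in the HAVING clause are computed independently, so a lookup row is only kept when a single taxes row reaches all three at once. That holds for the four test rows, but take a hypothetical lookup row such as arizona jackson meat (not in the test data). Its candidates would be all all all (1, 1, 1), all jackson all (1, 2, 1) and arizona all meat (2, 1, 2). The maxima are (2, 2, 2), no candidate reaches them, and the row would drop out. A possible workaround, sketched here under the assumption that state should outrank city and city should outrank good, is to fold the three flags into one score and keep the row with the highest score. The weights and the names match_score and taxes_applied_scored are illustrative choices, not part of the original answer.

```sas
proc sql;
    create table taxes_applied_scored as
    select to_look_up.*
         , taxes.*
           /* Assumed weighting: state outranks city, city outranks good. */
           /* An 'all' match contributes 0, an exact match contributes its weight. */
         , ( case when to_look_up.lu_state eq taxes.state then 100 else 0 end
           + case when to_look_up.lu_city  eq taxes.city  then 10  else 0 end
           + case when to_look_up.lu_good  eq taxes.good  then 1   else 0 end
           ) as match_score
    from to_look_up
    /* Same join condition as the query above. */
    left join
         taxes
      on  ( to_look_up.lu_state eq taxes.state or 'all' eq taxes.state )
      and ( to_look_up.lu_city  eq taxes.city  or 'all' eq taxes.city  )
      and ( to_look_up.lu_good  eq taxes.good  or 'all' eq taxes.good  )
    /* One group per row of to_look_up. */
    group by to_look_up.lu_state
           , to_look_up.lu_city
           , to_look_up.lu_good
    /* Keep only the highest-scoring candidate in each group. */
    having calculated match_score eq max (calculated match_score)
    order by to_look_up.lu_state
           , to_look_up.lu_city
           , to_look_up.lu_good
    ;
quit;
```

For the four test rows this should pick the same taxes rows as the original query. As long as each state, city and good combination appears only once in taxes, two candidates for the same lookup row cannot tie on the score.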
A join similar to this one can also be used to produce subtotals in a summary table.
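As a rough illustration of the subtotal idea, the sketch below joins a list of summary levels, where 'all' again acts as a wildcard, to a detail table and sums within each level. The tables sales and levels, their columns and all the values are made up for this example and are not part of the original test data.

```sas
/* Hypothetical detail table; names and values are illustrative only. */
data sales;
    informat state $8.
             city  $12.;
    input state $ city $ amount;
    datalines;
arizona phoenix 10
arizona tucson 20
nevada reno 5
;;;
run;

/* Hypothetical summary levels: 'all' stands for "every value". */
data levels;
    informat lv_state $8.
             lv_city  $12.;
    input lv_state $ lv_city $;
    datalines;
all all
arizona all
arizona phoenix
arizona tucson
nevada all
nevada reno
;;;
run;

proc sql;
    create table sales_summary as
    select levels.lv_state
         , levels.lv_city
         , sum (sales.amount) as total_amount
    from levels
    /* Each level row collects every detail row it covers. */
    left join
         sales
      on  ( levels.lv_state eq sales.state or levels.lv_state eq 'all' )
      and ( levels.lv_city  eq sales.city  or levels.lv_city  eq 'all' )
    group by levels.lv_state
           , levels.lv_city
    ;
quit;
```

With these made-up rows, the all all level should total 35, arizona all should total 30, and nevada all should total 5, alongside the per-city rows.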