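I'm working with some sensitive data, so I'm wary of using RPostgreSQL. I've loaded all the necessary data into data frames in R, and I'm trying to run queries against them with the sqldf() function. The queries were written several years ago for Oracle SQL Developer, so we're trying to avoid rewriting the scripts completely. Being able to reuse the pre-written SQL would save us a lot of time.

The script seems to fail when it reaches the over() SQL function. I know that base sqldf doesn't support over(). I've read that over() works with the RPostgreSQL package, but would that require me to send my data frames to an external database? From my understanding of RPostgreSQL, you need to connect to PostgreSQL and create a new database. We can't send this data to an external data storage system. Is there another way to use over() while keeping the data frames local to my PC?
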
select program, importance_level, count( distinct subject_id )
from
(
    select r.subject_id,
        case
            when rc_level is not null and rc_level <> 'NA'
                then 'bad_guy'
            when (rc_level is null or rc_level = 'NA') and
                 (substr( r.base_category, 2, 2 ) in ( '5R', '8Q', '8P' )
                  or r.process_name in ('On The Way'))
                then 'run_away'
            when (rc_level is null or rc_level = 'NA') and r.process_name = 'Fancy Order'
                then 'repeater'
            when (rc_level is null or rc_level = 'NA') and
                 (a.current_program_code in ( 'BOP', 'IAS', 'LIS', 'SIS' )
                  or method_code in ( 'SIP', 'POB' )
                  or substr( r.base_category, 2, 2 ) in ( '9F', '7G' ))
                then 'NEWBIE'
            else 'Other'
        end
        as importance_level,
        case
            when a.current_program_code in ('123', 'ABC', 'DEF', 'HIJ', 'KLM', 'NOP', 'QRS' ) then 'YAW'
            when a.current_program_code in ( 'RE', 'FDS', 'QWE', 'WER', 'ERT','RTY','TYU' ) then 'PO'
            when a.current_program_code in ( 'LEP' ) then 'MOM'
            else a.current_program_code
        end
        as program
    from FY16DATA r
    left join (
        select distinct * from (
            select subject_id,
                first_value(current_program_code) over (partition by subject_id order by start_date desc) as current_program_code,
                first_value(process_name) over (partition by subject_id order by start_date desc) as process_name,
                first_value(method_code) over (partition by subject_id order by start_date desc) as method_code,
                max(load_fy) over (partition by subject_id) as load_fy
            from FY16NAME
        )
    ) a on r.subject_id = a.subject_id
    where r.load_fy = '2016' and r.thing_status <> 'Over' and r.thing_status in ('Head','Hair','Face')
)
group by program, importance_level;
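
To make clear what the windowed subquery (aliased as a) is meant to produce, here is a minimal sketch in base R that computes the same thing entirely in memory, with no database connection. The rows below are made-up stand-ins for FY16NAME (the real data can't be shared), and the sketch assumes start_date is a Date column and that no subject has two rows with the same start_date. With ties, first_value() would pick an arbitrary row in SQL as well.

```r
# Hypothetical toy stand-in for FY16NAME; the values are invented for illustration.
FY16NAME <- data.frame(
  subject_id           = c(1L, 1L, 2L, 2L, 2L),
  start_date           = as.Date(c("2015-01-10", "2016-03-01",
                                   "2014-06-15", "2015-09-30", "2016-02-20")),
  current_program_code = c("BOP", "LEP", "RE", "FDS", "IAS"),
  process_name         = c("On The Way", "Fancy Order", "Intake",
                           "Intake", "On The Way"),
  method_code          = c("SIP", "POB", "XYZ", "SIP", "POB"),
  load_fy              = c("2015", "2016", "2014", "2015", "2016"),
  stringsAsFactors     = FALSE
)

# Sort so that, within each subject, the most recent start_date comes first.
ord <- FY16NAME[order(FY16NAME$subject_id, -as.numeric(FY16NAME$start_date)), ]

# Keep the first row per subject. This mirrors
#   first_value(col) over (partition by subject_id order by start_date desc)
# followed by "select distinct", which leaves one row per subject.
a <- ord[!duplicated(ord$subject_id),
         c("subject_id", "current_program_code", "process_name", "method_code")]

# Mirror max(load_fy) over (partition by subject_id): per-subject maximum.
max_fy <- tapply(FY16NAME$load_fy, FY16NAME$subject_id, max)
a$load_fy <- as.vector(max_fy[as.character(a$subject_id)])

rownames(a) <- NULL
print(a)
```

If something like this is acceptable, the data frame a could presumably be referenced by name in the sqldf() call in place of the inner subquery (left join a on r.subject_id = a.subject_id), leaving the rest of the script untouched. That would still mean editing each script that uses over(), which is what we were hoping to avoid.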