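I am working with prescription data and want to generate a summary variable that measures an individual's adherence to medication over a given period. This variable is called proportion of days covered (PDC). I know the steps for building the variable, but I cannot get the loop at the end to work. The steps are outlined in a paper by Leslie et al., which provides SAS code. http://www2.sas.com/proceedings/forum2007/043-2007.pdf
The first step is to organise the data in wide format, so that each unique person has their own row containing every date on which they filled a prescription and how much medication they received. The data frame also has an index date, which is the first date the individual collected a prescription (entry into the study), and a study end date (start date + 180 days of follow-up). This part all works; here is a sample of the data frame. xd = fill date and days_supply = number of tablets the individual received on that date.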
df[(1:4), c(1,2,3,4,5,6,42,43)]
   ID                 xd.1       days_supply.1 xd.2       days_supply.2 xd.3       start_dt   end_dt
1  Patient HAI0674228 2011-05-05            28 2011-05-11            28 2011-05-24 2011-05-05 2011-10-31
10 Patient HAI0937281 2011-01-06            28 2011-03-01            28 2011-03-28 2011-01-06 2011-07-04
12 Patient HAI1007704 2011-01-29            28 2011-03-01            28 2011-03-31 2011-01-29 2011-07-27
18 Patient HAI1028993 2011-05-17            30 2011-06-16            30 0          2011-05-17 2011-11-12
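For readers who start from one row per fill, a wide layout like the sample above could be produced with base R's `reshape()`. This is only a sketch: the `long` data frame and its `fill_no` column are assumptions made for illustration, not part of my actual data.

```r
# Hypothetical long-format input: one row per prescription fill.
long <- data.frame(
  ID          = c("1234", "1234", "1233", "1233"),
  fill_no     = c(1, 2, 1, 2),            # assumed fill counter per person
  xd          = c("2011-05-05", "2011-05-11", "2011-01-06", "2011-03-01"),
  days_supply = c(28, 28, 28, 28),
  stringsAsFactors = FALSE
)

# One row per ID; xd and days_supply are spread into xd.1, days_supply.1, ...
wide <- reshape(long, idvar = "ID", timevar = "fill_no", direction = "wide")
names(wide)
```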
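The next step, which uses arrays and loops, is where I am having trouble.
First, I need to create an array with one dummy variable for each day of the follow-up period (180 days), setting each value to 0. (This serves as a day-by-day diary of medication coverage: tablets available, yes or no.)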
lapply(1:180, function(i) print(i))->days2
days2[]=0
This works fine.
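As a possible alternative to a list, the diary could be held in a numeric matrix with one row per patient and one column per follow-up day. This is a sketch only; `n_patients`, `n_days`, and `diary` are illustrative names, and the patient count is an assumption.

```r
n_patients <- 4    # assumed number of patients, for illustration only
n_days     <- 180  # length of the follow-up period

# Every cell starts at 0 (no tablets available on that day).
diary <- matrix(0, nrow = n_patients, ncol = n_days,
                dimnames = list(NULL, paste0("day", seq_len(n_days))))
dim(diary)
```

A matrix keeps all patients in one object, so row sums give covered days directly once the diary is filled.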
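Next, I need to make two more arrays, one grouping the xd variables and one grouping the days_supply variables. These are meant to drive the do loop that fills in the diary for each patient.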
df[(1:5), c(1,2,4,6,8,9)]->filldates
filldates
array(filldates)->filldates
is.array(filldates)
df[(1:5), c(1,3,5,7,8,9)]->days_supply
days_supply
array(days_supply)->days_supply
is.array(days_supply)
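This works fine.
Next comes setting up the loop that takes the information in each array (fill dates and days' supply) and fills in the medication diary. This is where I am stuck. I would like the diary to look like this: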
ID      Day 1  Day 2  Day 3  Day 4-Day29  Day 30  Day 31  Day 32  Day 33
X12344  1      1      1      1            0       0       1       1
I would appreciate any advice on how to set up a loop to do this.
Thank you in advance!
Code to generate the DF used here:
ID=c("1234", "1233", "1235", "1222") ###random IDs
dt_fill1=as.character(c("2011-05-05", "2011-01-06", "2011-01-29", "2011-05-17"))
days_supp1=c(28,28,28,30)
dt_fill2=as.character(c("2011-05-11", "2011-03-01", "2011-03-01", "2011-06-16"))
days_supp2=c(28,28,28,30)
st_date=as.character(c("2011-05-05", "2011-01-06", "2011-01-29", "2011-05-17"))
end_date=as.character(c("2011-10-31", "2011-07-04", "2011-07-27", "2011-11-12"))
df=data.frame(ID, dt_fill1, days_supp1, dt_fill2, days_supp2, st_date, end_date)
df
A more detailed df:
ID=c("hai0674228", "hai0937281", "hai1007704", "hai1028993", "hai1095329", "hai1537305", "hai1706893", "hai1989514", "hai2202516", "hai2224780")
dt_fill1=as.character(c("2011-05-05", "2011-01-06", "2011-01-29", "2011-05-17", "2011-01-11", "2011-01-26", "2011-01-06", "2011-01-10", "2011-01-07", "2011-04-26" ))
days_supp1=c(28,28,28,30, 28,30,28,28,28,30)
dt_fill2=as.character(c("2011-05-11", "2011-03-01", "2011-03-01", "2011-06-16", "2011-02-08", "2011-03-14", "0", "2011-02-04", "2011-02-05", "2011-05-17"))
days_supp2=c(28,28,28,30,28,30,0,28,28,30)
dt_fill3=as.character(c("2011-05-24", "2011-03-28", "2011-03-31", "0", "2011-03-02", "2011-03-19", "0", "2011-03-02", "2011-03-07", "2011-06-14"))
days_supp3=c(30,28,28,0,28,30,0,28,28,30)
dt_fill4=as.character(c("2011-06-21", "2011-04-27", "2011-04-25", "0", "2011-03-30", "2011-04-15", "0", "2011-03-31", "2011-03-28", "2011-06-29"))
days_supp4=c(28,28,28,0,28,30,0,28,28,30)
dt_fill5=as.character(c("0", "2011-05-20", "2011-05-23", "0", "2011-05-02", "2011-05-12", "0", "2011-04-28", "2011-04-28", "0"))
days_supp5=c(0,28,28,0,28,30,0,28,28,0)
st_date=as.character(c("2011-05-05", "2011-01-06", "2011-01-29", "2011-05-17", "2011-01-11", "2011-01-26", "2011-01-06", "2011-01-10", "2011-01-07", "2011-04-26"))
end_date=as.character(c("2011-10-31", "2011-07-04", "2011-07-27", "2011-11-12", "2011-07-09", "2011-07-24", "2011-07-04", "2011-07-08", "2011-07-05", "2011-10-22"))
df=data.frame(ID, dt_fill1, days_supp1, dt_fill2, days_supp2, dt_fill3, days_supp3, dt_fill4, days_supp4, dt_fill5, days_supp5, st_date, end_date)
df
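To make the goal concrete, here is a rough sketch of the diary-filling loop I have in mind, run on two patients from the detailed df above. The helper `fill_diary()` is a hypothetical name, and the sketch assumes that a fill date of "0" means "no fill" and that follow-up day 1 is the start date. It simply marks every day covered by any fill; it does not shift overlapping supplies forward, so it may not match how the SAS code in the paper treats early refills.

```r
# Hypothetical helper: returns a 0/1 vector, one entry per follow-up day.
fill_diary <- function(fill_dates, supplies, start_date, n_days = 180) {
  diary <- integer(n_days)
  for (k in seq_along(fill_dates)) {
    # Skip missing fills (assumed to be coded as "0" dates or 0 supply).
    if (is.na(fill_dates[k]) || supplies[k] <= 0) next
    offset <- as.numeric(fill_dates[k]) - as.numeric(start_date)
    first  <- max(offset + 1, 1)               # clip to start of follow-up
    last   <- min(offset + supplies[k], n_days) # clip to end of follow-up
    if (first <= last) diary[first:last] <- 1L
  }
  diary
}

# Two patients taken from the detailed df (first four fills only).
df <- data.frame(
  ID = c("hai0674228", "hai1028993"),
  dt_fill1 = c("2011-05-05", "2011-05-17"), days_supp1 = c(28, 30),
  dt_fill2 = c("2011-05-11", "2011-06-16"), days_supp2 = c(28, 30),
  dt_fill3 = c("2011-05-24", "0"),          days_supp3 = c(30, 0),
  dt_fill4 = c("2011-06-21", "0"),          days_supp4 = c(28, 0),
  st_date  = c("2011-05-05", "2011-05-17"),
  stringsAsFactors = FALSE
)

fill_cols   <- grep("^dt_fill", names(df), value = TRUE)
supply_cols <- grep("^days_supp", names(df), value = TRUE)
n_days      <- 180

# One diary row per patient; "0" dates become NA under the explicit format.
diary <- t(sapply(seq_len(nrow(df)), function(i) {
  fill_diary(
    fill_dates = as.Date(unlist(df[i, fill_cols]), format = "%Y-%m-%d"),
    supplies   = unlist(df[i, supply_cols]),
    start_date = as.Date(df$st_date[i], format = "%Y-%m-%d"),
    n_days     = n_days
  )
}))
rownames(diary) <- df$ID
colnames(diary) <- paste0("day", seq_len(n_days))

# PDC = covered days / days in the follow-up period.
pdc <- rowSums(diary) / n_days
pdc
```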