
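I'm trying to use R to measure how many days' supply of a prescription a person already has on hand when they get a refill, taking all previous prescriptions into account. For example, if I have this table...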

  member rx_id  fill_date    to_date days_supply
1      A     1 2018-10-01 2018-10-02           2
2      B     1 2016-11-07 2016-11-10           4
3      B     2 2016-11-07 2016-12-04          28
4      B     3 2016-11-08 2016-11-09           2
5      B     4 2016-11-10 2016-12-03          24

I would expect the following output:

  member rx_id  fill_date    to_date days_supply_on_hand
1      A     1 2018-10-01 2018-10-02                   0
2      B     1 2016-11-07 2016-11-10                   0
3      B     2 2016-11-07 2016-12-04                   4
4      B     3 2016-11-08 2016-11-09                  30
5      B     4 2016-11-10 2016-12-03                  26

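For member B, when the second script is filled on the same day as the first, the individual already has 4 days of Rx on hand. When the third script is filled, the individual has 3 days left from the first script and 27 days left from the second (30 total). When the fourth script is filled, the third script is depleted, but there is 1 day left from the first script and 25 days left from the second (26 total).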
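The arithmetic for member B's fourth fill can be checked directly. This is a minimal base R sketch, assuming (as the expected output implies) that the fill date itself counts as a day of remaining supply and that an exhausted script contributes zero; the variable names are illustrative only.

```r
# Member B's first three prescriptions, taken from the table above
fill_date <- as.Date(c("2016-11-07", "2016-11-07", "2016-11-08"))
to_date   <- as.Date(c("2016-11-10", "2016-12-04", "2016-11-09"))

# Date of the fourth fill
new_fill <- as.Date("2016-11-10")

# Days left on each earlier prescription, counting the new fill date itself;
# an exhausted prescription contributes 0 rather than a negative number
remaining <- pmax(as.integer(to_date - new_fill) + 1L, 0L)
remaining       # 1 25 0
sum(remaining)  # 26
```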

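I know how to do rolling totals in dplyr and data.table, but I can't figure out how to account for the different levels of depletion from previous records on a record-by-record basis. Below is code to recreate the original table. Thanks in advance for any suggestions!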

structure(list(member = structure(c(1L, 2L, 2L, 2L, 2L), .Label = c("A", "B"), class = "factor"),
    rx_id = c(1, 1, 2, 3, 4),
    fill_date = structure(c(17805, 17112, 17112, 17113, 17115), class = "Date"),
    to_date = structure(c(17806, 17115, 17139, 17114, 17138), class = "Date"),
    days_supply = c(2, 4, 28, 2, 24)),
    .Names = c("member", "rx_id", "fill_date", "to_date", "days_supply"),
    row.names = c(NA, -5L), class = "data.frame")

2 Answers

library(data.table)
dt = as.data.table(your_df) # or setDT to convert in place

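# non-equi self-join: for each fill, match the same member's earlier scripts (lower rx_id)
# that were filled on or before this fill date and are still running on it,
# then take total supply minus the days already elapsed on those scripts
dt[dt, on = .(member, fill_date <= fill_date, to_date >= fill_date, rx_id < rx_id), by = .EACHI,
   sum(days_supply, na.rm = TRUE) - sum(i.fill_date - x.fill_date, na.rm = TRUE)]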
#   member  fill_date    to_date rx_id      V1
#1:      A 2018-10-01 2018-10-01     1  0 days
#2:      B 2016-11-07 2016-11-07     1  0 days
#3:      B 2016-11-07 2016-11-07     2  4 days
#4:      B 2016-11-08 2016-11-08     3 30 days
#5:      B 2016-11-10 2016-11-10     4 26 days
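To see what the join computes for a single row, the same conditions can be spelled out in base R. This is only an illustrative sketch for member B's fourth fill, not a replacement for the join; `active`, `elapsed`, and `on_hand` are made-up names.

```r
# Member B's rows from the question
fill_date   <- as.Date(c("2016-11-07", "2016-11-07", "2016-11-08", "2016-11-10"))
to_date     <- as.Date(c("2016-11-10", "2016-12-04", "2016-11-09", "2016-12-03"))
days_supply <- c(4, 28, 2, 24)
rx_id       <- 1:4

k <- 4  # the fill being evaluated

# Same conditions as the join: earlier rx_id, filled on or before this
# fill date, and still running on this fill date
active <- rx_id < rx_id[k] & fill_date <= fill_date[k] & to_date >= fill_date[k]

# Total supply of the matching scripts minus the days already used on them
elapsed <- as.numeric(fill_date[k] - fill_date[active])
on_hand <- sum(days_supply[active]) - sum(elapsed)
on_hand  # 26
```

Scripts 1 and 2 match (32 days of supply, 3 days elapsed on each), which gives the 26 days shown in the output above.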
Answered 2018-12-27T18:28:33.957

Using a simple loop:

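dt$days_supply_on_hand <- 0
# assumes rows are already ordered by fill date within each member
for (a in unique(dt$member)) {
  I <- which(.subset2(dt, 1) == a)          # rows for this member
  flDate <- as.integer(.subset2(dt, 3)[I])  # fill_date as day numbers
  toDate <- as.integer(.subset2(dt, 4)[I])  # to_date as day numbers
  # days left on each earlier script at fill k; pmax() keeps exhausted scripts at 0 instead of negative
  V <- vapply(seq_along(I), function(k) sum(pmax(toDate[seq_len(k - 1)] - flDate[k] + 1, 0)), numeric(1))
  dt$days_supply_on_hand[I] <- V
}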
dt
  member rx_id  fill_date    to_date days_supply days_supply_on_hand
1      A     1 2018-10-01 2018-10-02           2                   0
2      B     1 2016-11-07 2016-11-10           4                   0
3      B     2 2016-11-07 2016-12-04          28                   4
4      B     3 2016-11-08 2016-11-09           2                  30
5      B     4 2016-11-10 2016-12-03          24                  26

where dt is the data frame provided above. (Note that .subset2 and as.integer are used for efficiency; they can be swapped out to improve readability.)
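For readability, the same idea can be written with column names instead of positional access. This is a sketch under a few assumptions: `supply_on_hand` is a hypothetical helper name, rows are sorted inside the function, and a script that has already run out is counted as 0 days rather than a negative number.

```r
# Hypothetical helper: days of supply already on hand at each fill
supply_on_hand <- function(df) {
  # Order so that "earlier" scripts come first within each member
  df <- df[order(df$member, df$fill_date, df$rx_id), ]
  fl <- as.integer(df$fill_date)  # dates as day numbers
  to <- as.integer(df$to_date)
  df$days_supply_on_hand <- 0
  for (m in unique(as.character(df$member))) {
    rows <- which(df$member == m)
    for (k in seq_along(rows)) {
      earlier <- rows[seq_len(k - 1)]         # empty for the first fill
      left <- to[earlier] - fl[rows[k]] + 1L  # days left, fill date included
      df$days_supply_on_hand[rows[k]] <- sum(pmax(left, 0L))
    }
  }
  df
}

# The table from the question, rebuilt with plain constructors
rx <- data.frame(
  member    = c("A", "B", "B", "B", "B"),
  rx_id     = c(1, 1, 2, 3, 4),
  fill_date = as.Date(c("2018-10-01", "2016-11-07", "2016-11-07",
                        "2016-11-08", "2016-11-10")),
  to_date   = as.Date(c("2018-10-02", "2016-11-10", "2016-12-04",
                        "2016-11-09", "2016-12-03"))
)

result <- supply_on_hand(rx)
result$days_supply_on_hand  # 0 0 4 30 26
```

The nested loop is quadratic in the number of fills per member, which is usually acceptable for prescription histories but may be slow for very large groups.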

Answered 2018-12-27T18:28:14.110