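I've just started learning R, it has been really useful, and I'm trying to use it to calculate Proportion of Days Covered (PDC). The metric measures a person's adherence to a medication. Basically, for a given time period you take all of the fills for a drug and, from each fill date and days supply, work out which days the person was covered. For example, if a person gets a 35-day fill on 2016-02-01, they are covered from 2016-02-01 through 2016-03-06. Easy enough.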

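It gets tricky when they come back for a refill before the coverage from the first fill has run out, because the overlapping days must not be counted twice (e.g. if the person gets a second fill on 2016-03-01, the days 3/1-3/6 are counted only once).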
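To make the overlap rule concrete, here is a minimal sketch of the two fills from the example above. The variable names are only for illustration, and the second fill is assumed to be a 35-day supply as well, which the example does not state.

```r
# First fill: 35 days starting 2016-02-01
first_fill <- seq(as.Date("2016-02-01"), by = "day", length.out = 35)

# Second fill: assumed to be another 35 days, picked up early on 2016-03-01
second_fill <- seq(as.Date("2016-03-01"), by = "day", length.out = 35)

# The last covered day of the first fill
max(first_fill)

# Combine both fills and drop the days that appear twice (3/1 through 3/6)
covered <- unique(c(first_fill, second_fill))
length(covered)
```

The two fills add up to 70 days of supply, but only 64 distinct calendar days are covered, because the 6 overlapping days are kept once.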

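I've actually written some code that seems to work, but it uses FOR loops, which I understand don't perform well in R, and I'm worried about what will happen once I start throwing a lot of data at it.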

Here is the first part of the code, which builds the test data and initializes some variables:

#Create test data vectors
Person <- c(rep("Person1", 12), rep("Person2", 9))
FillDate <- c("2016-1-1", "2016-2-1", "2016-3-1", "2016-4-1", "2016-5-1", "2016-6-1", "2016-7-1", "2016-8-1", "2016-9-1", "2016-10-1", "2016-11-1", "2016-12-1", "2016-2-1", "2016-3-1", "2016-4-20", "2016-5-1", "2016-6-1", "2016-7-1", "2016-8-1", "2016-9-1", "2016-10-1")
DaysSupply <- c(rep("35", 14), "20", "5", "20", rep("35", 4))

#Build into data.frame
PDCTestData <- cbind.data.frame(as.factor(Person), as.Date(FillDate, "%Y-%m-%d"), as.numeric(DaysSupply))
colnames(PDCTestData) <- c("Person", "FillDate", "DaysSupply")

#Create start and end dates for overall period
StartDate <- as.Date("2016-01-01")
EndDate <- as.Date("2016-12-31")

#Initialize DaysCoveredList, a vector to hold the list of dates that a person has drug coverage
DaysCoveredList <- NULL

#Initialize DaysCoveredTable, a matrix to count the total number of unique days in the DaysCovered List, by person
DaysCoveredTable <- NULL

And the second part, which does the actual work:

#Begin looping through individuals
for(p in levels(PDCTestData$Person)){

  #Begin looping through drug fills
  for(DrugSpan in 1:nrow(PDCTestData[PDCTestData$Person == p,])){

    #Create the sequence of dates covered by this fill: it starts on the fill date and runs for DaysSupply days
    #Append it to the list of all days covered for this person
    DaysCoveredList <- c(DaysCoveredList,seq.Date(from = PDCTestData[PDCTestData$Person == p,][DrugSpan,]$FillDate, length.out = PDCTestData[PDCTestData$Person == p,][DrugSpan,]$DaysSupply, by = "day"))

  } #Exit drug fill loop

  #Count the number of unique days in DaysCoveredList that fall within the start and end of the overall period
  DaysCovered <- length(unique(DaysCoveredList[DaysCoveredList >= StartDate & DaysCoveredList <= EndDate]))

  #Adds the unique count from DaysCovered to the summary DaysCoveredTable
  DaysCoveredTable <- rbind(DaysCoveredTable,cbind(p,DaysCovered))

  #Clear DaysCovered and DaysCoveredList
  DaysCovered <- NULL
  DaysCoveredList <- NULL
} #Exit the individual loop

I'd appreciate any help you can offer.

Thanks.

1 Answer

library(lubridate)
ptd <- PDCTestData # I get bored writing long variable names

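# A fill of n days covers the fill date plus the next n - 1 days.
# %within% includes both endpoints, so subtract 1 to avoid counting an extra day per fill.
ptd$EndDate <- ptd$FillDate + ptd$DaysSupply - 1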
ptd$DrugInterval <- interval(ptd$FillDate, ptd$EndDate)

all_days <- as.Date(StartDate:EndDate, origin = "1970-01-01")

lapply(unique(ptd$Person), function (y) sum(sapply(all_days, function (x) any(x %within% ptd$DrugInterval[ptd$Person==y]))))

No guarantees on speed, but it may be easier to read.
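As a possible alternative that needs only base R, here is a minimal sketch that follows the question's definition (a fill of n days covers the fill date and the next n - 1 days). The helper name `count_days_covered` is made up for illustration, and the test data is rebuilt inline so the block runs on its own. It still iterates over people and fills internally via `sapply` and `Map`, so treat any speed benefit as untested; the main difference from the loop version is that it avoids growing a vector with `c()` and re-subsetting the data frame on every pass.

```r
# Rebuild the question's test data so this sketch is self-contained
PDCTestData <- data.frame(
  Person = factor(c(rep("Person1", 12), rep("Person2", 9))),
  FillDate = as.Date(c(
    sprintf("2016-%d-1", 1:12),           # Person1: the 1st of every month
    "2016-2-1", "2016-3-1", "2016-4-20",  # Person2: first three fills
    sprintf("2016-%d-1", 5:10)            # Person2: the 1st of May through October
  ), format = "%Y-%m-%d"),
  DaysSupply = c(rep(35, 14), 20, 5, 20, rep(35, 4))
)

StartDate <- as.Date("2016-01-01")
EndDate <- as.Date("2016-12-31")

# Hypothetical helper: count the distinct covered days for one person's fills
count_days_covered <- function(fill_dates, days_supply) {
  # Work in whole days since 1970-01-01 so the dates are plain numbers
  first_day <- as.numeric(fill_dates)
  # One sequence per fill: the fill date plus the following days_supply - 1 days
  covered <- unlist(Map(function(d, n) seq(d, length.out = n), first_day, days_supply))
  # Drop days covered by more than one fill
  covered <- unique(covered)
  # Keep only days inside the overall period
  sum(covered >= as.numeric(StartDate) & covered <= as.numeric(EndDate))
}

# Apply the helper to each person's rows; the result is a named vector
DaysCovered <- sapply(
  split(PDCTestData, PDCTestData$Person),
  function(fills) count_days_covered(fills$FillDate, fills$DaysSupply)
)

# Proportion of days covered: covered days divided by days in the period
PDC <- DaysCovered / (as.numeric(EndDate - StartDate) + 1)
```

On the test data this should give 366 covered days for Person1 and 231 for Person2, i.e. a PDC of 1 and about 0.63.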

Answered 2017-03-21T23:35:36.820