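If I understand correctly, you want to create a dichotomous variable based on the value of a string variable. For example, if TypeOfOutcome matches any of "Alive with stable disease", "Alive with progressive disease" or "Alive with complete response", Outcome should be 1, otherwise 0. I assume your dataset looks something like this: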
mergedData <- data.frame(
TypeOfOutcome = c("Alive with stable disease", "Alive with progressive disease", "Alive with complete response", NA, "Died from melanoma"),
DateStageIV = sample(seq(as.Date('2011/01/01'), as.Date('2015/01/01'), by="day"), 5))
# TypeOfOutcome DateStageIV
# 1 Alive with stable disease 2013-05-09
# 2 Alive with progressive disease 2014-08-08
# 3 Alive with complete response 2013-02-10
# 4 <NA> 2014-05-23
# 5 Died from melanoma 2012-08-08
The function ifelse is well suited to this kind of recoding. The basic syntax is:
ifelse(test, yes, no)
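For each element, if the statement in test is true, the value of yes is returned; otherwise the value of no is returned. In this case, test should pick out all cases where the patient is still alive, which TypeOfOutcome indicates with the strings "Alive with stable disease", "Alive with progressive disease" or "Alive with complete response". The test for this would be: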
test <- mergedData$TypeOfOutcome %in% c("Alive with stable disease", "Alive with progressive disease", "Alive with complete response")
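test is TRUE wherever the value in TypeOfOutcome matches any of the strings listed after the %in% operator, and FALSE otherwise. yes is then 1 and no is 0. To create the new variable: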
mergedData$Outcome <- ifelse(test, 1, 0)
mergedData
# TypeOfOutcome DateStageIV Outcome
# 1 Alive with stable disease 2013-05-09 1
# 2 Alive with progressive disease 2014-08-08 1
# 3 Alive with complete response 2013-02-10 1
# 4 <NA> 2014-05-23 0
# 5 Died from melanoma 2012-08-08 0
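Note that row 4, where TypeOfOutcome is NA, is coded as 0, because %in% returns FALSE rather than NA for missing values. If you would rather keep missing outcomes as missing, one possible variation is sketched below. This is only an illustration: the column name OutcomeNA and the helper vector alive are made up for this example, and the data frame is rebuilt with just the one column needed so the snippet runs on its own.

```r
# Stand-in for the example data (only the column needed here)
mergedData <- data.frame(
  TypeOfOutcome = c("Alive with stable disease", "Alive with progressive disease",
                    "Alive with complete response", NA, "Died from melanoma"),
  stringsAsFactors = FALSE)

# Strings that count as "alive" (hypothetical helper vector)
alive <- c("Alive with stable disease", "Alive with progressive disease",
           "Alive with complete response")

# OutcomeNA is a hypothetical column name:
# NA stays NA, everything else is coded 1/0 exactly as before
mergedData$OutcomeNA <- ifelse(is.na(mergedData$TypeOfOutcome), NA_real_,
                               ifelse(mergedData$TypeOfOutcome %in% alive, 1, 0))
mergedData$OutcomeNA
# [1]  1  1  1 NA  0
```

Whether an unknown outcome should count as 0 or stay NA depends on your analysis, so treat this as an option rather than a correction.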