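If I understand correctly, you want to create a dichotomous variable based on the values of a string variable. For example: if TypeOfOutcome matches any of "Alive with stable disease", "Alive with progressive disease" or "Alive with complete response", Outcome will be 1, otherwise 0. I assumed that your dataset looks similar to this: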
mergedData <- data.frame(
TypeOfOutcome = c("Alive with stable disease", "Alive with progressive disease", "Alive with complete response", NA, "Died from melanoma"),
DateStageIV = sample(seq(as.Date('2011/01/01'), as.Date('2015/01/01'), by="day"), 5))
# TypeOfOutcome DateStageIV
# 1 Alive with stable disease 2013-05-09
# 2 Alive with progressive disease 2014-08-08
# 3 Alive with complete response 2013-02-10
# 4 <NA> 2014-05-23
# 5 Died from melanoma 2012-08-08
The function ifelse is well suited to this kind of recoding. The basic syntax is:
ifelse(test, yes, no)
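If the statement in test is true, it returns the value of yes; otherwise it returns the value of no. In this case, test covers all the cases where the patient is still alive, which TypeOfOutcome indicates with the strings "Alive with stable disease", "Alive with progressive disease" or "Alive with complete response". A test for this would be: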
test <- mergedData$TypeOfOutcome %in% c("Alive with stable disease", "Alive with progressive disease", "Alive with complete response")
test
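To make the behaviour of %in% concrete, here is a minimal sketch on a toy vector. The names x, alive, in_result and eq_result are made up for illustration and are not part of the original data. It also shows how %in% differs from == when a value is missing:

```r
# Toy vector standing in for mergedData$TypeOfOutcome (hypothetical, for illustration)
x <- c("Alive with stable disease", NA, "Died from melanoma")
alive <- c("Alive with stable disease", "Alive with progressive disease",
           "Alive with complete response")

# %in% always returns TRUE or FALSE; an NA is simply treated as "no match"
in_result <- x %in% alive
in_result
# [1]  TRUE FALSE FALSE

# == propagates NA instead of returning FALSE
eq_result <- x == "Alive with stable disease"
eq_result
# [1]  TRUE    NA FALSE
```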
test is TRUE wherever the value in TypeOfOutcome matches any of the cases after the %in% operator. yes will then be 1 and no will be 0. Create the new variable:
mergedData$Outcome <- ifelse(test, 1, 0)
mergedData
# TypeOfOutcome DateStageIV Outcome
# 1 Alive with stable disease 2013-05-09 1
# 2 Alive with progressive disease 2014-08-08 1
# 3 Alive with complete response 2013-02-10 1
# 4 <NA> 2014-05-23 0
# 5 Died from melanoma 2012-08-08 0
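Note that row 4, where TypeOfOutcome is NA, ends up coded as 0, because %in% returns FALSE rather than NA for a missing value. If a missing outcome should stay missing, one possible variant is sketched below. It assumes that NA means "unknown" rather than "not alive", which may not hold for your data. The names outcome, alive and outcome_keep_na are illustrative only.

```r
# Same outcome strings as in the example data (dates omitted, they do not affect the recoding)
outcome <- c("Alive with stable disease", "Alive with progressive disease",
             "Alive with complete response", NA, "Died from melanoma")
alive <- c("Alive with stable disease", "Alive with progressive disease",
           "Alive with complete response")

# Assumption: a missing TypeOfOutcome should remain NA in Outcome.
# The outer ifelse handles the NA rows first; the inner one does the 1/0 recoding.
outcome_keep_na <- ifelse(is.na(outcome), NA_real_,
                          ifelse(outcome %in% alive, 1, 0))
outcome_keep_na
# [1]  1  1  1 NA  0
```

With a data frame this would be assigned in the same way as before, for example to mergedData$Outcome.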