
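I have data representing the severity of patients' asthma symptoms under different conditions. The severity variables are ordered factors that all share the same levels (Mild < Moderate < Severe). Here is a simplified example: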

# Create example data frame
df <- data.frame(
  ID = c(1:5),
  Daytime = c("Mild", "Severe", "Mild", "Moderate", "Moderate"), # severity of daytime symptoms
  Sleep = c("Moderate", NA, "Mild", "Mild", "Moderate"), # severity of nighttime symptoms
  Activity = c("Mild", "Moderate", "Mild", "Moderate", "Severe") # severity of symptoms during activity
  )

# Specify order of factor levels
df$Daytime <- ordered(
  df$Daytime,
  levels = c("Mild",
             "Moderate",
             "Severe")
  )
df$Sleep <- ordered(
  df$Sleep,
  levels = c("Mild",
             "Moderate",
             "Severe")
  )
df$Activity <- ordered(
  df$Activity,
  levels = c("Mild",
             "Moderate",
             "Severe")
)

df

The resulting data frame looks like this:

  ID  Daytime    Sleep Activity
1  1     Mild Moderate     Mild
2  2   Severe     <NA> Moderate
3  3     Mild     Mild     Mild
4  4 Moderate     Mild Moderate
5  5 Moderate Moderate   Severe

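I am trying to create an "overall severity" variable, where a patient's overall severity is the most severe symptom reported across the three categories (Daytime, Sleep, and Activity). In other words, Overall equals the highest level among Daytime, Sleep, and Activity, ignoring missing values. The result would look like this: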

  ID  Daytime    Sleep Activity  Overall
1  1     Mild Moderate     Mild Moderate
2  2   Severe     <NA> Moderate   Severe
3  3     Mild     Mild     Mild     Mild
4  4 Moderate     Mild Moderate Moderate
5  5 Moderate Moderate   Severe   Severe

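I would like to do this without writing a big, clunky for loop, but I can't figure out how. I thought I might be able to do it with ave(), but it does not seem to handle multiple variables at once: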

> df$Overall <- ave(c(df$Daytime, df$Sleep, df$Activity),
+                 df$ID,
+                 FUN = function(i) max (i, na.rm=T)
+                 )
Error in `$<-.data.frame`(`*tmp*`, "Worst", value = c(2L, 3L, 1L, 2L,  : 
  replacement has 15 rows, data has 5

Is there an apply function that can do this?


1 Answer


A quick way to do this is:

df$Overall <- apply(df[, 2:4], 1, max, na.rm = TRUE)
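
Note that apply() first converts the selected columns to a character matrix, so the one-liner above compares the labels as strings and returns a character vector. It gives the right answer here because "Mild", "Moderate", and "Severe" happen to sort alphabetically in order of severity. If the labels did not sort that way, or if Overall needs to stay an ordered factor, something like the following sketch may be safer. It takes the row-wise maximum of the integer level codes and maps them back to the labels. The names sev_levels, sev_cols, codes, and worst are illustrative only and do not come from the question.

```r
# Rebuild the example data from the question.
sev_levels <- c("Mild", "Moderate", "Severe")
df <- data.frame(
  ID = 1:5,
  Daytime = c("Mild", "Severe", "Mild", "Moderate", "Moderate"),
  Sleep = c("Moderate", NA, "Mild", "Mild", "Moderate"),
  Activity = c("Mild", "Moderate", "Mild", "Moderate", "Severe")
)

# Convert all severity columns to ordered factors in one pass.
sev_cols <- c("Daytime", "Sleep", "Activity")
df[sev_cols] <- lapply(df[sev_cols], ordered, levels = sev_levels)

# Integer codes follow the declared level order (1 = Mild, 3 = Severe).
codes <- sapply(df[sev_cols], as.integer)

# Row-wise maximum code; a row with no observed values stays NA
# rather than producing -Inf with a warning from max().
worst <- apply(codes, 1, function(x) {
  if (all(is.na(x))) NA_integer_ else max(x, na.rm = TRUE)
})

# Map the codes back to an ordered factor with the same levels.
df$Overall <- ordered(sev_levels[worst], levels = sev_levels)
df
```

Because Overall is built with the same levels as the input columns, comparisons such as df$Overall >= "Moderate" should continue to work on the result.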
answered 2014-08-18T18:29:59.820