r - New variable based on value ranges of another column in R

Question

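I know there are many similar questions, but I couldn't get any of the answers to work.

What I need to do is split a numeric variable into three levels.

Among other things, I tried the following: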


data_long$average_success_grouped <- recode(data_long$average_success, <0.5 = no success, >0.5 & <0.9 = little success, >0.9 = success)

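My values range from 0 to 1, and I need to split them into three groups using 0.5 and 0.9 as the cutoffs.

Can anyone help?

Current error: unexpected '<' in "data_long$average_success <- recode(data_long$average_success, <"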

dput(data_long_migraine)
structure(list(average_success = c(0.333333333333333, 0.416666666666667, 0, 0.25, 
0.166666666666667, 0.133333333333333, 0.0285714285714286, 0, 
0.266666666666667, 1, 0.214285714285714, 0.472222222222222, 0.0142857142857143, 
0.305555555555556, 0.861111111111111, 0.614285714285714, 0.371428571428571, 
1, 0.694444444444444, 0, 0.5, 1, 0.9, 0.0571428571428571, 0.128571428571429, 
0.583333333333333, 0.194444444444444, 0.333333333333333, 0.416666666666667, 
0, 0.25, 0.166666666666667, 0.133333333333333, 0.0285714285714286, 
0, 0.266666666666667, 1, 0.214285714285714, 0.472222222222222, 
0.0142857142857143, 0.305555555555556, 0.861111111111111, 0.614285714285714, 
0.371428571428571, 1, 0.694444444444444, 0, 0.5, 1, 0.9, 0.0571428571428571, 
0.128571428571429, 0.583333333333333, 0.194444444444444, 0.333333333333333, 
0.416666666666667, 0, 0.25, 0.166666666666667, 0.133333333333333, 
0.0285714285714286, 0, 0.266666666666667, 1, 0.214285714285714, 
0.472222222222222, 0.0142857142857143, 0.305555555555556, 0.861111111111111, 
0.614285714285714, 0.371428571428571, 1, 0.694444444444444, 0, 
0.5, 1, 0.9, 0.0571428571428571, 0.128571428571429, 0.583333333333333, 
0.194444444444444, 0.333333333333333, 0.416666666666667, 0, 0.25, 
0.166666666666667, 0.133333333333333, 0.0285714285714286, 0, 
0.266666666666667, 1, 0.214285714285714, 0.472222222222222, 0.0142857142857143, 
0.305555555555556, 0.861111111111111, 0.614285714285714, 0.371428571428571, 
1, 0.694444444444444, 0, 0.5, 1, 0.9, 0.0571428571428571, 0.128571428571429, 
0.583333333333333, 0.194444444444444), month = structure(c(1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 
4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 
4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L), .Label = c("bad_days_1_month", 
"bad_days_2_month", "bad_days_3_month", "bad_days_4_month"
), class = "factor"), bad_days = c(5, 3, 8, 5, 0, 13, 2, 
3, 10, 13, 7, 3, 2, 23, 5, 4, 6, 17, 4, 3, 13, 10, 4, 8, 15, 
18, 2, 7, 7, 10, 1, 2, 10, 3, 0, 3, 16, 8, 4, 4, 26, 2, 6, 10, 
25, 5, 3, 11, 7, 4, 6, 11, 18, 4, 5, 7, 6, 7, 2, 11, 6, 0, 5, 
20, 4, 2, 4, 20, 0, 2, 2, 24, 6, 4, 4, 5, 3, 7, 8, 6, 2, 9, 8, 
8, 7, 3, 8, 6, 0, 5, 20, 9, 8, 2, 22, 1, 1, 5, 25, 3, 1, 6, 3, 
3, 4, 8, 11, 0), average_success_grouped = c(2, 2, 2, 2, 2, 2, 2, 2, 2, 
2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 
2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 
2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 
2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 
2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2)), row.names = c(NA, 
-108L), class = "data.frame")

I tried a few other things earlier, which is how I ended up with average_success_grouped containing only 2s, but I can't remember exactly what I did.

score 2 · Accepted Answer

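Load the tidyverse library and the data

It sounds like you need an ifelse statement. First, load the tidyverse package, which you will use to add a new variable for the levels: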

library(tidyverse)

I first saved your dput output to an object called df:

df <- structure(list(average_success = c(0.333333333333333, 0.416666666666667, 0, 0.25, 0.166666666666667, 0.133333333333333, 0.0285714285714286, 0, 0.266666666666667, 1, 0.214285714285714, 0.472222222222222, 0.0142857142857143, 0.305555555555556, 0.861111111111111, 0.614285714285714, 0.371428571428571, 1, 0.694444444444444, 0, 0.5, 1, 0.9, 0.0571428571428571, 0.128571428571429, 0.583333333333333, 0.194444444444444, 0.333333333333333, 0.416666666666667, 0, 0.25, 0.166666666666667, 0.133333333333333, 0.0285714285714286, 0, 0.266666666666667, 1, 0.214285714285714, 0.472222222222222, 0.0142857142857143, 0.305555555555556, 0.861111111111111, 0.614285714285714, 0.371428571428571, 1, 0.694444444444444, 0, 0.5, 1, 0.9, 0.0571428571428571, 0.128571428571429, 0.583333333333333, 0.194444444444444, 0.333333333333333, 0.416666666666667, 0, 0.25, 0.166666666666667, 0.133333333333333, 0.0285714285714286, 0, 0.266666666666667, 1, 0.214285714285714, 0.472222222222222, 0.0142857142857143, 0.305555555555556, 0.861111111111111, 0.614285714285714, 0.371428571428571, 1, 0.694444444444444, 0, 0.5, 1, 0.9, 0.0571428571428571, 0.128571428571429, 0.583333333333333, 0.194444444444444, 0.333333333333333, 0.416666666666667, 0, 0.25, 0.166666666666667, 0.133333333333333, 0.0285714285714286, 0, 0.266666666666667, 1, 0.214285714285714, 0.472222222222222, 0.0142857142857143, 0.305555555555556, 0.861111111111111, 0.614285714285714, 0.371428571428571, 1, 0.694444444444444, 0, 0.5, 1, 0.9, 0.0571428571428571, 0.128571428571429, 0.583333333333333, 0.194444444444444), month = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 
4L, 4L, 4L), .Label = c("bad_days_1_month", "bad_days_2_month", "bad_days_3_month", "bad_days_4_month" ), class = "factor"), bad_days = c(5, 3, 8, 5, 0, 13, 2, 3, 10, 13, 7, 3, 2, 23, 5, 4, 6, 17, 4, 3, 13, 10, 4, 8, 15, 18, 2, 7, 7, 10, 1, 2, 10, 3, 0, 3, 16, 8, 4, 4, 26, 2, 6, 10, 25, 5, 3, 11, 7, 4, 6, 11, 18, 4, 5, 7, 6, 7, 2, 11, 6, 0, 5, 20, 4, 2, 4, 20, 0, 2, 2, 24, 6, 4, 4, 5, 3, 7, 8, 6, 2, 9, 8, 8, 7, 3, 8, 6, 0, 5, 20, 9, 8, 2, 22, 1, 1, 5, 25, 3, 1, 6, 3, 3, 4, 8, 11, 0), average_success_grouped = c(2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2)), row.names = c(NA, -108L), class = "data.frame")

New data frame:

Then use mutate together with nested ifelse calls (if/then logic) to create the new variable:

df2 <- df %>%
  mutate(success_level = ifelse(average_success >.9 , "high success", 
                                ifelse(average_success <.5, "no success", "little")))
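As an alternative to nested ifelse calls, base R's cut() can bin a numeric vector in one step. The following is only a minimal sketch, not part of the original answer: the sample values and the labels (taken from the question) are assumptions. Boundary handling also differs from the ifelse version above: with right = FALSE each interval is closed on the left, so a value of exactly 0.9 falls into the top group here, whereas the ifelse code puts it in the middle group.

```r
# Hypothetical sample values, including the two cutoffs themselves
average_success <- c(0, 0.25, 0.5, 0.861111111111111, 0.9, 1)

# Bins are [-Inf, 0.5), [0.5, 0.9) and [0.9, Inf) because right = FALSE
grouped <- cut(
  average_success,
  breaks = c(-Inf, 0.5, 0.9, Inf),
  labels = c("no success", "little success", "success"),
  right = FALSE
)

# Show each value next to the group it was assigned to
print(data.frame(average_success, grouped))
```

The result is a factor whose levels follow the order of the labels, which can be convenient for plotting or modelling. If 0.5 and 0.9 should belong to the lower group instead, right = TRUE (the default) closes the intervals on the right.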

View the result

If you now run View(df2), you will see the new data frame, including the success_level column.
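Besides opening the viewer, a quick count per level can confirm that the grouping worked. This is a small self-contained sketch: df_small and its values are hypothetical stand-ins for df2, and the same nested ifelse logic is applied in base R so that no packages are needed.

```r
# Hypothetical stand-in for the real data, with values on both cutoffs
df_small <- data.frame(average_success = c(0, 0.25, 0.5, 0.9, 1, 1))

# Same logic as the answer: > 0.9 is high, < 0.5 is none, the rest is little
df_small$success_level <- ifelse(
  df_small$average_success > 0.9, "high success",
  ifelse(df_small$average_success < 0.5, "no success", "little")
)

# Count how many rows landed in each level
counts <- table(df_small$success_level)
print(counts)
```

If one level is missing from the counts, or everything lands in a single level (as with the column of 2s in the question), the conditions or cutoffs are worth rechecking.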
