library(tidyverse)
library(stringr)
library(lazyeval)

Here is the data for a simple example data frame:

Respondent<-c("Respondent1","Respondent2","Respondent3","Respondent4","Respondent5")
Sat1<-c("1 Extremely dissatisfied","2 Moderately dissatisfied","2 Moderately Dissatisfied","4 Neutral","7 Extrmely satified")
Sat2<-c("7 Extremely Satisfied","2. Moderately dissatisfied","4 Neutral","3  Slightly dissatisfied","3 Slightly Dissatisfied")
Sat3<-c("1 Extremely dissatisfied","7 Extremely satisfied","6 Moderately satisfied","4. Neutral","3 Slightly dissatisfied")
Pet<-c("Cat","Cat","Dog","Hamster","Rabbit")

# tibble() replaces data_frame(), which is deprecated in current versions of tibble
df <- tibble(Respondent, Sat1, Sat2, Sat3, Pet)

The code below recodes the satisfaction score columns into three categories: Satisfied, Dissatisfied, and Neutral.

df %>%
  mutate_at(vars(starts_with("Sat")),
            funs(fct_collapse(factor(str_sub(., 1, 1), levels = as.character(1:7)),
                              Satisfied = c("7", "6", "5"),
                              Dissatisfied = c("3", "2", "1"),
                              Neutral = "4")))

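However, my real case involves recoding the same satisfaction scale across multiple files, each with a different number of satisfaction columns. So I would like to wrap this in a function that lets me pass in the data frame and any number of columns to recode. Below is one variant of the code I have been trying, but I cannot get it to work. I have been playing with .dots and "..." but have not found anything that works.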

REC <- function(data, ...) {
  data %>%
    mutate_at(vars(...),
              funs(fct_collapse(factor(str_sub(., 1, 1), levels = as.character(1:7)),
                                Satisfied = c("7", "6", "5"),
                                Dissatisfied = c("3", "2", "1"),
                                Neutral = "4")))
}

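Should I be using standard evaluation with mutate_at? Also, do I have to use .dots together with ...? If standard evaluation does not work with mutate_at, I am open to other functions or techniques that achieve the same end result, preferably within the tidyverse.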

1 Answer

Does starts_with("Sat") match the right columns in all of your files? If so, this function will work no matter how many columns start with "Sat".

REC <- function(data){
  data %>% 
    mutate_at(vars(starts_with("Sat")),
                   funs(fct_collapse(factor(str_sub(., 1, 1), levels = as.character(1:7)),
                                     Satisfied=c("7","6","5"),
                                     Dissatisfied=c("3", "2","1"),
                                     Neutral="4")))
} 
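
To illustrate the point about varying column counts, here is a minimal sketch that applies the same recoding to two data frames with different numbers of "Sat" columns. The tibbles `survey_a` and `survey_b` are hypothetical stand-ins for files read from disk, and the sketch assumes a reasonably recent dplyr in which a formula lambda can be passed to mutate_at in place of funs().

```r
library(dplyr)
library(forcats)
library(stringr)

# Same recoding as REC() above, written with a formula lambda instead of funs()
REC <- function(data) {
  data %>%
    mutate_at(vars(starts_with("Sat")),
              ~ fct_collapse(factor(str_sub(.x, 1, 1), levels = as.character(1:7)),
                             Satisfied = c("7", "6", "5"),
                             Dissatisfied = c("3", "2", "1"),
                             Neutral = "4"))
}

# Hypothetical stand-ins for two files: one "Sat" column versus two
survey_a <- tibble(Respondent = c("R1", "R2"),
                   Sat1 = c("1 Extremely dissatisfied", "4 Neutral"))
survey_b <- tibble(Respondent = c("R1", "R2"),
                   Sat1 = c("7 Extremely satisfied", "2 Moderately dissatisfied"),
                   Sat2 = c("4 Neutral", "6 Moderately satisfied"),
                   Pet = c("Cat", "Dog"))

# Apply the same function to every data frame in the list
recoded <- lapply(list(a = survey_a, b = survey_b), REC)
```

Columns that do not start with "Sat", such as Pet, are left untouched, so the same call should be usable across files as long as the naming convention holds.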

If you want to pass the indices of the columns to change, you can try:

REC <- function(data, variable){
  data %>% 
    mutate_at(vars(variable),
              funs(fct_collapse(factor(str_sub(., 1, 1), levels = as.character(1:7)),
                                    Satisfied=c("7","6","5"),
                                    Dissatisfied=c("3", "2","1"),
                                    Neutral="4")))
}  

Then REC(df, 2:4) gives you this output:

# A tibble: 5 × 5
   Respondent         Sat1         Sat2         Sat3     Pet
        <chr>       <fctr>       <fctr>       <fctr>   <chr>
1 Respondent1 Dissatisfied    Satisfied Dissatisfied     Cat
2 Respondent2 Dissatisfied Dissatisfied    Satisfied     Cat
3 Respondent3 Dissatisfied      Neutral    Satisfied     Dog
4 Respondent4      Neutral Dissatisfied      Neutral Hamster
5 Respondent5    Satisfied Dissatisfied Dissatisfied  Rabbit
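
For the "any number of columns" form asked about in the question, one possible sketch forwards `...` to across(), which is available from dplyr 1.0.0 onward. This is an alternative under that assumption, not the approach used above; the names `recode_sat` and `rec_cols` are hypothetical.

```r
library(dplyr)
library(forcats)
library(stringr)

# Hypothetical helper: collapse "<digit> <label>" strings into three categories
recode_sat <- function(x) {
  fct_collapse(factor(str_sub(x, 1, 1), levels = as.character(1:7)),
               Satisfied = c("7", "6", "5"),
               Dissatisfied = c("3", "2", "1"),
               Neutral = "4")
}

# Hypothetical wrapper: c(...) hands the caller's column selection to across()
rec_cols <- function(data, ...) {
  data %>% mutate(across(c(...), recode_sat))
}

# Same example data as in the question
df <- tibble(
  Respondent = paste0("Respondent", 1:5),
  Sat1 = c("1 Extremely dissatisfied", "2 Moderately dissatisfied",
           "2 Moderately Dissatisfied", "4 Neutral", "7 Extrmely satified"),
  Sat2 = c("7 Extremely Satisfied", "2. Moderately dissatisfied", "4 Neutral",
           "3  Slightly dissatisfied", "3 Slightly Dissatisfied"),
  Sat3 = c("1 Extremely dissatisfied", "7 Extremely satisfied",
           "6 Moderately satisfied", "4. Neutral", "3 Slightly dissatisfied"),
  Pet = c("Cat", "Cat", "Dog", "Hamster", "Rabbit")
)

by_name <- rec_cols(df, Sat1, Sat3)            # bare column names
by_helper <- rec_cols(df, starts_with("Sat"))  # a selection helper
by_index <- rec_cols(df, 2:4)                  # column positions
```

Because the selection is passed through unchanged, the caller can mix bare names, helpers, and positions, and no .dots argument is needed.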
answered 2017-03-02T09:30:08.357