
I need to recode several factor variables, but I keep failing.

Suppose my data looks like this:

df <- data.frame(a  = c("1","2","","Other"),
                 b  = c("3","","Other","Other"),
                 v1 = c("no","no","yes","yes"),
                 v2 = c("no","yes","no","no"),
                 v3 = c("no","yes","yes","no"))
df$a <- as.character(df$a)
df$b <- as.character(df$b)

df

>       a     b  v1  v2  v3
> 1     1     3  no  no  no
> 2     2        no yes yes
> 3       Other yes  no yes
> 4 Other Other yes  no  no

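I want:

v1 to become "yes" if (a == "1" | b == "1")

v2 to become "yes" if (a == "2" | b == "2")

v3 to become "yes" if (a == "3" | b == "3")

So the pattern is:

v# becomes "yes" if (a == "#" | b == "#")

I tried two nested loops in base R, but it does not work: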

 for(i in c("a","b")){
   for(j in as.character(1:3)){
   df[which(df[,i]==j),][,c(paste("v",j,sep=""))] <- "yes"
   }}

I would prefer to do this with dplyr::mutate, but I don't know how...


2 Answers

library(data.table)
dt = as.data.table(df) # or convert in-place using setDT

for (i in 1:3) dt[a == i | b == i, paste0('v', i) := 'yes']
#       a     b  v1  v2  v3
#1:     1     3 yes  no yes
#2:     2        no yes yes
#3:       Other yes  no yes
#4: Other Other yes  no  no
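
Since the question asks about dplyr::mutate, here is a minimal sketch of the same update with dplyr. It assumes dplyr 1.0.0 or later (for across() and cur_column()) and that the v columns are character rather than factor; the data frame is rebuilt with stringsAsFactors = FALSE to make that assumption explicit, and res is just an illustrative name.

```r
library(dplyr)

# Same data as in the question, forced to character columns (assumption).
df <- data.frame(a  = c("1", "2", "", "Other"),
                 b  = c("3", "", "Other", "Other"),
                 v1 = c("no", "no", "yes", "yes"),
                 v2 = c("no", "yes", "no", "no"),
                 v3 = c("no", "yes", "yes", "no"),
                 stringsAsFactors = FALSE)

res <- df %>%
  mutate(across(v1:v3, ~ {
    # Derive the digit to look for from the column name, e.g. "v2" -> "2".
    k <- sub("^v", "", cur_column())
    # Set "yes" where a or b equals that digit; otherwise keep the old value.
    if_else(a == k | b == k, "yes", .x)
  }))

print(res)
```

Like the data.table loop, this only switches values to "yes" and leaves every other entry unchanged.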
answered 2016-05-11T16:16:21.027

In this case, since the number of variables is relatively small, a simple set of ifelse calls will do the job:

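# Set "yes" where a or b matches; otherwise keep the existing value
# (as.character guards against the v columns being factors).
df$v1 <- ifelse(df$a == "1" | df$b == "1", "yes", as.character(df$v1))
df$v2 <- ifelse(df$a == "2" | df$b == "2", "yes", as.character(df$v2))
df$v3 <- ifelse(df$a == "3" | df$b == "3", "yes", as.character(df$v3))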
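
If the number of v columns grows, the same idea can be sketched as a single base R loop over the digits, using a logical mask for the rows. This is an illustrative variant that assumes the columns are character (hence stringsAsFactors = FALSE below); hit is a hypothetical helper name. Indexing with a logical mask is also safe when no row matches, which may be where the chained df[rows, ][, col] <- "yes" assignment in the question runs into trouble.

```r
# Same data as in the question, forced to character columns (assumption).
df <- data.frame(a  = c("1", "2", "", "Other"),
                 b  = c("3", "", "Other", "Other"),
                 v1 = c("no", "no", "yes", "yes"),
                 v2 = c("no", "yes", "no", "no"),
                 v3 = c("no", "yes", "yes", "no"),
                 stringsAsFactors = FALSE)

for (j in as.character(1:3)) {
  # Rows where either a or b equals the current digit.
  hit <- df$a == j | df$b == j
  # Overwrite only those rows in the matching v column; others are untouched.
  df[hit, paste0("v", j)] <- "yes"
}

print(df)
```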
answered 2016-05-11T13:52:12.447