r - Merging factor variables of equal length but different levels, ignoring NA

Question

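I have survey data from various sources. Most of the variables are factors with different levels. After merging, I end up with variables of the same length, each of which holds information in some rows and NA in all the others. I want to combine them into a single variable of the same length in which every row carries the available information and the NAs are ignored.

I have tried the forcats package, since it contains functions for manipulating differing factor levels, but I have not found a solution that combines the different factors with their respective levels while dropping the NAs.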

v1 <- as.factor(c("a","b","c","x","x",NA,NA,NA,NA,NA,NA,NA,NA,NA,NA))
v2 <- as.factor(c(NA,NA,NA,NA,NA,"c","c","c","b","a",NA,NA,NA,NA,NA))
v3 <- as.factor(c(NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,"f","c","c","b","a"))
df <- data.frame(v1, v2, v3)

The merged variable should be a factor containing the following:

("a","b","c","x","x","c","c","c","b","a","f","c","c","b","a")

score 1 · Accepted Answer

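library(magrittr)

# For each column, keep the non-NA entries as character strings,
# then concatenate the columns and rebuild a single factor.
lapply(df, function(x){
  x[!is.na(x)] %>%
    as.character
  }) %>%
  unlist %>%
  as.factor %>%
  `names<-`(NULL)   # drop the names that unlist() attaches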

 [1] a b c x x c c c b a f c c b a
Levels: a b c f x

score 1 · Accepted Answer

library(tidyverse)

map(df, ~na.omit(.x)) %>% unlist %>% unname
 [1] a b c x x c c c b a f c c b a
Levels: a b c x f

score 1 · Accepted Answer

We can use coalesce

library(dplyr)
df %>% 
   transmute(v = coalesce(!!! .)) %>% 
   pull(v)
#[1] "a" "b" "c" "x" "x" "c" "c" "c" "b" "a" "f" "c" "c" "b" "a"

Or, more compactly

library(purrr)
reduce(df, coalesce)
#[1] "a" "b" "c" "x" "x" "c" "c" "c" "b" "a" "f" "c" "c" "b" "a"

Or in base R

do.call(pmin, c(lapply(df, as.character), na.rm = TRUE))
#[1] "a" "b" "c" "x" "x" "c" "c" "c" "b" "a" "f" "c" "c" "b" "a"
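
If the result should stay a factor rather than a character vector like the outputs shown above, a minimal base R sketch along the same lines might look like this. The helper name `coalesce_factors` is hypothetical (it is not from any package), and the sketch assumes every column is a factor. It takes the first non-NA value in each row and keeps the union of the levels in order of first appearance.

```r
v1 <- factor(c("a","b","c","x","x",NA,NA,NA,NA,NA,NA,NA,NA,NA,NA))
v2 <- factor(c(NA,NA,NA,NA,NA,"c","c","c","b","a",NA,NA,NA,NA,NA))
v3 <- factor(c(NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,"f","c","c","b","a"))
df <- data.frame(v1, v2, v3)

# Hypothetical helper: row-wise "first non-NA" across factor columns.
coalesce_factors <- function(data) {
  # Work on the labels, not the integer codes, because the codes
  # mean different things in factors with different levels.
  chr <- lapply(data, as.character)
  # Fill the gaps of the running result with the next column.
  merged <- Reduce(function(a, b) ifelse(is.na(a), b, a), chr)
  # Union of all levels, in order of first appearance.
  lvls <- unique(unlist(lapply(data, levels)))
  factor(merged, levels = lvls)
}

result <- coalesce_factors(df)
result
```

Because the values are matched by row, this does not depend on the order in which the non-NA blocks happen to appear in the columns.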

score 1 · Accepted Answer

In base R, we can use unlist and then Filter to omit the NA values.

Filter(function(x) !is.na(x) , unlist(df, use.names = FALSE))
#[1] a b c x x c c c b a f c c b a
#Levels: a b c x f
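
One assumption worth stating: unlist() concatenates column by column, so this matches a row-wise merge only when the non-NA blocks appear in the same order as the columns, as they do in the question's data. A small sketch with made-up data (the names `a`, `b`, and `small` are purely illustrative) shows where the two can differ:

```r
# Made-up data: the value in column b's first row precedes
# the value in column a's second row.
a <- factor(c(NA, "x", NA))
b <- factor(c("y", NA, "z"))
small <- data.frame(a, b)

# Column-wise: all of a first, then all of b, NAs dropped afterwards.
by_column <- unlist(small, use.names = FALSE)
by_column <- as.character(by_column[!is.na(by_column)])
by_column
# "x" "y" "z"

# Row-wise: one value per row, in row order.
by_row <- do.call(pmin, c(lapply(small, as.character), na.rm = TRUE))
by_row
# "y" "x" "z"
```

If row alignment matters (for example, the merged variable goes back into the same data frame), a row-wise approach is probably the safer choice.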
