Given the following data:

library(tidyverse)

# data_frame() is deprecated in tibble; tibble() is the current equivalent
df_fac <- tibble("author_1" = c("Ted", "Fred", NA, "Jim", "Tim"),
                 "role_1" = c("Faculty", "Faculty", "Staff", "Faculty", "Faculty"),
                 "author_2" = c(NA, "Will", NA, "Bill", NA),
                 "role_2" = c("Staff", "Faculty", "Staff", "Faculty", "Staff"))

# data_frame() is deprecated in tibble; tibble() is the current equivalent
df_all <- tibble("author_1" = c("Ted", "Fred", "Simon", "Jim", "Tim"),
                 "role_1" = c("Faculty", "Faculty", "Staff", "Faculty", "Faculty"),
                 "author_2" = c("Sam", "Will", "Noah", "Bill", "Luther"),
                 "role_2" = c("Staff", "Faculty", "Staff", "Faculty", "Staff"))

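Where the "author" columns in df_fac are NA, I would like to fill them with the corresponding column values from df_all, using a map function from purrr. This is what I am currently doing without a loop: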

df_test <- df_fac %>%
  mutate(`author_1` = ifelse(is.na(`author_1`), df_all$`author_1`, `author_1`)) %>%
  mutate(`author_2` = ifelse(is.na(`author_2`), df_all$`author_2`, `author_2`))

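I can iterate over the columns in df_fac with map_df, but not over the columns in df_all at the same time (as you can see, it only ever uses author column 1).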

df_test <- map_df(select(df_fac, matches("author.\\d$")), ~ {
  ifelse(is.na(.), df_all$`author_1`, .)
})

Is there a way to have map_df iterate over select(df_all, matches("author.\\d$")) while it iterates over select(df_fac, matches("author.\\d$"))?

For the toy example, the author columns and values of df_test should match those of df_all. I have tried:

df_test <- map_df(1:length(select(df_fac, matches("author.\\d$"))), ~ {
  ifelse(is.na(select(df_fac, matches("author.\\d$"))[.]), 
  select(df_all, matches("author.\\d$"))[.], 
  select(df_fac, matches("author.\\d$"))[.])
})

This throws Error in bind_rows_(x, .id) : not compatible with STRSXP

df_test <- pmap_chr(list(is.na(select(df_fac, matches("author.\\d$"))), 
                         select(df_all, matches("author.\\d$")), 
                         select(df_fac, matches("author.\\d$"))), 
                    ifelse)

This throws Error: Element 2 has length 2, not 1 or 10.
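
The lengths named in that last message can be checked directly. The sketch below is only a diagnostic, not a fix: it assumes the same toy df_fac as above, and authors_fac and na_mask are names introduced here for illustration. is.na() on a data frame returns a logical matrix whose length is the number of cells, while a data frame passed to a purrr function counts as a list of columns.

```r
library(dplyr)
library(tibble)

df_fac <- tibble(author_1 = c("Ted", "Fred", NA, "Jim", "Tim"),
                 role_1   = c("Faculty", "Faculty", "Staff", "Faculty", "Faculty"),
                 author_2 = c(NA, "Will", NA, "Bill", NA),
                 role_2   = c("Staff", "Faculty", "Staff", "Faculty", "Staff"))

# Keep only the author columns (hypothetical helper name)
authors_fac <- select(df_fac, matches("author.\\d$"))

# is.na() on a data frame gives a 5 x 2 logical matrix, not a list of columns
na_mask <- is.na(authors_fac)

length(na_mask)      # 10: one element per cell of the matrix
length(authors_fac)  # 2: one element per column of the data frame
```

So the first element of the list has length 10 and the other two have length 2, which is the mismatch the error reports.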

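I need to use the matches function because the real data has many author columns mixed in with similarly named variables. I can clarify if anything is unclear. Thank you.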

1 Answer

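You can use map2_df to loop over two lists simultaneously. Using dplyr::coalesce will help with replacing the missing values. I used select to make sure the columns in df_all match those in df_fac.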

map2_df(df_fac, select(df_all, one_of(names(df_fac))), ~coalesce(.x, .y))

The same thing using pmap:

pmap_df(list(df_fac, select(df_all, one_of(names(df_fac)))), coalesce)
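
To see what coalesce does to each pair of columns in the calls above, here is a minimal sketch on a single pair of vectors taken from the toy data. The names fac_col, all_col and filled are made up for illustration.

```r
library(dplyr)

# author_1 from df_fac (with a gap) and author_1 from df_all (complete)
fac_col <- c("Ted", "Fred", NA, "Jim", "Tim")
all_col <- c("Ted", "Fred", "Simon", "Jim", "Tim")

# coalesce() takes the first non-missing value at each position
filled <- coalesce(fac_col, all_col)
filled
# "Ted" "Fred" "Simon" "Jim" "Tim"
```

Only the NA in the third position is replaced; values already present in the first vector are kept.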

You can also use ifelse with map2 and the formula notation to refer to the two different lists you are working with.

map2_df(df_fac, select(df_all, one_of(names(df_fac))), 
       ~ifelse(is.na(.x), .y, .x))
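
Since the question asks for matches to pick out the author columns, the same idea can be restricted to just those columns. This is a sketch under assumptions, not the only way to do it: it assumes both data frames have the same row order and the same author column names, and author_cols is a name introduced here for illustration.

```r
library(dplyr)
library(purrr)
library(tibble)

df_fac <- tibble(author_1 = c("Ted", "Fred", NA, "Jim", "Tim"),
                 role_1   = c("Faculty", "Faculty", "Staff", "Faculty", "Faculty"),
                 author_2 = c(NA, "Will", NA, "Bill", NA),
                 role_2   = c("Staff", "Faculty", "Staff", "Faculty", "Staff"))

df_all <- tibble(author_1 = c("Ted", "Fred", "Simon", "Jim", "Tim"),
                 role_1   = c("Faculty", "Faculty", "Staff", "Faculty", "Faculty"),
                 author_2 = c("Sam", "Will", "Noah", "Bill", "Luther"),
                 role_2   = c("Staff", "Faculty", "Staff", "Faculty", "Staff"))

# Names of the author columns, selected by the pattern from the question
author_cols <- names(select(df_fac, matches("author.\\d$")))

# Fill gaps pairwise, column by column, and leave every other column untouched
df_test <- df_fac
df_test[author_cols] <- map2(df_fac[author_cols], df_all[author_cols], coalesce)
```

Indexing both data frames by the same author_cols vector keeps the pairs aligned by name, so the column order in df_all does not matter.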
Answered 2017-01-17T19:14:36.900