
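I am trying to conditionally replace values in multiple columns based on a string match in a different column, and I would like to do it in a single line of code using the across() function. However, I keep getting errors that don't quite make sense to me. I suspect the fix is simple, so if anyone could point me in the right direction, that would be great!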

library(dplyr)
library(stringr)

df <- data.frame("type" = c("Park", "Neighborhood", "Airport", "Park", "Neighborhood", "Neighborhood"),
               "total" = c(34, 56, 75, 89, 21, 56),
               "group_a" = c(30, 26, 45, 60, 3, 46),
               "group_b" = c(4, 30, 30, 29, 18, 10))

# working but not concise
df %>%
  mutate(total = ifelse(str_detect(type, "Park"), NA, total),
         group_a = ifelse(str_detect(type, "Park"), NA, group_a),
         group_b = ifelse(str_detect(type, "Park"), NA, group_b))

  
# concise but not working
df %>% mutate(across(total, group_a, group_b), ifelse(str_detect(type, "Park"), NA, .))

Update

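We got a solution that works for my dummy dataset but not for my real data, so I am sharing a small snippet of my real data frame, with the numbers changed and the organization names hidden. When I run this line of code on that data ( df %>% mutate(across(c(Attempts, Canvasses, Completes)), ~ifelse(str_detect(long_name, "park-cemetery"), NA, .)) ), I get the following error message: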

Error: Problem with mutate() input ..2.
x Input ..2 must be a vector, not a formula object.
i Input ..2 is ~ifelse(str_detect(long_name, "park-cemetery"), NA, .).

Here is a small sample of the data that produces this error:

df <- structure(list(Org = c("OrgName", "OrgName", "OrgName", "OrgName", 
"OrgName", "OrgName", "OrgName", "OrgName", "OrgName", "OrgName"
), nCode = c("M34", "R36", "R46", "X29", "M31", "K39", "Q12", 
"Q39", "X41", "K27"), Attempts = c(100, 100, 100, 100, 100, 100, 
100, 100, 100, 100), Canvasses = c(80, 80, 80, 80, 80, 80, 80, 
80, 80, 80), Completes = c(50, 50, 50, 50, 50, 50, 50, 50, 50, 
50), van_nocc_id = c(999, 999, 999, 999, 999, 999, 999, 999, 
999, 999), van_name = c("M-Upper West Side", "SI-Rosebank", "SI-Tottenville", 
"BX-park-cemetery-etc-Bronx", "M-Stuyvesant Town-Cooper Village", 
"BK-Kensington", "Q-Broad Channel", "Q-Lindenwood", "BX-Wakefield", 
"BK-East New York"), boro_short = c("M", "SI", "SI", "BX", "M", 
"BK", "Q", "Q", "BX", "BK"), long_name = c("Upper West Side", 
"Rosebank", "Tottenville", "park-cemetery-etc-Bronx", "Stuyvesant Town-Cooper Village", 
"Kensington", "Broad Channel", "Lindenwood", "Wakefield", "East New York"
)), row.names = c(NA, -10L), class = "data.frame")

Final update

The curse of the misplaced closing parenthesis! Thanks for the help, everyone... The correct solution is df %>% mutate(across(c(Attempts, Canvasses, Completes), ~ifelse(str_detect(long_name, "park-cemetery"), NA, .)))
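
To see why the misplaced parenthesis produces that particular error, it can help to name the arguments. The sketch below assumes dplyr 1.0.0 or later and stringr are installed, and uses a hypothetical three-row data frame called toy (not part of the original data). The lambda is passed as .fns inside across(). In the failing version the lambda sat outside across(), so mutate() received it as a second, unnamed input (..2) and complained that a formula is not a vector.

```r
library(dplyr)
library(stringr)

# Hypothetical toy data, used only for illustration
toy <- data.frame(
  long_name = c("Rosebank", "park-cemetery-etc-Bronx", "Wakefield"),
  Attempts  = c(100, 100, 100),
  Canvasses = c(80, 80, 80)
)

# The lambda is the .fns argument of across(), so it must sit
# inside across()'s parentheses, not after them
fixed <- toy %>%
  mutate(across(
    .cols = c(Attempts, Canvasses),
    .fns  = ~ ifelse(str_detect(long_name, "park-cemetery"), NA, .)
  ))
```

Naming .cols and .fns is optional, but it makes a stray parenthesis easier to spot because each argument visibly belongs to across().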

3 Answers

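If you use the across function introduced in dplyr 1.0.0 (which is the right way to handle this task), you have to specify the function you want to apply inside across itself. In this case, the function ifelse(...) must be a purrr-style lambda (so it starts with ~). Check the across documentation and look for the arguments .cols and .fns.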

df %>% 
  mutate(across(c(total, group_a, group_b), ~ifelse(str_detect(type, "Park"), NA, .)))

Output

#           type total group_a group_b
# 1         Park    NA      NA      NA
# 2 Neighborhood    56      26      30
# 3      Airport    75      45      30
# 4         Park    NA      NA      NA
# 5 Neighborhood    21       3      18
# 6 Neighborhood    56      46      10
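
As a rough sketch of what the ~ shorthand stands for, the lambda can also be written as an ordinary anonymous function, where the . placeholder becomes a named argument. The data frame small and the argument name x are arbitrary choices made for this illustration, and the example assumes dplyr 1.0.0 or later and stringr.

```r
library(dplyr)
library(stringr)

# Hypothetical cut-down version of the dummy data
small <- data.frame(
  type    = c("Park", "Neighborhood", "Airport"),
  total   = c(34, 56, 75),
  group_a = c(30, 26, 45)
)

# purrr-style lambda: `.` refers to the current column
with_lambda <- small %>%
  mutate(across(c(total, group_a),
                ~ ifelse(str_detect(type, "Park"), NA, .)))

# The same transformation as a regular anonymous function: `x` is the current column
with_function <- small %>%
  mutate(across(c(total, group_a),
                function(x) ifelse(str_detect(type, "Park"), NA, x)))
```

Both forms should give the same result; the lambda is simply shorter to type.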
answered 2020-06-25T15:15:06.700

Here is a data.table solution.

require(data.table)
df <- data.frame("type" = c("Park", "Neighborhood", "Airport", "Park", "Neighborhood", "Neighborhood"),
               "total" = c(34, 56, 75, 89, 21, 56),
               "group_a" = c(30, 26, 45, 60, 3, 46),
               "group_b" = c(4, 30, 30, 29, 18, 10))

setDT(df)
df[type == "Park", c("total", "group_a", "group_b") := NA]
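
Note that type == "Park" is an exact match, whereas the str_detect() calls in the question match substrings. For values such as "park-cemetery-etc-Bronx" in the real data, a substring match is needed. A minimal sketch, assuming a hypothetical toy table dt and a helper vector cols (neither is from the original post):

```r
library(data.table)

# Hypothetical toy data, used only for illustration
dt <- data.table(
  long_name = c("Rosebank", "park-cemetery-etc-Bronx", "Wakefield"),
  Attempts  = c(100, 100, 100),
  Canvasses = c(80, 80, 80)
)

# Columns to blank out; wrapping in () tells := to use the vector's contents
cols <- c("Attempts", "Canvasses")

# grepl() performs a substring match; fixed = TRUE treats the pattern literally
dt[grepl("park-cemetery", long_name, fixed = TRUE), (cols) := NA_real_]
```

NA_real_ is used here so the numeric columns keep their type explicitly; like setDT() above, := modifies the table in place rather than returning a copy.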
answered 2020-06-25T15:18:52.453

Update: figured it out pretty quickly! Just put the columns in a vector:

# concise AND working!
df %>% mutate(across(c(total, group_a, group_b), ~ifelse(str_detect(type, "Park"), NA, .)))

I had tried that initially, but with the column names in quotes... don't do that :)

answered 2020-06-25T15:15:11.077