I'm trying to compute the mean() and sum() of certain columns across each row. This code generates the data set:
library(tidyverse)

test_data <- tibble(part_id = 1:5,
                    a_1 = c("a", "b", "c", "d", "a"),
                    a_2 = c("b", NA, "b", "a", "d"),
                    a_3 = c("b", "b", "d", "d", "a"))

# Score a_1 and a_2; the results land in a_1_scored and a_2_scored
test_data <- test_data %>%
  mutate_at(vars(a_1, a_2), .funs = list(scored = ~case_when(
    . == "a" | . == "b" ~ 1,
    . == "c" ~ 0,
    . == "d" ~ -100)))
If I use rowSums() or rowMeans(), I get the correct answer:
library(tidyverse)

test_data <- test_data %>%
  mutate(a_total = rowSums(dplyr::select(., contains("scored")), na.rm = TRUE),
         a_mean = rowMeans(dplyr::select(., contains("scored")), na.rm = TRUE))
However, if I use rowwise() followed by sum() or mean(), it doesn't work:
library(tidyverse)

test_data <- test_data %>%
  rowwise() %>%
  mutate(a_total = base::sum(dplyr::select(., contains("scored")), na.rm = TRUE),
         a_mean = base::mean(dplyr::select(., contains("scored")), na.rm = TRUE)) %>%
  ungroup()
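For sum(), every row gets the grand total of all the scored values, which effectively ignores rowwise(). For mean(), every result is NA, and I get the following warning for each row: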
Warning messages:
1: In mean.default(dplyr::select(., contains("scored")), na.rm = TRUE) :
argument is not numeric or logical: returning NA
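Both symptoms seem reproducible in base R alone, which suggests (this is my assumption, not something I have confirmed) that `.` inside mutate() still refers to the whole piped data frame rather than the current row. The data frame `df` below is made up purely for illustration:

```r
# Hypothetical two-column numeric data frame standing in for the scored columns
df <- data.frame(x = c(1, 0), y = c(1, -100))

# sum() on a data frame adds up every cell, regardless of rows
total <- sum(df, na.rm = TRUE)

# mean() has no data-frame method, so mean.default() warns and returns NA
avg <- suppressWarnings(mean(df, na.rm = TRUE))

# rowSums() and rowMeans() are the row-aware counterparts
row_totals <- rowSums(df, na.rm = TRUE)
row_means <- rowMeans(df, na.rm = TRUE)
```

Here `total` is a single number for the whole table and `avg` is NA, which matches what I see after rowwise().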
I also tried wrapping the selection in c(), as you would when listing each column individually. That produced the following error:
library(tidyverse)

test_data <- test_data %>%
  rowwise() %>%
  mutate(a_total = base::sum(c(dplyr::select(., contains("scored"))), na.rm = TRUE),
         a_mean = base::mean(c(dplyr::select(., contains("scored"))), na.rm = TRUE)) %>%
  ungroup()
Error in base::sum(c(dplyr::select(., contains("scored"))), na.rm = TRUE) :
invalid 'type' (list) of argument
How can I make this work with rowwise()? Why does it behave so differently from the usual rowSums() or rowMeans()?
I'd appreciate any insight!
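For reference, one direction that may work is c_across(), assuming dplyr 1.0.0 or later is installed. It is meant to gather the selected columns of the current row into a plain vector. This is only a sketch: the tibble `scored` is a hypothetical stand-in holding the values the scoring step above would produce.

```r
library(dplyr)

# Hypothetical stand-in for the scored columns built above
scored <- tibble(
  part_id    = 1:5,
  a_1_scored = c(1, 1, 0, -100, 1),
  a_2_scored = c(1, NA, 1, 1, -100)
)

result <- scored %>%
  rowwise() %>%
  # c_across() collects the selected columns of the current row into a vector
  mutate(
    a_total = sum(c_across(contains("scored")), na.rm = TRUE),
    a_mean  = mean(c_across(contains("scored")), na.rm = TRUE)
  ) %>%
  ungroup()
```

With this, `result$a_total` and `result$a_mean` should agree with the rowSums() and rowMeans() version, though on larger data the vectorised rowSums() and rowMeans() are likely faster than going row by row.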