If I want to get the mean and sum of all numeric columns in the mtcars dataset, I would use the following code:
library(dplyr)
mtcars %>%
  group_by(gear) %>%
  summarise(across(where(is.numeric), list(mean = mean, sum = sum)))
But if some columns contain missing values, how do I account for them? Here is a reproducible example:
test.df1 <- data.frame("Year" = sample(2018:2020, 20, replace = TRUE),
                       "Firm" = head(LETTERS, 5),
                       "Exporter" = sample(c("Yes", "No"), 20, replace = TRUE),
                       "Revenue" = sample(100:200, 20, replace = TRUE),
                       stringsAsFactors = FALSE)
test.df1 <- rbind(test.df1,
                  data.frame("Year" = c(2018, 2018),
                             "Firm" = c("Y", "Z"),
                             "Exporter" = c("Yes", "No"),
                             "Revenue" = c(NA, NA)))
test.df1 <- test.df1 %>% mutate(Profit = Revenue - sample(20:30, 22, replace = TRUE ))
test.df_summarized <- test.df1 %>%
  group_by(Firm) %>%
  summarize(across(where(is.numeric), list(mean = mean, sum = sum)))
If I summarize each variable separately, I can use the following:
test.df1 %>%
  group_by(Firm) %>%
  summarize(Revenue_mean = mean(Revenue, na.rm = TRUE),
            Profit_mean = mean(Profit, na.rm = TRUE))
But I would like to figure out how to adapt the code I wrote above for mtcars to the example dataset provided here.
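For reference, the direction I suspect this might take is sketched below. It assumes dplyr 1.0 or later, where across() accepts purrr-style lambdas, so na.rm = TRUE can be passed to each function. demo.df and its values are a hypothetical, deterministic stand-in for test.df1, not my real data.

```r
library(dplyr)

# Hypothetical stand-in for test.df1 with fixed values, so the result is reproducible
demo.df <- data.frame(
  Firm    = c("A", "A", "B", "Y"),
  Revenue = c(100, NA, 150, NA),
  Profit  = c(80, NA, 120, NA),
  stringsAsFactors = FALSE
)

# Wrap each function in a lambda so that na.rm = TRUE is applied to every numeric column
demo_summarized <- demo.df %>%
  group_by(Firm) %>%
  summarize(across(where(is.numeric),
                   list(mean = ~ mean(.x, na.rm = TRUE),
                        sum  = ~ sum(.x, na.rm = TRUE))))

print(demo_summarized)
```

With this stand-in, a firm whose values are all NA (Y here) ends up with NaN for the mean and 0 for the sum, because nothing is left once the missing values are dropped. In test.df1 the Year column is numeric as well, so where(is.numeric) would summarize it too.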