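When I use geom_col to plot the percentage of white residents by state (from the midwest dataset in the ggplot2 package), ggplot2 adds the values up instead of averaging them. That seems like a very odd default to me; I don't think that is what a bar/column chart is supposed to "do". I read the help documentation and did some Googling, but maybe I wasn't searching for the right thing.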

ggplot(data = midwest, mapping = aes(x = state, y = percwhite)) +
  geom_col()

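The plot clearly shows the sum of all values for each state. I want it to show the mean for each state. I have only been using R for a few weeks, but I can't believe I never noticed this before.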

2 Answers

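The code in the question produces a "sum" because geom_col() defaults to position = "stack". Each row of the data (one per county) is drawn as its own bar segment, and the segments that share a state are stacked on top of one another.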
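To see the stacking for yourself, here is a minimal sketch that inspects the data ggplot2 computes for the layer and compares the top of each stack with the per-state sum. The variable names (p, segments, stack_heights, state_sums) are only illustrative, and the comparison relies on all percwhite values being positive.

```r
library(ggplot2)

# Build the plot from the question without drawing it
p <- ggplot(data = midwest, mapping = aes(x = state, y = percwhite)) +
  geom_col()

# layer_data() returns the data computed for a layer: here one row per county,
# with ymin/ymax giving the vertical extent of each stacked segment
segments <- layer_data(p, 1)

# Top of each stack = largest ymax at each x position (one position per state)
stack_heights <- tapply(segments$ymax, as.integer(segments$x), max)

# Per-state sums computed directly from the raw data
state_sums <- tapply(midwest$percwhite, midwest$state, sum)

# Compare the two, ignoring names and tiny floating-point differences
all.equal(as.vector(stack_heights), as.vector(state_sums))
```

If the comparison holds, the bar heights are simply the totals of the stacked county values, not a summary that geom_col() computes.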

Here are several possible ways to produce a plot that shows the means:

library(ggplot2)

# the normal way of plotting data summaries like means is to use stat_summary()
ggplot(data = midwest, mapping = aes(x = state, y = percwhite)) +
  stat_summary(geom = "col", fun = mean)

# same plot using less intuitive code (avoid if possible)
ggplot(data = midwest, mapping = aes(x = state, y = percwhite)) +
  geom_bar(stat = "summary", fun = mean)

# same plot using base R functions to pre-compute the means
means.df <- aggregate(percwhite ~ state, FUN = mean, data = midwest)

ggplot(data = means.df, mapping = aes(x = state, y = percwhite)) +
  geom_col() # one value per column, stacking has no effect

rm(means.df) # assuming it is no longer needed

# same plot using pipes and dplyr "verbs"
library(dplyr)
midwest %>%
  group_by(state) %>%
  summarise(percwhite = mean(percwhite)) %>%
  ggplot(mapping = aes(x = state, y = percwhite)) +
  geom_col()

Note that geom_bar() and the newer geom_col() are very similar. However, only geom_bar() accepts the stat argument (and fun along with it).

answered 2020-09-22T10:23:29.143

First, create a table of means:

myTable <- aggregate(percwhite ~ state, FUN = mean, data = midwest)

Now you can use that table to make the bar chart:

ggplot(data = myTable, mapping = aes(x = state, y = percwhite)) +
  geom_col()

answered 2020-09-21T21:53:30.083