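Drawing a box plot when each group has only 1 or 2 values can give a misleading impression of the true variance of each population. Purely to demonstrate the code, though, you could do the following: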
# load necessary packages
library(tidyverse)

# set a seed so the sampled rows are reproducible
set.seed(1)

# produce boxplot (not recommended for small samples)
iris %>%
  select(Species, Sepal.Length, Sepal.Width) %>%
  pivot_longer(-Species) %>%
  group_by(Species, name) %>%
  sample_n(size = 2, replace = FALSE) %>%
  ggplot(aes(x = name, y = value, fill = Species)) +
  geom_boxplot() +
  coord_flip()
This produces the following plot:

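In practice, when the sample size is quite small (e.g. n < 10), it is usually more informative to show the individual data points, possibly together with a summary statistic such as the mean or median. Here is how I would prefer to present data with a sample size of 2: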
# set a seed so the sampled rows are reproducible
set.seed(1)

# produce bar plot with overlaid points (recommended for small samples)
iris %>%
  select(Species, Sepal.Length, Sepal.Width) %>%
  pivot_longer(-Species) %>%
  group_by(Species, name) %>%
  sample_n(size = 2, replace = FALSE) %>%
  ggplot(aes(x = name, y = value, fill = Species)) +
  stat_summary(fun = mean, geom = "bar", position = "dodge") +
  geom_point(shape = 21, size = 3, position = position_dodge(width = 0.9)) +
  coord_flip()
This gives the following plot:

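If you also want the summary statistics as numbers rather than only as bars, a minimal sketch along these lines should work. The object names `small_sample` and `group_summary` are my own placeholders, not part of the code above, and I load `dplyr` and `tidyr` directly instead of the whole tidyverse:

```r
# load only the packages this sketch needs
library(dplyr)
library(tidyr)

# set a seed so the sampled rows are reproducible
set.seed(1)

# draw 2 observations per Species/measurement combination
# (small_sample is a hypothetical name used for illustration)
small_sample <- iris %>%
  select(Species, Sepal.Length, Sepal.Width) %>%
  pivot_longer(-Species) %>%
  group_by(Species, name) %>%
  sample_n(size = 2, replace = FALSE) %>%
  ungroup()

# per-group sample size, mean and median
group_summary <- small_sample %>%
  group_by(Species, name) %>%
  summarise(
    n = n(),
    mean = mean(value),
    median = median(value),
    .groups = "drop"
  )

print(group_summary)
```

Note that with exactly two values per group the mean and the median are the same number, so the choice between them only starts to matter once you have three or more observations per group.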