The following code produces bar plots with standard error bars using Hmisc, ddply and ggplot:

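library(plyr)     # ddply
library(Hmisc)    # smean.sdl
library(ggplot2)

# mult = 1/sqrt(n) turns smean.sdl's "mean +/- mult * SD" into mean +/- 1 SE
means_se <- ddply(mtcars,.(cyl),
                  function(df) smean.sdl(df$qsec,mult=sqrt(length(df$qsec))^-1))
colnames(means_se) <- c("cyl","mean","lower","upper")
ggplot(means_se,aes(cyl,mean,ymax=upper,ymin=lower,group=1)) +
  geom_bar(stat="identity") +
  geom_errorbar()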

However, implementing the above using helper functions such as mean_sdl seems much better. For example, the following code produces a plot with error bars of plus or minus two standard deviations around the mean (mean_sdl's default, mult = 2):

ggplot(mtcars, aes(cyl, qsec)) + 
  stat_summary(fun.y = mean, geom = "bar") + 
  stat_summary(fun.data = mean_sdl, geom = "errorbar")

My question is how to use the stat_summary implementation for standard error bars. The problem is that to calculate SE you need the number of observations per condition and this must be accessed in mean_sdl's multiplier.

How do I access this information within ggplot? Is there a neat non-hacky solution for this?

1 Answer

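Well, I can't tell you how to get a per-group multiplier into stat_summary.

However, it looks like your goal is to plot means with error bars representing one standard error from the mean in ggplot, without summarizing the dataset before plotting.

There is a mean_se function in ggplot2 that we can use instead of mean_cl_normal from Hmisc. mean_se has a default multiplier of 1, so we don't need to pass any extra arguments if we want standard error bars.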

ggplot(mtcars, aes(cyl, qsec)) + 
    stat_summary(fun.y = mean, geom = "bar") + 
    stat_summary(fun.data = mean_se, geom = "errorbar")
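As a quick sanity check of what mean_se computes, the sketch below compares its result for a single group against a manual mean ± sd/sqrt(n) calculation. The variable names (`x`, `se`, `manual`) are illustrative only, and the comparison assumes the data contain no missing values, which holds for mtcars$qsec.

```r
library(ggplot2)

# qsec values for the 4-cylinder cars only
x <- mtcars$qsec[mtcars$cyl == 4]

# Standard error of the mean: sd / sqrt(n)
se <- sd(x) / sqrt(length(x))
manual <- c(y = mean(x), ymin = mean(x) - se, ymax = mean(x) + se)

# mean_se() returns a one-row data frame with columns y, ymin, ymax;
# unlist() flattens it to a named numeric vector for comparison
all.equal(unlist(mean_se(x)), manual)
```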

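If you want to use the mean_cl_normal function from Hmisc, you have to change the multiplier to 1 so that you get one standard error from the mean. mult is an argument of mean_cl_normal; any arguments for the summary function you are using must be supplied as a list via the fun.args argument: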

ggplot(mtcars, aes(cyl, qsec)) + 
    stat_summary(fun.y = mean, geom = "bar") + 
    stat_summary(fun.data = mean_cl_normal, geom = "errorbar", fun.args = list(mult = 1))

In pre-2.0 versions of ggplot2, the argument could be passed directly:

ggplot(mtcars, aes(cyl, qsec)) + 
  stat_summary(fun.y = mean, geom = "bar") + 
  stat_summary(fun.data = mean_cl_normal, geom = "errorbar", mult = 1) 
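On the original question of reaching the per-group sample size: stat_summary applies fun.data separately to the y values at each x position within each group, so a user-defined summary function can compute n with length(x). The sketch below is one possible approach, not something taken from the answer above; `mean_se_by_group` is a hypothetical helper, not part of ggplot2 or Hmisc. It also uses the `fun` argument, which replaced `fun.y` in ggplot2 3.3.0; on older versions, write `fun.y = mean` instead.

```r
library(ggplot2)

# Hypothetical helper (not part of ggplot2 or Hmisc): mean +/- mult standard errors.
# stat_summary() passes in only the y values for one x position of one group,
# so length(x) here is the number of observations in that condition.
mean_se_by_group <- function(x, mult = 1) {
  x <- x[!is.na(x)]
  se <- sd(x) / sqrt(length(x))
  data.frame(y = mean(x), ymin = mean(x) - mult * se, ymax = mean(x) + mult * se)
}

p <- ggplot(mtcars, aes(cyl, qsec)) +
  stat_summary(fun = mean, geom = "bar") +
  stat_summary(fun.data = mean_se_by_group, geom = "errorbar")
print(p)
```

Because the function returns the columns y, ymin and ymax that geom_errorbar expects, a different multiplier can be supplied through fun.args, for example `fun.args = list(mult = 2)`.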

answered 2013-10-10T14:47:13.437