11

我想将 Tukey.HSD 事后测试的结果添加到ggplot2箱线图中。这个 SO 答案包含一个我想要的手动示例(即,绘图上的字母是手动添加的;共享一个字母的组是无法区分的,p>whatever)。

在此处输入图像描述

是否有基于 AOV 和 Tukey HSD 事后分析的自动功能将这些字母添加到箱线图中?

我认为编写这样的函数不会太难。它看起来像这样:

set.seed(0)
lev <- gl(3, 10)
y <- c(rnorm(10), rnorm(10) + 0.1, rnorm(10) + 3)
d <- data.frame(lev=lev, y=y)

p_base <- ggplot(d, aes(x=lev, y=y)) + geom_boxplot() 

a <- aov(y~lev, data=d)
tHSD <- TukeyHSD(a)

# Function to generate a data frame of factor levels and corresponding labels
generate_label_df <- function(HSD, factor_levels) {
  comparisons <- rownames(HSD$l)
  p.vals <- HSD$l[ , "p adj"]

  ## Somehow create a vector of letters
  labels <- # A vector of letters, one for each factor level, generated using `comparisons` and `p.vals`
  letter_df <- data.frame(lev=factor_levels, labels=labels)
  letter_df
}

# Add the labels to the plot
p_base + 
  geom_text(data=generate_label_df(tHSD), aes(x=l, y=0, label=labels))

我意识到该TukeyHSD对象有一个plot方法,并且还有另一个包(我现在似乎找不到)可以完成我在基本图形中描述的内容,但我真的更喜欢在ggplot2.

4

1 回答 1

14

您可以使用“multcompView”包中的“multcompLetters”在 Tukey HSD 测试后生成同源组的字母。从那里,提取与 Tukey HSD 中测试的每个因素相对应的组标签,以及箱线图中显示的上分位数,以便将标签放置在该级别之上。

library(plyr)
library(ggplot2)
library(multcompView)

set.seed(0)
lev <- gl(3, 10)
y <- c(rnorm(10), rnorm(10) + 0.1, rnorm(10) + 3)
d <- data.frame(lev=lev, y=y)

a <- aov(y~lev, data=d)
tHSD <- TukeyHSD(a, ordered = FALSE, conf.level = 0.95)

generate_label_df <- function(HSD, flev){
 # Extract labels and factor levels from Tukey post-hoc 
 Tukey.levels <- HSD[[flev]][,4]
 Tukey.labels <- multcompLetters(Tukey.levels)['Letters']
 plot.labels <- names(Tukey.labels[['Letters']])

 # Get highest quantile for Tukey's 5 number summary and add a bit of space to buffer between    
 # upper quantile and label placement
    boxplot.df <- ddply(d, flev, function (x) max(fivenum(x$y)) + 0.2)

 # Create a data frame out of the factor levels and Tukey's homogenous group letters
  plot.levels <- data.frame(plot.labels, labels = Tukey.labels[['Letters']],
     stringsAsFactors = FALSE)

 # Merge it with the labels
   labels.df <- merge(plot.levels, boxplot.df, by.x = 'plot.labels', by.y = flev, sort = FALSE)

return(labels.df)
}

生成ggplot

 p_base <- ggplot(d, aes(x=lev, y=y)) + geom_boxplot() +
  geom_text(data = generate_label_df(tHSD, 'lev'), aes(x = plot.labels, y = V1, label = labels))

带有自动 Tukey HSD 组标签放置的箱线图

于 2013-09-13T01:54:45.600 回答