
This question showed how to make a qqplot with a qqline in ggplot2, but the answer only seems to work when plotting the entire dataset in a single graph.

I want a way to quickly compare these plots for subsets of my data. That is, I want to make qqplots with qqlines on a graph with facets. So in the following example, there would be lines for all 9 plots, each with their own intercept and slope.

library(ggplot2)

# 1000 rows; the grouping columns are sampled at the same length as x
df1 <- data.frame(x = rnorm(1000, 10),
                  y = sample(LETTERS[1:3], 1000, replace = TRUE),
                  z = sample(letters[1:3], 1000, replace = TRUE))

ggplot(df1, aes(sample = x)) +
  stat_qq() +
  facet_grid(y ~ z)


2 Answers


You can try this:

library(plyr)
library(ggplot2)

# create some data
set.seed(123)
df1 <- data.frame(vals = rnorm(1000, 10),
                  y = sample(LETTERS[1:3], 1000, replace = TRUE),
                  z = sample(letters[1:3], 1000, replace = TRUE))

# calculate the normal theoretical quantiles per group
df2 <- ddply(.data = df1, .variables = .(y, z), function(dat) {
  # qqnorm() returns the theoretical quantiles in the original row order
  q <- qqnorm(dat$vals, plot.it = FALSE)
  dat$xq <- q$x
  dat
})

# plot the sample values against the theoretical quantiles
ggplot(data = df2, aes(x = xq, y = vals)) +
  geom_point() +
  geom_smooth(method = "lm", se = FALSE) +
  xlab("Theoretical") +
  ylab("Sample") +
  facet_grid(y ~ z)
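
One caveat: geom_smooth(method = "lm") draws a least-squares fit through the points, whereas base R's qqline() by default draws the line through the first and third quartiles, so the two can differ a little. If you want the qqline()-style line in every facet, a minimal sketch is to compute slope and intercept per group and hand them to geom_abline(). The helper name qq_line_coefs and the data frame lines_df are made up for this illustration and are not part of any package.

```r
# Hypothetical helper (not from any package): slope and intercept of the
# line through the first and third quartiles, as qqline() draws by default.
qq_line_coefs <- function(v) {
  y <- quantile(v, c(0.25, 0.75), names = FALSE)
  x <- qnorm(c(0.25, 0.75))
  slope <- diff(y) / diff(x)
  data.frame(slope = slope, intercept = y[1] - slope * x[1])
}

# same simulated data as above
set.seed(123)
df1 <- data.frame(vals = rnorm(1000, 10),
                  y = sample(LETTERS[1:3], 1000, replace = TRUE),
                  z = sample(letters[1:3], 1000, replace = TRUE))

# one row of coefficients per y/z combination
groups <- split(df1, list(df1$y, df1$z), drop = TRUE)
lines_df <- do.call(rbind, lapply(groups, function(d) {
  cbind(d[1, c("y", "z")], qq_line_coefs(d$vals))
}))

# draw the plot only if ggplot2 is installed
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(df1, aes(sample = vals)) +
    stat_qq() +
    geom_abline(data = lines_df, aes(slope = slope, intercept = intercept)) +
    facet_grid(y ~ z)
  print(p)
}
```

Because lines_df carries the y and z columns, facet_grid() places each line in its matching panel. Newer ggplot2 releases (3.0.0 and later, if I remember the changelog correctly) also provide stat_qq_line(), so adding it after stat_qq() should give the same kind of line per facet without any precomputation.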


answered 2013-10-25T23:25:36.670

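For no particular reason, here is a dplyr version of the same thing (dplyr did not exist when this question was asked). For peer review and comparison, I include the code that generates both data sets so that you can inspect them further.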

# create some data
set.seed(123)
df1 <- data.frame(vals = rnorm(1000, 10),
                  y = sample(LETTERS[1:3], 1000, replace = TRUE),
                  z = sample(letters[1:3], 1000, replace = TRUE))

#* Henrik's plyr version
library(plyr)
df2 <- plyr::ddply(.data = df1, .variables = .(y, z), function(dat) {
  # theoretical normal quantiles, in the original row order of the group
  q <- qqnorm(dat$vals, plot.it = FALSE)
  dat$xq <- q$x
  dat
})

detach("package:plyr")


#* The dplyr version
library(dplyr)
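qqnorm_data <- function(x) {
  # qqnorm() returns a list with x (theoretical) and y (sample) quantiles
  Q <- as.data.frame(qqnorm(x, plot.it = FALSE))
  # name the sample column after the variable that was passed in
  names(Q) <- c("xq", deparse(substitute(x)))
  Q
}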

df3 <- df1 %>%
  group_by(y, z) %>%
  do(with(., qqnorm_data(vals)))

Plotting can be done with the same code as in Henrik's answer.
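
If you would rather not load either package just for this step, the same per-group quantile column can probably be built in base R as well. This is only a sketch under the assumption that the data look like df1 above; the name df4 is made up for the example. It relies on ave(), which applies a function within each combination of the grouping variables and puts the results back in the original row positions.

```r
# same simulated data as above
set.seed(123)
df1 <- data.frame(vals = rnorm(1000, 10),
                  y = sample(LETTERS[1:3], 1000, replace = TRUE),
                  z = sample(letters[1:3], 1000, replace = TRUE))

# theoretical normal quantiles computed separately for each y/z combination
df4 <- df1
df4$xq <- ave(df4$vals, df4$y, df4$z,
              FUN = function(v) qqnorm(v, plot.it = FALSE)$x)

head(df4)
```

The result has the columns vals, y, z and xq, so it can be passed to the plotting code unchanged.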

answered 2015-09-02T17:23:24.113