r - How to combine geom_bar for three data frames?

Question

Suppose I have:

a = data.frame(a = sample(1:10, 20, replace = T))
b = data.frame(b = sample(1:11, 19, replace = T))
c = data.frame(c = sample(1:9, 21, replace = T))

a.a = ggplot(data = a, aes(a)) + geom_bar()
b.b = ggplot(data = b, aes(b)) + geom_bar()
c.c = ggplot(data = c, aes(c)) + geom_bar()

How can I combine a.a, b.b and c.c into one plot?

I have tried

d = ggplot() + 
  geom_bar(data = a.a, aes(a)) +
  geom_bar(data = b.b, aes(b)) +
  geom_bar(data = c.c, aes(c))
d

But it doesn't work...

score 1 · Accepted Answer

Combine them into a single "long" data frame that has a grouping column marking which data frame each row came from.

library(ggplot2)
library(reshape2)
library(dplyr)

# Individual data frames
a = data.frame(a = sample(1:10, 20, replace = T))
b = data.frame(b = sample(1:11, 19, replace = T))
c = data.frame(c = sample(1:9, 21, replace = T))

Combine the data frames in "long" format. They have different numbers of rows, so the new grouping variable (called data_source below) must repeat each data frame's name once for every row of that data frame. The rep function takes care of this. One way is rep(c("a","b","c"), times=c(nrow(a), nrow(b), nrow(c))); I use sapply below because it seemed cleaner (though perhaps more opaque).

df = data.frame(value =c(a$a,b$b,c$c), 
                data_source=rep(c("a","b","c"), times=sapply(list(a,b,c), nrow)))
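
To see how the times argument lines the labels up with the rows, here is a minimal sketch on two tiny, made-up data frames (x, y, n_rows, labels and small are illustrative names, not part of the question's data):

```r
# Two small illustrative data frames with different numbers of rows
x <- data.frame(x = c(1, 2, 3))
y <- data.frame(y = c(4, 5))

# Row count of each data frame, in the same order as the labels below
n_rows <- sapply(list(x, y), nrow)

# Each label is repeated once per row of its data frame
labels <- rep(c("x", "y"), times = n_rows)

# Stack the values and attach the matching source label to every row
small <- data.frame(value = c(x$x, y$y), data_source = labels)
print(small)
```

The order of the data frames in list(...) has to match the order of the names given to rep, otherwise the labels end up on the wrong rows.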

# Pre-summarise counts in order to add zero counts for empty categories
df.summary = df %>% group_by(data_source, value) %>%
  tally %>%
  dcast(data_source ~ value, value.var="n", fill=0) %>%
  melt(id.var="data_source", variable.name="value", value.name="n")

ggplot(df.summary, aes(value, n, fill=data_source)) + 
  geom_bar(stat="identity", position="dodge", colour="grey20", lwd=0.3)
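
As a possible alternative to the dcast/melt round trip, base R's table() already reports a zero for every combination of its factors that never occurs, so converting the table to a data frame gives the same kind of summary. This is only a sketch and not part of the original answer; it uses a small fixed data set (df_demo and counts are illustrative names) so that the zero counts are predictable:

```r
# Small fixed data set in the same "long" layout as df above
df_demo <- data.frame(
  value       = c(1, 1, 2, 2, 3),
  data_source = c("a", "b", "a", "a", "b")
)

# table() cross-tabulates every data_source/value pair, including pairs
# that never occur; as.data.frame() turns the table back into long format
# with the counts stored in a column named n
counts <- as.data.frame(
  table(data_source = df_demo$data_source, value = df_demo$value),
  responseName = "n"
)
print(counts)
```

The result has one row per data_source/value pair, with value stored as a factor, so it can be passed to the same ggplot call that uses stat="identity".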

If no category had a zero count, we could skip the summary step (that is not the case here: for example, in the sample I drew, data frames b and c have no values equal to 10) and just do this:

ggplot(df, aes(factor(value), fill=data_source)) + 
  geom_bar(position="dodge", colour="grey20", lwd=0.3)

But note that ggplot then widens the remaining bars whenever one or two data frames don't contain a given value, so bar widths vary from group to group.
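
If the only concern is the changing bar width, the dodge position can be asked to keep every bar at the width of a single bar. A sketch, assuming a ggplot2 version in which position_dodge() has the preserve argument; this is not part of the original answer, and df_demo and p are illustrative names:

```r
library(ggplot2)

# Small fixed data set: value 2 is missing from "b", value 3 from "a"
df_demo <- data.frame(
  value       = c(1, 1, 2, 2, 3),
  data_source = c("a", "b", "a", "a", "b")
)

# preserve = "single" keeps each bar as wide as one bar in a full group,
# instead of stretching it over the space of the missing groups
p <- ggplot(df_demo, aes(factor(value), fill = data_source)) +
  geom_bar(position = position_dodge(preserve = "single"), colour = "grey20")
print(p)
```

This fixes only the width. It does not reserve an empty slot for a missing group, so pre-summarising with zero counts is still the way to keep each data frame's bar in a fixed position.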
