I have some data from an R course. The professor added each line more or less manually using base graphics; I'd like to do it with ggplot2.

So far I've created a faceted plot in ggplot2 with scatter plots of hunger in different regions, and separately fitted a model to the data. The model has interaction terms between the x variable in the plot (Year) and the group/colour variable (WHO.region).

What I want to do now is plot the lines resulting from that model, one per panel. I could do this with geom_abline, defining the slope and the intercept each as the sum of two of the coefficients (the categorical group variable is coded as 0/1 dummies, so in each panel only that region's terms are multiplied by 1), but this seems cumbersome.
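To make the coefficient arithmetic concrete, here is a minimal sketch on made-up data. The `toy` data frame, its `region` column, and all values are hypothetical stand-ins, not the hunger data; only base R is assumed.

```r
# Hypothetical toy data; names and values are illustrative, not the hunger data.
set.seed(1)
toy <- data.frame(
  Year   = rep(1990:1999, times = 3),
  region = factor(rep(c("A", "B", "C"), each = 10))
)
base  <- c(10, 15, 7)[as.integer(toy$region)]         # per-region starting level
trend <- c(-0.5, -0.2, -0.8)[as.integer(toy$region)]  # per-region yearly change
toy$Numeric <- base + trend * (toy$Year - 1990) + rnorm(nrow(toy), sd = 0.3)

# Year * region expands to Year + region + Year:region
fit <- lm(Numeric ~ Year * region, data = toy)
cf  <- coef(fit)
lv  <- levels(toy$region)

# The reference level ("A") uses the base intercept and slope;
# every other level adds its own dummy and interaction coefficient.
lines_df <- data.frame(
  region    = lv,
  intercept = unname(cf["(Intercept)"] + c(0, cf[paste0("region", lv[-1])])),
  slope     = unname(cf["Year"] + c(0, cf[paste0("Year:region", lv[-1])]))
)
print(lines_df)
```

A data frame like `lines_df` can be handed to geom_abline through its `data` argument, with `intercept` and `slope` mapped inside `aes()`; if it also carries the faceting column, each line should land in its own panel.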

I tried passing the same formula I used in lm to stat_smooth, with no luck: I get an error.

Ideally, I'd like to pass the formula to stat_smooth somehow and have ggplot2 do all the work. How would one go about it?

library(ggplot2)
download.file("https://sparkpublic.s3.amazonaws.com/dataanalysis/hunger.csv",
              "hunger.csv", method = "curl")
hunger <- read.csv("hunger.csv")
hunger <- hunger[hunger$Sex!="Both sexes",]
hunger_small <- hunger[hunger$WHO.region!="WHO Non Members",c(5,6,8)]
q <- qplot(x = Year, y = Numeric, data = hunger_small,
           color = WHO.region) + theme(legend.position = "bottom")
q <- q + facet_grid(. ~ WHO.region) + guides(col = guide_legend(nrow = 2))
q

# I could add the standard lm line from stat_smooth, but I don't want that:
# q <- q + geom_smooth(method = "lm", se = FALSE)

# I want to add the line(s) from the lm fit below; it is really one line per panel.
# Year * WHO.region expands to Year + WHO.region + Year:WHO.region
lmRegion <- lm(Numeric ~ Year * WHO.region, data = hunger)

# I also drew the lines with a loop, as below, but that puts them all in one panel;
# I have not been able to do the same with facets.
# I used a function I found to reproduce ggplot2's default colours.

ggplotColours <- function(n=6, h=c(0, 360) +15) {
  if ((diff(h)%%360) < 1) h[2] <- h[2] - 360/n
  hcl(h = (seq(h[1], h[2], length = n)), c = 100, l = 65)
}

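# Number of regions in the fitted model; unique() also works when
# WHO.region is read as character (the read.csv default since R 4.0)
n <- length(unique(hunger$WHO.region))
q <- qplot(x = Year, y = Numeric, data = hunger_small,
           color = WHO.region) + theme(legend.position = "bottom")
# Reference region: base intercept and slope
q <- q + geom_abline(intercept = lmRegion$coefficients[1],
                     slope = lmRegion$coefficients[2],
                     color = ggplotColours(n = n)[1])
# Other regions: add the region offset (coefficient 1 + i) to the intercept
# and the Year:region interaction (coefficient n + i) to the slope
for (i in 2:n) {
  q <- q + geom_abline(intercept = lmRegion$coefficients[1] +
                         lmRegion$coefficients[1 + i],
                       slope = lmRegion$coefficients[2] +
                         lmRegion$coefficients[n + i],
                       color = ggplotColours(n = n)[i])
}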
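
An alternative to summing coefficients by hand is to let predict() evaluate the fitted model and draw the result with geom_line(), which is split across facets like any other layer. This is only a sketch under stated assumptions: ggplot2 is installed, and `toy` is a hypothetical stand-in whose column names mirror hunger_small but whose values are made up.

```r
library(ggplot2)

# Hypothetical stand-in for hunger_small; values are made up.
set.seed(42)
regions <- c("Africa", "Americas", "Europe")
toy <- data.frame(
  Year       = rep(1990:2010, times = 3),
  WHO.region = factor(rep(regions, each = 21))
)
toy$Numeric <- 30 - 0.4 * (toy$Year - 1990) * as.integer(toy$WHO.region) +
  rnorm(nrow(toy))

# Same model structure as in the question: main effects plus interaction
fit <- lm(Numeric ~ Year * WHO.region, data = toy)

# One fitted value per row, taken from the single interaction model
toy$fitted <- predict(fit, newdata = toy)

p <- ggplot(toy, aes(x = Year, y = Numeric, colour = WHO.region)) +
  geom_point() +
  geom_line(aes(y = fitted)) +   # model line, drawn separately in each panel
  facet_grid(. ~ WHO.region) +
  theme(legend.position = "bottom")
p
```

Because the fitted values live in the same data frame as the points, the colours and panels line up without a separate colour helper.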

1 Answer

If you have categorical data,

geom_point()

will not work, but

geom_boxplot()

will.

ggplot(hunger, aes(x = sex, y = hunger)) + geom_boxplot() + labs(x = "sex") + geom_smooth(method = "lm", se = FALSE, col = "blue")

Susy
answered 2017-08-09T02:52:40.753