
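I have a data frame in R named x with several hundred rows. Each row is a person. I have two variables: Height, which is continuous, and Country, which is a factor. I want to plot a smoothed histogram of all individuals' heights, stratified by Country. I know I can do this with the following code: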

library(ggplot2)
ggplot(x, aes(x=Height, colour = (Country == "USA"))) + geom_density()

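This plots everyone from the USA in one colour (TRUE) and everyone from any other country in another colour (FALSE). What I actually want, however, is to plot everyone from the USA in one colour and everyone from Oman, Nigeria, and Switzerland in another colour. How would I adapt my code to do this?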


2 Answers


I used the built-in iris data set for illustration, with Species standing in for Country:

head(iris)
table(iris$Species)
df <- iris
df$Species2 <- ifelse(df$Species == "setosa", "blue", 
               ifelse(df$Species == "virginica", "red", ""))

library(ggplot2)
p <- ggplot(df, aes(x = Sepal.Length, colour = (Species == "setosa")))
p + geom_density() # Your example

example with true and false

# Now let's choose the other created column
p <- ggplot(df, aes(x = Sepal.Length, colour = Species2))
p + geom_density() + facet_wrap(~Species2)

example with extra column

Edit: to get rid of the "countries" that you don't want in the plot, just subset them out of the data frame you use in the plot (note that the legend labels "blue" and "red" don't exactly match the colours actually drawn, but that can be changed within the data frame itself):

p <- ggplot(df[df$Species2 %in% c("blue", "red"),], aes(x = Sepal.Length, colour = Species2))
p + geom_density() + facet_wrap(~Species2)

example with filtered data frame

And to overlay the lines, just take out the facet_wrap:

p + geom_density() 

example without facet_wrap
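
Applied to the data described in the question, the same idea might look like the sketch below. The data frame x here is simulated, since the original data was not posted, and the Group column and the x_sub name are hypothetical helpers introduced only for illustration.

```r
library(ggplot2)

# Simulated stand-in for the asker's data frame `x` (values are made up)
set.seed(1)
x <- data.frame(
  Height  = rnorm(400, mean = 170, sd = 10),
  Country = factor(sample(c("USA", "Oman", "Nigeria", "Switzerland", "France"),
                          400, replace = TRUE))
)

# Group (hypothetical column): "USA", the three pooled countries, or NA otherwise
x$Group <- ifelse(x$Country == "USA", "USA",
           ifelse(x$Country %in% c("Oman", "Nigeria", "Switzerland"),
                  "Oman/Nigeria/Switzerland", NA))

# Drop rows that belong to neither group, then overlay the two density curves
x_sub <- x[!is.na(x$Group), ]
p <- ggplot(x_sub, aes(x = Height, colour = Group)) + geom_density()
p
```

Storing the grouping as a column rather than as an expression inside aes() keeps the legend labels readable and makes the subsetting step explicit.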

Answered 2015-03-05T15:48:40.593

I enjoyed working through the excellent answer above. Here is my modification.

df <- iris
df$Species2 <- ifelse(df$Species == "setosa", "blue", 
           ifelse(df$Species == "virginica", "red", ""))
homes2006 <- df

names(homes2006)[names(homes2006)=="Species"] <- "ownership"
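# sapply() with gsub() returns a character matrix, so every column becomes text
homes2006a <- as.data.frame(sapply(homes2006, gsub,
                                   pattern = "setosa",
                                   replacement = "renters"))
homes2006b <- as.data.frame(sapply(homes2006a, gsub,
                                   pattern = "virginica",
                                   replacement = "home-owners"))
homes2006c <- as.data.frame(sapply(homes2006b, gsub,
                                   pattern = "versicolor",
                                   replacement = "home-owners"))

## Sepal.Length is no longer numeric after sapply() (character, or a factor in R < 4.0.0);
## go through as.character() so the values are recovered rather than factor level codes
homes2006c[,1] <- as.numeric(as.character(homes2006c[,1]))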

library(ggplot2)

p <- ggplot(homes2006c, aes(x = Sepal.Length, 
           colour = (ownership == "home-owners")))

p + ylab("number of households") +
xlab("monthly income (NIS)") +
ggtitle("income distribution by home ownership") +
geom_density()
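
The recoding step could probably also be done without sapply() and gsub(), which avoids the numeric columns being converted to text in the first place. A minimal sketch on the same iris stand-in data; the homes name is hypothetical, and the axis label and title are copied from the code above:

```r
library(ggplot2)

homes <- iris
# Recode the factor directly; the numeric columns are left untouched
homes$ownership <- ifelse(homes$Species == "setosa", "renters", "home-owners")

p <- ggplot(homes, aes(x = Sepal.Length, colour = ownership)) +
  geom_density() +
  xlab("monthly income (NIS)") +
  ggtitle("income distribution by home ownership")
p
```

Mapping colour to the ownership column itself, instead of to a logical comparison, gives a legend that reads "home-owners" and "renters" rather than TRUE and FALSE.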


Answered 2017-07-11T19:59:34.110