
I combined several data-frames into a data-frame dfc with a fifth column called model specifying which model was used for imputation. I want to plot the distributions by grouping them by model.

dfc looks something like: (1000 rows, 5 columns)

X1        X2       X3      X4       model
1500000   400000   0.542    7.521   actual
250000    32000    2.623   11.423   missForest
...

I use the lines below to plot:

library(lattice)
densityplot(X1 + X2 + X3 + X4, group = dfc$model)

giving:

[image: comparison]

Note that X1 <- dfc$X1 (and likewise for X2, X3 and X4).

My questions are:

  • How can I add a legend to this plot? (this plot is useless if one can't tell which colour belongs to which model)
  • Is there, perhaps, a more visually appealing way to plot this? Using ggplot, perhaps?
  • Is there a better way to compare these models? For example, I could plot for each column separately.

3 Answers


A quick density plot using ggplot.

library(ggplot2)
library(reshape2)
a <- rnorm(50)
b <- runif(50, min = -5, max = 5)
c <- rchisq(50, 2)

data <- data.frame(rnorm = a, runif = b, rchisq = c)
data <- melt(data)  # melt() comes from reshape2; converts wide data to long format

ggplot(data) + geom_density(aes(value, color = variable)) + 
               geom_jitter(aes(value, 0, color = variable), alpha = 0.5, height = 0.02 ) 


Note: I added the reshape2 package because ggplot prefers "long" data, and I assume your data is "wide".
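For data shaped like the asker's dfc, which already has a model column, the reshaping step might look like the sketch below. The dfc here is simulated (an assumption standing in for the real data), and the names long and p are only illustrative. Passing id.vars = "model" keeps the model label on every row while X1 to X4 are stacked into variable/value columns.

```r
library(ggplot2)
library(reshape2)

# Simulated stand-in for dfc (assumption: the real data has X1..X4 plus model)
set.seed(1)
dfc <- data.frame(
  X1 = rnorm(200), X2 = rnorm(200), X3 = rnorm(200), X4 = rnorm(200),
  model = rep(c("actual", "missForest"), each = 100)
)

# Keep model as the identifier; stack X1..X4 into variable/value columns
long <- melt(dfc, id.vars = "model")

# One panel per original column, one density curve per model;
# ggplot adds the colour legend for model automatically
p <- ggplot(long, aes(value, color = model)) +
  geom_density() +
  facet_wrap(~ variable, scales = "free")
print(p)
```

Free scales are used because, in the question, the columns differ by several orders of magnitude.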

Plotting each column separately would work like this:

# The + must end each line; a line that starts with + is parsed as a new expression
ggplot(data) + geom_density(aes(value, color = variable)) +
               geom_point(aes(value, 0, color = variable)) +
               facet_grid(. ~ variable)


The color is arguably redundant here, so you can drop the color argument.

answered 2016-06-16T08:32:38.210

Data copied from @alex:

library(ggplot2)
library(reshape2)
a <- rnorm(50)
b <- runif(50, min = -5, max = 5)
c <- rchisq(50, 2)

dat <- data.frame(Hmisc = a, MICE = b, missForest = c)
dat <- melt(dat)

library(lattice)  # using the lattice package
densityplot(~ value, dat, groups = variable, auto.key = TRUE)


Individual plots:

densityplot(~ value | variable, dat, groups = variable, auto.key = TRUE,
            scales = list(relation = "free"))

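If the data is still in wide form with a model column, as in the question, lattice can also take the columns directly through its extended formula interface, without melting first. The sketch below uses a simulated dfc (an assumption, not the asker's data) and a placeholder name p. Note the leading ~ in the formula: with groups supplied, each of X1 to X4 gets its own panel and the curves within a panel are split by model.

```r
library(lattice)

# Simulated stand-in for dfc (assumption: the real data has X1..X4 plus model)
set.seed(1)
dfc <- data.frame(
  X1 = rnorm(200), X2 = rnorm(200), X3 = rnorm(200), X4 = rnorm(200),
  model = rep(c("actual", "missForest"), each = 100)
)

# Extended formula: several terms on the right-hand side.
# outer = TRUE puts each column in its own panel; groups splits curves by model;
# auto.key = TRUE draws the legend.
p <- densityplot(~ X1 + X2 + X3 + X4, data = dfc, groups = model,
                 outer = TRUE, auto.key = TRUE,
                 scales = list(relation = "free"))
print(p)
```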

answered 2016-06-17T05:23:22.497

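All I had to do was set one argument:

densityplot(X1 + X2 + X3 + X4, group = dfc$model, auto.key = TRUE)

gives the desired plot.

This is basically what I needed.

The problem was that I couldn't figure out which densityplot() R was using.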
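One way to check which densityplot() is in use is to ask R where the function lives. This is a minimal sketch that assumes only lattice is attached; the variable names pkg, x and p are placeholders.

```r
library(lattice)

# Name of the namespace that the currently visible densityplot() was defined in
pkg <- environmentName(environment(densityplot))
print(pkg)

# Every attached location that provides an object named densityplot
print(find("densityplot"))

# Methods registered for the generic (formula, numeric, ...)
print(methods("densityplot"))

# Calling through an explicit namespace removes any ambiguity
set.seed(1)
x <- rnorm(100)
p <- lattice::densityplot(~ x)
print(p)
```

If find() reports more than one location, the first entry is the one an unqualified call resolves to.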

The other parts of the question remain open.

answered 2016-06-17T05:04:44.527