
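I am trying to create a density curve in R from a set of random numbers between 0 and 1000, and shade the part that is less than or equal to a certain value. There are plenty of solutions involving geom_area or geom_ribbon, but they all require a y value, which I don't have (the data is just a vector of 1000 numbers). Any ideas on how I can do this?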

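Two other related questions:

  1. Is it possible to do the same thing for a cumulative density function (I'm currently using stat_ecdf to generate one), or to shade it entirely?
  2. Is there any way to edit geom_vline so that it only goes up to the height of the density curve, rather than spanning the whole y axis?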

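Code: (the geom_area layer is a failed attempt at adapting some code I found. If I set ymax manually, I just get a column that takes up the whole plot instead of only the area under the curve.)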

set.seed(100)

amount_spent <- rnorm(1000,500,150)
amount_spent1<- data.frame(amount_spent)
rand1 <- runif(1,0,1000)
amount_spent1$pdf <- dnorm(amount_spent1$amount_spent)

mean1 <- mean(amount_spent1$amount_spent)

#density/bell curve
ggplot(amount_spent1,aes(amount_spent)) +
   geom_density( size=1.05, color="gray64", alpha=.5, fill="gray77") +
   geom_vline(xintercept=mean1, alpha=.7, linetype="dashed", size=1.1, color="cadetblue4")+
   geom_vline(xintercept=rand1, alpha=.7, linetype="dashed",size=1.1, color="red3")+
   geom_area(mapping=aes(ifelse(amount_spent1$amount_spent > rand1,amount_spent1$amount_spent,0)), ymin=0, ymax=.03,fill="red",alpha=.3)+
   ylab("")+ 
   xlab("Amount spent on lobbying (in Millions USD)")+
   scale_x_continuous(breaks=seq(0,1000,100))

1 Answer


A couple of questions show how to do this (here and here), but they calculate the density before plotting.
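For comparison, the compute-first approach could look roughly like the sketch below. It is not taken from the linked questions; it assumes base R's density() and approx(), and the names cutoff, dens_df, shade_df and p_pre are made up for illustration. The fixed cutoff of 400 is a hypothetical stand-in for rand1, and the shaded side is "less than or equal to", as the question asks.

```r
library(ggplot2)

# Same simulated data as in the question
set.seed(100)
amount_spent <- rnorm(1000, 500, 150)

# Hypothetical cutoff, standing in for rand1
cutoff <- 400

# Kernel density estimate computed up front, stored as x/y pairs
dens <- density(amount_spent)
dens_df <- data.frame(x = dens$x, y = dens$y)

# Height of the estimated curve at the cutoff (linear interpolation)
cutoff_height <- approx(dens_df$x, dens_df$y, xout = cutoff)$y

# Points at or below the cutoff, closed off exactly at the cutoff
shade_df <- rbind(
  subset(dens_df, x <= cutoff),
  data.frame(x = cutoff, y = cutoff_height)
)

p_pre <- ggplot(dens_df, aes(x, y)) +
  geom_line() +
  geom_area(data = shade_df, fill = "red", alpha = 0.3) +
  # a single segment that stops at the curve instead of a full-height vline
  annotate("segment", x = cutoff, xend = cutoff,
           y = 0, yend = cutoff_height, linetype = "dashed")

p_pre
```

Because the y values exist before plotting, geom_area has everything it needs, which is the piece the question was missing.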

Here is another way, probably more complicated than it needs to be, that lets ggplot do some of the calculations for you.

# Your data
set.seed(100)
amount_spent1 <- data.frame(amount_spent=rnorm(1000, 500, 150))

mean1 <- mean(amount_spent1$amount_spent)
rand1 <- runif(1,0,1000)

Basic density plot

p <- ggplot(amount_spent1, aes(amount_spent)) +
          geom_density(fill="grey") +
          geom_vline(xintercept=mean1) 

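You can use ggplot_build to extract the x and y positions of the area to shade from the plot object. Linear interpolation is used to get the y value at x=rand1.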

# subset region and plot
d <- ggplot_build(p)$data[[1]]

p <- p + geom_area(data = subset(d, x > rand1), aes(x=x, y=y), fill="red") +
          geom_segment(x=rand1, xend=rand1, 
                       y=0, yend=approx(x = d$x, y = d$y, xout = rand1)$y,
                       colour="blue", size=3)
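The two related questions are not covered by the code above beyond the shortened line. As a rough sketch only, a similar idea might be applied to the cumulative curve by evaluating base R's ecdf() on a grid that ends at the cutoff and shading under those points. The names cutoff, Fn, ecdf_shade and p_ecdf are hypothetical, and the fixed cutoff of 400 stands in for rand1.

```r
library(ggplot2)

# Same simulated data as above
set.seed(100)
amount_spent1 <- data.frame(amount_spent = rnorm(1000, 500, 150))

# Hypothetical cutoff, standing in for rand1
cutoff <- 400

# Empirical CDF as a step function (base R)
Fn <- ecdf(amount_spent1$amount_spent)

# Evaluate the ECDF on a grid that stops at the cutoff
grid <- seq(min(amount_spent1$amount_spent), cutoff, length.out = 200)
ecdf_shade <- data.frame(x = grid, y = Fn(grid))

p_ecdf <- ggplot(amount_spent1, aes(amount_spent)) +
  # shaded region under the cumulative curve, up to the cutoff
  geom_area(data = ecdf_shade, aes(x = x, y = y), fill = "red", alpha = 0.3) +
  stat_ecdf() +
  # vertical marker that only rises to the curve
  annotate("segment", x = cutoff, xend = cutoff,
           y = 0, yend = Fn(cutoff), linetype = "dashed")

p_ecdf
```

The shaded outline is drawn by connecting the grid points, so it only approximates the step shape of stat_ecdf; a finer grid brings the two closer. To shade the whole curve, the grid could run to the maximum of the data instead of the cutoff.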


answered 2015-07-04T02:25:14.997