EDIT: rephrased question for clarity of what I was wanting to achieve.

I have an observed dataset from which I want to use some information to feed into a Monte Carlo simulation. I'm using R for this study.

For example, 8/8 individuals in my observed dataset have a particular characteristic.

What I want to do is use the sampling distribution from this observed data to choose some possible population proportions to feed into a random number generator, whereby I can then generate some simulated counts (where I also need to use a larger denominator).

The observed data and the 95% confidence interval are as follows:

binom.test(8, 8)
## gives point estimate of 1 and 95% CI 0.63, 1

I would then want to take (e.g.) 1000 random draws from this sampling distribution to feed into a random binary outcome generator for a larger denominator (e.g. 12 trials per iteration). Let’s say the first random draw was a 0.75 chance of having an event (code below is just illustrating a single iteration):

set.seed(456)    
rbinom(1, 12, 0.75)
## Gives a count of 11 events out of 12 for this single iteration.

My question then is how to get R to draw the probabilities from the observed data’s sampling distribution (i.e. 95% of these drawn probabilities should fall between 0.63 and 1, with a shape as defined by the underlying statistical theory), which I can then use to generate random counts with a larger denominator (probably using rbinom).

EDIT: My original post was more convoluted and confusing: I hadn’t fully thought through the implications of rbinom using a population parameter, even though I was pretty sure that this was the source of my "problem" with rbinom. Thanks to DavidRobinson and DWin for comments/answers that clarified my answer as well as my revised question...

2 Answers

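You are confused... because your first question was nonsense... and this is the wrong venue for that discussion. There are many theoretical populations that could reasonably, or even unreasonably, give rise to an observed series of 8/8 Bernoulli draws from a binomial population. Suppose you had 99 black balls and one white ball in an urn. Getting 8/8 black balls in 8 draws with replacement would be quite reasonable. The probability of such a sequence is (99/100)^8 = 0.923.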
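A minimal sketch of that arithmetic, plus the same calculation for a few other population proportions to show that 8/8 does not pin down a single population. The names p.black and candidate.p, and the particular proportions tried, are illustrative assumptions and not part of the original answer.

```r
## Exact probability of 8 black balls in 8 draws with replacement.
p.black <- 99 / 100
p.black^8
dbinom(8, size = 8, prob = p.black)   # same value via the binomial pmf

## Probability of observing 8/8 under some other (hypothetical) proportions.
candidate.p <- c(0.70, 0.80, 0.90, 0.99)
p.all.eight <- candidate.p^8
round(p.all.eight, 3)
```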

This code shows how that works "in practice" in R:

> set.seed(123)
> sum(rbinom(10000, 8, .99)==8)
[1] 9263

So in this simulation, 92.63% of the sequences of 8 draws had all 8 balls black. Now reconsider what you are asking, and ask more questions like this (on stats.stackexchange).

answered 2012-12-18T08:41:26.440

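This answer was prompted by a comment from @DavidRobinson (thanks!), who suggested deriving a posterior distribution for the probability from my observed data.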

Code adapted from p. 42 of Hoff, PD (2009), A First Course in Bayesian Statistical Methods, Springer, NY.

## Set a uniform prior.
a <- 1; b <- 1
## Set observed data.
n <- 8; y <- 8

## Posterior 95% credible interval (equal-tailed):
qbeta(c(.025, .975), a+y, b+n-y)
## returns [1] 0.6637329 0.9971909

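This is very close to the confidence interval based on the binomial distribution; it differs slightly because of the influence of the prior.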

binom.test(8, 8)
## returns 95% CI of 0.6305834 1.0000000
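When y = n, both intervals have simple closed forms, which makes the comparison concrete. What follows is only a sketch, assuming the uniform prior above and that binom.test reports the Clopper-Pearson interval; the names post.ci and cp.lower are illustrative. The posterior Beta(9, 1) has CDF x^9, so its quantiles are 9th roots, while the Clopper-Pearson lower bound for 8/8 is an 8th root.

```r
a <- 1; b <- 1     # uniform prior
n <- 8; y <- 8     # observed data

## Posterior is Beta(a + y, b + n - y) = Beta(9, 1), whose CDF is x^9,
## so the q-th quantile is q^(1/9).
post.ci <- c(.025, .975)^(1 / (a + y))
all.equal(post.ci, qbeta(c(.025, .975), a + y, b + n - y))

## Clopper-Pearson lower bound when y == n is (alpha/2)^(1/n).
cp.lower <- .025^(1 / n)
all.equal(cp.lower, binom.test(y, n)$conf.int[1])
```

The only difference between the two lower bounds is the exponent (1/9 versus 1/8), which is where the extra "pseudo-observation" from the prior shows up.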

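Now I can draw a set of random probabilities from this posterior distribution to generate some counts. I will use just five draws here for illustration.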

set.seed(9876)
n.draws <- 5

## Use rbeta to get n.draws from posterior distribution.
drawn.probs <- rbeta(n.draws, a+y, b+n-y)

## Now I can use these drawn probabilities in rbinom to get simulated counts.
rbinom(n.draws, 12, drawn.probs)
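Scaled up to the 1000 draws and 12 trials per iteration mentioned in the question, the same two steps might look like the sketch below. The names n.trials, sim.counts, exact.pmf and sim.freq are illustrative, and the comparison against the beta-binomial distribution is an added sanity check (a binomial count whose probability is Beta-distributed is beta-binomial), not something taken from Hoff's code.

```r
## Sketch only: n.trials, sim.counts, exact.pmf and sim.freq are illustrative names.
set.seed(9876)
a <- 1; b <- 1          # uniform prior, as above
n <- 8; y <- 8          # observed data, as above
n.draws  <- 1000        # number of simulated iterations
n.trials <- 12          # larger denominator per iteration

## Step 1: draw probabilities from the Beta(a + y, b + n - y) posterior.
drawn.probs <- rbeta(n.draws, a + y, b + n - y)

## Step 2: one binomial count per drawn probability (rbinom is vectorised over prob).
sim.counts <- rbinom(n.draws, n.trials, drawn.probs)

## Share of drawn probabilities inside the 95% posterior interval (should be near 0.95).
ci <- qbeta(c(.025, .975), a + y, b + n - y)
mean(drawn.probs >= ci[1] & drawn.probs <= ci[2])

## Simulated vs. exact (beta-binomial) distribution of the counts.
k <- 0:n.trials
exact.pmf <- choose(n.trials, k) *
  beta(k + a + y, n.trials - k + b + n - y) / beta(a + y, b + n - y)
sim.freq <- as.vector(table(factor(sim.counts, levels = k))) / n.draws
round(cbind(k, exact.pmf, sim.freq), 3)
```

Drawing a fresh probability for every iteration is what carries the uncertainty about the population proportion into the simulated counts; a single fixed prob in rbinom would understate their spread.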

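Thanks for the comments/answers: they made me realise that the problem was not just in how I was trying to use rbinom, but that I was missing an intermediate step.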

answered 2012-12-18T21:11:09.697