Let me start with a simple example.

Suppose there is a universe of 10 people, of whom 1 owns product A and 2 own product B:
U=10, A=1, B=2

Now I want to find the following probabilities:
1) A person owns no product ==> (1 - 1/10) * (1 - 2/10) = 0.72
2) A person owns at least 1 product ==> 1 - ((1 - 1/10) * (1 - 2/10)) = 0.28
3) A person owns 2 products ==> (1/10) * (2/10) = 0.02
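
To make the arithmetic above concrete, here is a minimal R sketch of the same three calculations. It assumes, as the formulas do, that owning one product is independent of owning another; the variable names (`q`, `p_none`, `p_at_least_one`, `p_all`) are only illustrative.

```r
U <- 10
owned_by <- c(A = 1, B = 2)

## probability that a given person owns each product
q <- owned_by / U

p_none         <- prod(1 - q)   # owns no product:         0.9 * 0.8 = 0.72
p_at_least_one <- 1 - p_none    # owns at least 1 product: 0.28
p_all          <- prod(q)       # owns both products:      0.1 * 0.2 = 0.02

c(p_none, p_at_least_one, p_all)
```

Because `prod()` works on vectors of any length, the "none", "at least one" and "all" cases carry over to n products unchanged; the intermediate cases (exactly 1, at least 2, and so on) are the ones that need a more general method.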

However, with n products, I would like a general algorithm that works out all of these cases.

The input is as follows:
U <- 10
products <- c('A','B')
owned_by <- c(1, 2)
df <- data.frame(products, owned_by)

1 Answer

I think a closed-form solution would involve too many terms and become overly complex as the number of products grows, so this problem looks like a good candidate for the Monte Carlo method.

set.seed(1984)

U <- 10
products <- c('A','B')
owned_by <- c(1,2) 
df <- data.frame(products, owned_by)
p <- rep(0, nrow(df) + 1)  ## running totals for P(0 products), P(>= 1), ..., P(>= n)
num.runs <- 1000

for (n in seq_len(num.runs))
{
  x <- c()  ## IDs of product owners; a person appears once per product owned
  for (i in seq_len(nrow(df)))
    x <- c(x, sample(1:U, df$owned_by[i]))

  ## number of products owned by each of the U people
  counts <- tabulate(x, nbins = U)

  ## fraction of people who own no product, then at least 1, 2, ... products
  p[1] <- p[1] + sum(counts == 0) / U
  for (i in seq_len(nrow(df)))
    p[i+1] <- p[i+1] + sum(counts >= i) / U
}

p <- p / num.runs  ## average over all runs
p

## 0.7197 0.2803 0.0197
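
As a cross-check on the simulated values, the exact numbers can also be computed one product at a time. The simulation draws each product's owners independently, so for any one person the ownership of different products is independent with probability owned_by / U. Under that assumption, the following is a rough sketch; `exact_ownership` is a hypothetical helper name, not part of any package.

```r
## Exact distribution of the number of products a single person owns,
## assuming independent ownership with probability owned_by / U per product.
exact_ownership <- function(U, owned_by) {
  q <- owned_by / U
  dist <- 1  # before any product is considered, P(0 products) = 1
  for (qi in q) {
    ## the person either does not own this product (count unchanged)
    ## or does own it (count shifts up by one)
    dist <- c(dist * (1 - qi), 0) + c(0, dist * qi)
  }
  names(dist) <- 0:length(q)
  dist
}

dist <- exact_ownership(10, c(1, 2))  # P(exactly 0), P(exactly 1), P(exactly 2)
at_least <- rev(cumsum(rev(dist)))    # P(at least k products), k = 0, 1, 2
c(dist[1], at_least[-1])              # 0.72 0.28 0.02
```

The result is laid out like `p` in the simulation: the first entry is the probability of owning nothing, and the remaining entries are the probabilities of owning at least 1, 2, ... products. Each product adds one pass over a vector of at most n + 1 entries, so this stays cheap for large n.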
answered 2013-05-07T16:50:15.473