2

You have an empty ice cube tray which has n little ice cube buckets, forming a natural hash space that's easy to visualize.

Your friend has k pennies which he likes to put in ice cube trays. He uses a random number generator repeatedly to choose which bucket to put each penny. If the bucket determined by the random number is already occupied by a penny, he throws the penny away and it is never seen again.

Say your ice cube tray has 100 buckets (i.e, would make 100 ice cubes). If you notice that your tray has c=80 pennies, what is the most likely number of pennies (k) that your friend had to start out with?

If c is low, the odds of collisions are low enough that the most likely number of k == c. E.g. if c = 3, then it's most like that k was 3. However, the odds of a collision are increasingly likely, after say k=14 then odds are there should be 1 collision, so maybe it's maximally likely that k = 15 if c = 14.

Of course if n == c then there would be no way of knowing, so let's set that aside and assume c < n.

What's the general formula for estimating k given n and c (given c < n)?

4

1 回答 1

0

目前的问题是不适定的。

设 n 为托盘数。
让 X 成为您朋友开始时的便士数量的随机变量。
令 Y 为填充托盘数量的随机变量。

您要的是分布 P(X|Y=c) 的模式。
(或者可能期望 E[X|Y=c] 取决于您如何解释您的问题。)

让我们举一个非常简单的例子:分布 P(X|Y=1)。然后

P(X=k|Y=1) = (P(Y=1|X=k) * P(X=k)) / P(Y=1)
= (1/n k-1 * P(X= k)) / P(Y=1)

由于 P(Y=1) 是归一化常数,我们可以说 P(X=k|Y=1) 与 1/n k-1 * P(X=k) 成正比。

但是 P(X=k) 是先验概率分布。您必须假设您的朋友必须开始使用的硬币数量的一些概率分布。

例如,这里有两个我可以选择的先验:

  1. 我先前的信念是,对于 k > 0 ,P(X=k) = 1/2 k 。
  2. 我先前的信念是 P(X=k) = 1/2 k - 100,对于 k > 100。

两者都是有效的先验;第二个假设 X > 100。两者都会对 X 给出截然不同的估计:先验 1 会估计 X 大约为 1 或 2;之前的 2 将估计 X 为 100。

我建议如果您继续追查这个问题,您只需继续选择一个先前的问题。像这样的东西会很好地工作:WolframAlpha。这是支持 k > 0 且均值为 10^4 的几何分布。

于 2014-02-01T06:04:43.220 回答