3

假设以下实验:执行相同的伯努利试验(成功概率为 P)N 次

我需要以下信息:所有可能的成功/失败序列及其发生概率。

示例:成功概率 P = 40% 执行 3 次的伯努利实验将产生以下结果(S 为成功,F 为失败):

FFF 0.216

SFF 0.144

FSF 0.144

SSF 0.096

FFS 0.144

SFS 0.096

FSS 0.096

不锈钢 0.064

我试图暴力破解它以获得结果,但它仅在 N = 25 的情况下迅速窒息,我得到了 OutOfMemoryException ...

using System;
using System.Linq;
using System.Collections.Generic;
using System.Text.RegularExpressions;

namespace ConsoleApplication
{
    class Program
    {
        static Dictionary<string, double> finalResultProbabilities = new Dictionary<string, double>();

        static void Main(string[] args)
        {
            // OutOfMemoryException if I set it to 25 :(
            //var nbGames = 25;
            var nbGames = 3;
            var probabilityToWin = 0.4d;

            CalculateAverageWinningStreak(string.Empty, 1d, nbGames, probabilityToWin);

            // Do something with the finalResultProbabilities data...
        }

        static void CalculateAverageWinningStreak(string currentResult, double currentProbability, int nbGamesRemaining, double probabilityToWin)
        {
            if (nbGamesRemaining == 0)
            {
                finalResultProbabilities.Add(currentResult, currentProbability);
                return;
            }

            CalculateAverageWinningStreak(currentResult + "S", currentProbability * probabilityToWin, nbGamesRemaining - 1, probabilityToWin);
            CalculateAverageWinningStreak(currentResult + "F", currentProbability * (1 - probabilityToWin), nbGamesRemaining - 1, probabilityToWin);
        }
    }
}

我需要能够及时支持最多 N = 3000(任何 P 不到 3 秒即可获得结果)

有没有一种数学方法可以最佳地做到这一点?

4

2 回答 2

2

由于您对最长连胜的长度感兴趣,因此在试验的任何时候,历史上只有两个相关事实:(1)迄今为止最长的连胜有多长时间(m)和(2)如何目前的连胜纪录是 (k)。初始状态为 (0, 0)。转换是 (m, k) -> (max(m, k + 1), k + 1) 获胜, (m, k) -> (m, 0) 失败。一旦我们知道所有最终状态的概率,我们就可以进行平均。

编辑:此代码的修订版本进行了优化,以减少非常不可能的事件所需的计算。具体来说,我们忽略长度大于或等于 some 的所有状态cutoff。给定一个绝对误差预算abserr,我们确定我们可以abserr / n从最终的期望值计算中排除概率质量,因为最长的连续可能是n。开始连胜的地方少于n,并且在每个地方,连胜的概率cutoffpwin**cutoff。使用粗略的联合界限,我们需要

n * pwin**cutoff <= abserr / n
pwin**cutoff <= abserr / n**2
log(pwin) * cutoff <= log(abserr / n**2)
cutoff >= log(abserr / n**2, pwin),

最后一个不等式反转方向,因为log(pwin) < 0.

尽管评估器质量很差(即 2014 年硬件上的 Python 3 解释器),但修改后的代码在不到三秒的时间内运行。

import math


def avglongwinstreak(n, pwin, abserr=0):
    if n > 0 and pwin < 1 and abserr > 0:
        cutoff = math.log(abserr / n**2, pwin)
    else:
        cutoff = n + 1
    dist = {(0, 0): 1}
    for i in range(n):
        nextdist = {}
        for (m, k), pmk in dist.items():
            winkey = (max(m, k + 1), k + 1)
            if winkey[0] < cutoff:
                nextdist[winkey] = nextdist.get(winkey, 0) + pmk * pwin
            losskey = (m, 0)
            nextdist[losskey] = nextdist.get(losskey, 0) + pmk * (1 - pwin)
        dist = nextdist
    return sum(m * pmk for ((m, k), pmk) in dist.items())


print(avglongwinstreak(3000, 0.4, 1e-6))
于 2017-02-09T15:11:51.207 回答
1

这是一种不同的方法,它足够精确和快速,只是二次的。最长连胜的期望值等于

 n
sum Pr(there exists a win streak of length at least k).
k=1

我们对概率的推理如下。记录要么以k连续获胜(概率pwin**k)开始,要么j以某些人获胜j in 0..k-1然后失败(概率pwin**j * (1 - pwin))开始,在这种情况下,概率等于在尝试中k连续获胜的概率。n - (j + 1)我们使用 memoization 来评估这个逻辑在 中所暗示的递归pwinstreak;更快的版本fastpwinstreak使用代数来避免重复求和。

def avglongwinstreak(n, pwin):
    return sum(fastpwinstreak(n, pwin, k) for k in range(1, n + 1))


def pwinstreak(n, pwin, k):
    memo = [0] * (n + 1)
    for m in range(k, n + 1):
        memo[m] = pwin**k + sum(pwin**j * (1 - pwin) * memo[m - (j + 1)]
                                for j in range(k))
    return memo[n]


def fastpwinstreak(n, pwin, k):
    pwink = pwin**k
    memo = [0] * (n + 1)
    windowsum = 0
    for m in range(k, n + 1):
        memo[m] = pwink + windowsum
        windowsum = pwin * windowsum + (1 - pwin) * (memo[m] - pwink *
                                                     memo[m - k])
    return memo[n]


print(avglongwinstreak(3000, 0.4))

允许错误的版本:

def avglongwinstreak(n, pwin, abserr=0):
    avg = 0
    for k in range(1, n + 1):
        p = fastpwinstreak(n, pwin, k)
        avg += p
        if (n - k) * p < abserr:
            break
    return avg


def fastpwinstreak(n, pwin, k):
    pwink = pwin**k
    memo = [0] * (n + 1)
    windowsum = 0
    for m in range(k, n + 1):
        memo[m] = pwink + windowsum
        windowsum = pwin * windowsum + (1 - pwin) * (memo[m] - pwink *
                                                     memo[m - k])
    return memo[n]


print(avglongwinstreak(3000, 0.4, 1e-6))
于 2017-02-09T23:17:11.897 回答