
Why do the probability results deviate so much from the expected values?

Test code:

function probability($chances) {

    asort($chances);
    $sum    = array_sum($chances);
    $random = mt_rand(1, $sum);

    foreach($chances as $key => $chance) {
        if($random < $chance)
            return $key;
    }

    return $key;

}

$chances['case1'] = 10;
$chances['case2'] = 30;
$chances['case3'] = 60;

$result = array();

for($i = 0; $i < 100000; $i++)
    @$result[probability($chances)]++;

asort($result);
$sum = array_sum($result);

echo "Case\tCount\tOrig\tResult\n";

foreach($result as $key => $value)
    echo "$key\t$value\t".$chances[$key]."%\t".round($value / $sum * 100)."%\n";

Result:

Case    Count   Orig    Result
case1   14913   10%     15%
case2   33099   30%     33%
case3   51988   60%     52%

Is there some way to correct this? I tried using mt_srand(), but it did not help.

Info:

$ php -v
PHP 5.3.10-1ubuntu3.2 with Suhosin-Patch (cli) (built: Jun 13 2012 17:20:55) 
Copyright (c) 1997-2012 The PHP Group
Zend Engine v2.3.0, Copyright (c) 1998-2012 Zend Technologies
    with Xdebug v2.1.0, Copyright (c) 2002-2010, by Derick Rethans
    with Suhosin v0.9.33, Copyright (c) 2007-2012, by SektionEins GmbH

$ uname -a
Linux desktop 3.2.0-26-generic-pae #41-Ubuntu SMP Thu Jun 14 16:45:14 UTC 2012 i686 i686 i386 GNU/Linux

3 Answers


Your random number generation is flawed.

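First, consider removing the asort call. It does nothing useful, and it is confusing (and slow): you are sorting the array 100000 times! It would be better either to add a precondition that the array is sorted (and sort it once before the loop) or to implement an algorithm that does not need sorting.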

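Second, you need to make sure that the probability of hitting each case is correct. Because the code compares $random against each raw weight with <, these are your current probabilities:

case1: 9 % (1 <= $random <= 9)
case2: 20 % (10 <= $random <= 29)
case3: 71 % (everything that didn't match the previous cases)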
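
The split can be checked without any randomness by feeding every possible value of $random through the question's loop once. The following is only a minimal sketch: pickOriginal and $counts are names made up for this illustration, and the random draw is replaced by a parameter. With the strict < comparison exactly as written in the question, the tally comes out as 9, 20 and 71 out of 100 (with <= it would be 10, 20 and 70).

```php
<?php
// Hypothetical helper: the question's selection loop, with the random
// draw passed in as a parameter so that every value can be tried once.
function pickOriginal($chances, $random) {
    asort($chances);
    foreach ($chances as $key => $chance) {
        if ($random < $chance)   // compares against the raw weight, not a running total
            return $key;
    }
    return $key;                 // no match: falls through to the last (largest) case
}

$chances = array('case1' => 10, 'case2' => 30, 'case3' => 60);
$counts  = array('case1' => 0, 'case2' => 0, 'case3' => 0);

// Try each value that mt_rand(1, 100) could produce exactly once.
for ($random = 1; $random <= array_sum($chances); $random++)
    $counts[pickOriginal($chances, $random)]++;

print_r($counts);
```

Every draw of 60 or more falls through to the final return, which is why case3 collects far more than its intended share.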

What you really need to do is something like this:

function probability($chances) {
    $sum    = array_sum($chances);
    $random = mt_rand(1, $sum);

    $add = 0;
    foreach($chances as $key => $chance) {
        if($random <= $chance + $add)
            return $key;
        else
            $add += $chance;
    }

    return $key;
}

This will give you the expected result:

case1: 10 % (1 <= $random <= 10)
case2: 30 % (11 <= $random <= 40)
case3: 60 % (41 <= $random <= 100)
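
As a rough check of these ranges, the same cumulative comparison can be run over every value from 1 to 100 instead of over random draws. This is a sketch under stated assumptions: pickCumulative is a hypothetical variant of the function above that takes the draw as a parameter, and $shuffled is an illustrative reordering used to show that the running total does not depend on the array being sorted.

```php
<?php
// Hypothetical variant of the corrected function: the draw is a parameter.
function pickCumulative($chances, $random) {
    $add = 0;                            // running total of the weights seen so far
    foreach ($chances as $key => $chance) {
        if ($random <= $chance + $add)
            return $key;
        $add += $chance;
    }
    return $key;
}

// Count how many of the draws 1..sum land in each case.
function countAll($chances) {
    $counts = array_fill_keys(array_keys($chances), 0);
    for ($random = 1; $random <= array_sum($chances); $random++)
        $counts[pickCumulative($chances, $random)]++;
    return $counts;
}

$chances  = array('case1' => 10, 'case2' => 30, 'case3' => 60);
$shuffled = array('case3' => 60, 'case1' => 10, 'case2' => 30);

print_r(countAll($chances));   // each case receives exactly its weight
print_r(countAll($shuffled));  // same counts per case, regardless of order
```

Because each case owns a contiguous block of exactly $chance values, no sorting is needed before the loop.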
answered 2012-06-28T06:27:29.377

$sum    = max($chances);

max() does not compute a sum; use array_sum() instead.
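
To illustrate the difference, here is a small sketch. It assumes that the question's function originally drew from mt_rand(1, max($chances)), as the quoted line suggests; the question as shown here already uses array_sum(). The helper name tally is made up for the illustration. It classifies every possible draw with the question's < comparison and reports each case's share in percent.

```php
<?php
$chances = array('case1' => 10, 'case2' => 30, 'case3' => 60);

echo max($chances), "\n";        // the largest single weight
echo array_sum($chances), "\n";  // the total of all weights

// Hypothetical helper: classify every draw from 1 to $upper with the
// question's loop and return each case's share, rounded to whole percent.
function tally($chances, $upper) {
    asort($chances);
    $counts = array_fill_keys(array_keys($chances), 0);
    for ($random = 1; $random <= $upper; $random++) {
        foreach ($chances as $key => $chance) {
            if ($random < $chance)
                break;           // $key stays at the matching case
        }
        $counts[$key]++;         // without a match, $key is the last case
    }
    foreach ($counts as $key => $count)
        $counts[$key] = round($count / $upper * 100);
    return $counts;
}

print_r(tally($chances, max($chances)));
```

If that assumption holds, the draw range is only 1 to 60, and the resulting shares of 15, 33 and 52 percent line up with the table in the question.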

I got this result:

Case    Count   Orig    Result
case1   11068   10%     11%
case2   29672   30%     30%
case3   59260   60%     59%

by running this version of the code:

<?php

function probability($chances)
{
    asort($chances);
    $sum    = array_sum($chances);
    $random = mt_rand(1, $sum);

    foreach($chances as $key => $chance)
    {
        $random -= $chance;
        if($random <= 0)
        {
            return $key;
        }
    }

    return $key;
}

$chances['case1'] = 10;
$chances['case2'] = 30;
$chances['case3'] = 60;

$result = array();

for($i = 0; $i < 100000; $i++)
{
    @$result[probability($chances)]++;
}

asort($result);
$sum = array_sum($result);

echo "Case\tCount\tOrig\tResult\n";

foreach($result as $key => $value)
{
    echo "$key\t$value\t".$chances[$key]."%\t".round($value / $sum * 100)."%\n";
}
?>
answered 2012-06-28T06:07:00.063

First, the comparison inside probability is wrong: it should be <= rather than <.

That should at least make the results more consistent (i.e. 10, 20, 70).

Second, case3 is counted twice (both when nr <= 60 and when nr > 60, where nr is the random number).

I suggest this change to the code:

function probability($chances)
{
    $sum    = array_sum($chances);
    $random = mt_rand(1, $sum);

    foreach($chances as $key => $chance) {
        if ($random <= $chance) {
            return $key;
        }
    }

    return 'rest';
}

Then add a "rest" entry to the $chances array. The entries must appear in sorted order.

$chances['case1'] = 10;
$chances['case2'] = 30;
$chances['case3'] = 60;
$chances['rest'] = 'NA'; // for 60 < x <= 100

Result:

Case    Count   Orig    Result
case1   10083   10%     10%
case2   19965   30%     20%
case3   30084   60%     30%
rest    39868   NA%     40%
answered 2012-06-28T06:39:40.923