a1 = [14, 12, 11, 9, 9, 8, 8]
a2 = [12, 13, 14, 9, 9, 8]
...
...

std_dev_a1 = 2.267786838
std_dev_a2 = 2.483277404
...
...

a3 is a1 and a2 combined:

a3 = [14, 12, 11, 9, 9, 8, 8, 12, 13, 14, 9, 9, 8]
std_dev_a3 = 2.295480509

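Of course, I can't just take a weighted average: std_dev_a3 != (std_dev_a1 * 7 + std_dev_a2 * 6) / 13

My question is: can I get std_dev_a3 from std_dev_a1 and std_dev_a2 alone?

The problem came up when I wrote PHP code to compute the stddev of an array. Because the array keeps growing, it eventually exhausts memory. So I unset() the array on each iteration, and that is where the problem arises. What I keep from the previous iteration is the array's mean, its stddev, and its length. Is it possible to compute the std_dev of the new array (the old array plus the newly added array) from those?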


1 Answer


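You can't compute it exactly from the two standard deviations alone, because the standard deviation formula uses the difference between each element and the mean, and the combined array has a different mean than a1 or a2.

But the following formula (the pooled standard deviation) gives a good approximation when the two means are close: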

pooled_var = ((n1 - 1)*pow(std_dev_a1, 2) + (n2 - 1)*pow(std_dev_a2, 2)) / (n1 + n2 - 2)
std_dev_a3 = sqrt(pooled_var)
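Since the question says the mean and length of the old array are saved along with its stddev, the combined value can also be recovered exactly rather than approximated. The sketch below is one way to do that, assuming both inputs are sample standard deviations (the n - 1 form used elsewhere in this answer) and that the combined length is at least 2. The function name combine_stats and its parameter names are hypothetical, not from the question.

```php
<?php
// Hypothetical helper: merges two summaries (count, mean, sample std dev)
// into the summary of the concatenated array, without the raw data.
function combine_stats(int $n1, float $mean1, float $sd1,
                       int $n2, float $mean2, float $sd2): array
{
    $n = $n1 + $n2;
    $mean = ($n1 * $mean1 + $n2 * $mean2) / $n;

    // Sum of squared deviations of each group around its own mean,
    // plus a correction term for the gap between the two group means.
    $ss = ($n1 - 1) * $sd1 ** 2
        + ($n2 - 1) * $sd2 ** 2
        + ($n1 * $n2 / $n) * ($mean1 - $mean2) ** 2;

    // Assumes $n >= 2; otherwise the sample std dev is undefined.
    return ['n' => $n, 'mean' => $mean, 'std_dev' => sqrt($ss / ($n - 1))];
}

// Summaries of a1 and a2 from the question.
$a3 = combine_stats(7, 71 / 7, 2.267786838, 6, 65 / 6, 2.483277404);
```

With the numbers from the question, $a3['std_dev'] agrees with std_dev_a3 = 2.295480509 to the precision of the inputs, while the pooled formula gives roughly 2.37 because it ignores the difference between the two means. The returned array can be fed back in as the "old" summary on the next iteration.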

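You mentioned that you took this approach because you were running out of memory.

You can avoid the memory problem by storing the data in a frequency table: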

{[8] => 3, [9] => 4, ..., [14] => 2}
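As a rough sketch of how such a table could be filled batch by batch, so that each batch can be unset() once it has been counted: the helper name add_to_freq is hypothetical, and the approach assumes the values are integers (or strings), since PHP array keys cannot hold floats without truncation.

```php
<?php
// Hypothetical helper: adds one batch of values to the running frequency table.
function add_to_freq(array &$freq, array $batch): void
{
    foreach ($batch as $value) {
        // Start missing keys at 0, then count this occurrence.
        $freq[$value] = ($freq[$value] ?? 0) + 1;
    }
}

$freq = [];
add_to_freq($freq, [14, 12, 11, 9, 9, 8, 8]); // a1 from the question
add_to_freq($freq, [12, 13, 14, 9, 9, 8]);    // a2; each batch can be discarded afterwards
ksort($freq);                                 // sort by value for readability only
```

Memory use then grows with the number of distinct values rather than with the number of elements, which only helps if values repeat often.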

With this data structure, you can compute the standard deviation:

// This should be provided by your data
$freq = array(8 => 3, 9 => 4, 11 => 1, 12 => 2, 13 => 1, 14 => 2);

// Calculate mean
$mean = 0;
$n = 0;

foreach ($freq as $value => $count) {
  $mean += $value * $count;
  $n += $count;
}

$mean = $mean / $n;

// Calculate std dev
$std_dev = 0;

foreach ($freq as $value => $count) {
  $std_dev += ($count * pow($value - $mean, 2));
}

$std_dev = sqrt($std_dev/($n - 1));
answered 2013-06-06T15:06:53.190