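I have a large array, but its structure is similar to:
[[ 0  1  2  3  4]
 [ 5  6  7  8  9]
 [10 11 12 13 14]
 [15 16 17 18 19]]
What is the best, most efficient way to take a rolling average over 5 elements without flattening the array? I.e.
value one would be (0+1+2+3+4)/5=2
value two would be (1+2+3+4+5)/5=3
value three would be (2+3+4+5+6)/5=4
Thanks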
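The "best" way to do this is probably to pass a view of the array to uniform_filter. I'm not sure whether this violates your "can't flatten the array" constraint, but unless the array is reshaped in some way, any such approach will be much slower than the following: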
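import numpy as np
from scipy.ndimage import uniform_filter  # the scipy.ndimage.filters namespace is deprecated

arr = np.array([[0, 1, 2, 3, 4],
                [5, 6, 7, 8, 9],
                [10, 11, 12, 13, 14],
                [15, 16, 17, 18, 19]])
# ravel() returns a view for a contiguous array; [2:-2] drops the windows that touch the padded edges.
avg = uniform_filter(arr.ravel(), size=5)[2:-2]
print(avg)
[ 2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17]
print(arr.shape)  # Original array is not changed.
(4, 5)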
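Two details of the snippet above can be checked directly. This is a minimal sketch, assuming a C-contiguous integer array like the one in the question: `ravel()` should share memory with the original array rather than copy it, and since `uniform_filter` by default produces output of the input's dtype, casting to float first is one way to avoid truncated averages when a window sum is not divisible by 5. The names `flat` and `avg_float` are only illustrative.

```python
import numpy as np
from scipy.ndimage import uniform_filter

arr = np.arange(20).reshape(4, 5)

# For a C-contiguous array, ravel() returns a view, not a copy.
flat = arr.ravel()
shares = np.shares_memory(arr, flat)

# Cast to float so the averages are not stored back into an integer array.
# astype() does make a copy; the original array is left untouched.
avg_float = uniform_filter(flat.astype(float), size=5)[2:-2]

print(shares)
print(avg_float)
```

If the array is not contiguous (for example, a transposed or sliced view), `ravel()` has to copy, so the memory check is worth running on real data.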
The "best" solution will depend on the reason why you don't want to flatten the array in the first place. If the data are contiguous in memory, using stride tricks is an efficient way to compute a rolling average without explicitly flattening the array:
In [1]: a = np.arange(20).reshape((4, 5))
In [2]: a
Out[2]:
array([[ 0, 1, 2, 3, 4],
[ 5, 6, 7, 8, 9],
[10, 11, 12, 13, 14],
[15, 16, 17, 18, 19]])
In [3]: from numpy.lib.stride_tricks import as_strided
In [4]: s = a.dtype.itemsize
In [5]: aa = as_strided(a, shape=(16,5), strides=(s, s))
In [6]: np.average(aa, axis=1)
Out[6]:
array([ 2., 3., 4., 5., 6., 7., 8., 9., 10., 11., 12.,
13., 14., 15., 16., 17.])
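On NumPy 1.20 or later, `sliding_window_view` builds the same overlapping windows with bounds checking, which avoids hand-computing strides. The following is a sketch under the same assumption as above (contiguous data, so `reshape(-1)` is a view); the names `windows` and `rolling` are illustrative.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

a = np.arange(20).reshape((4, 5))

# reshape(-1) returns a view for a C-contiguous array, so no data is copied here.
# sliding_window_view returns a read-only view of shape (16, 5).
windows = sliding_window_view(a.reshape(-1), 5)

# Average each 5-element window.
rolling = windows.mean(axis=1)
print(rolling)
```

Unlike a raw `as_strided` call, a wrong window size here raises an error instead of silently reading memory outside the array.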
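In plain Python, index arithmetic does the job: integer division picks the row and the remainder picks the column.
your_lists = [[0, 1, 2, 3, 4],
              [5, 6, 7, 8, 9],
              [10, 11, 12, 13, 14],
              [15, 16, 17, 18, 19]]
for i in range(16):
    total = 0
    for j in range(5):
        # your_lists[m][n] is the nth element of your mth list (zero-indexed)
        total += your_lists[(i + j) // 5][(i + j) % 5]
    print(i, total / 5)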
You can probably do better by not re-adding all five numbers every time.
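One way to avoid re-adding every element is to keep a running sum and, at each step, add the element entering the window and subtract the one leaving it. The sketch below assumes a non-empty, rectangular list of lists; `rolling_means` and `at` are hypothetical helper names, not part of the original answer.

```python
def rolling_means(rows, window=5):
    """Rolling mean over the row-major order of a rectangular list of lists."""
    width = len(rows[0])
    total = len(rows) * width

    def at(k):
        # Map a flat position k to (row, column) without flattening the data.
        r, c = divmod(k, width)
        return rows[r][c]

    if total < window:
        return []

    # Sum the first window once, then update it incrementally.
    running = sum(at(k) for k in range(window))
    means = [running / window]
    for k in range(window, total):
        running += at(k) - at(k - window)
        means.append(running / window)
    return means


rows = [[0, 1, 2, 3, 4],
        [5, 6, 7, 8, 9],
        [10, 11, 12, 13, 14],
        [15, 16, 17, 18, 19]]
print(rolling_means(rows)[:3])  # [2.0, 3.0, 4.0]
```

Each output value then costs two lookups instead of five, at the price of accumulating rounding error when the data are floats.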
Note: this answer is not numpy-specific.
This would be simpler if the list of lists could be flattened.
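from itertools import tee

def moving_average(list_of_list, window_size):
    # Lazily walk the nested lists in row-major order without building a flat copy.
    nums = (n for l in list_of_list for n in l)
    # it1 leads the window; it2 trails it by window_size elements.
    it1, it2 = tee(nums)
    window_sum = 0
    for _ in range(window_size):
        window_sum += next(it1)
    yield window_sum / window_size
    for n in it1:
        # Slide the window: add the new element, drop the oldest one.
        window_sum += n
        window_sum -= next(it2)
        yield window_sum / window_size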