一种方法是一次性获得所有元素的总和,然后从无效元素中移除贡献。最后,我们需要得到平均值本身,除以有效元素的数量。所以,我们会有一个像这样的实现 -
def mean_ignore_num(arr,num):
# Get count of invalid ones
invc = np.count_nonzero(arr==num)
# Get the average value for all numbers and remove contribution from num
return (arr.sum() - invc*num)/float(arr.size-invc)
验证结果 -
In [191]: arr = np.full(10,2147483647).astype(np.int32)
...: arr[1] = 5
...: arr[4] = 4
...:
In [192]: arr.max()
Out[192]: 2147483647
In [193]: arr.sum() # Extends beyond int32 max limit, so no overflow
Out[193]: 17179869185
In [194]: arr[arr != 2147483647].mean()
Out[194]: 4.5
In [195]: mean_ignore_num(arr,2147483647)
Out[195]: 4.5
运行时测试 -
In [38]: arr = np.random.randint(0,9,(10000))
In [39]: arr[arr != 7].mean()
Out[39]: 3.6704609489462414
In [40]: mean_ignore_num(arr,7)
Out[40]: 3.6704609489462414
In [41]: %timeit arr[arr != 7].mean()
10000 loops, best of 3: 102 µs per loop
In [42]: %timeit mean_ignore_num(arr,7)
10000 loops, best of 3: 36.6 µs per loop