I have an array, and I want to produce a smaller array by scanning non-overlapping 2x2 windows and taking the maximum of each. Here is an example:

import numpy as np

np.random.seed(123)
np.set_printoptions(linewidth=1000,precision=3)
arr = np.random.uniform(-1,1,(4,4))
res = np.zeros((2,2))
for i in range(res.shape[0]):
    for j in range(res.shape[1]):
        # Top-left corner of the 2x2 window in arr
        ii = i*2
        jj = j*2
        res[i][j] = max(arr[ii][jj],arr[ii+1][jj],arr[ii][jj+1],arr[ii+1][jj+1])

print(arr)
print(res)

So a matrix like this:

[[ 0.393 -0.428 -0.546  0.103]
 [ 0.439 -0.154  0.962  0.37 ]
 [-0.038 -0.216 -0.314  0.458]
 [-0.123 -0.881 -0.204  0.476]]

should become this:

[[ 0.439  0.962]
 [-0.038  0.476]]    

How can I do this more efficiently?

2 Answers

You can do this:

print(arr.reshape(2,2,2,2).swapaxes(1,2).reshape(2,2,4).max(axis=-1))

[[ 0.439  0.962]
 [-0.038  0.476]]

To explain, start with:

arr = np.array([[ 0.393, -0.428, -0.546,  0.103],
                [ 0.439, -0.154,  0.962,  0.37 ],
                [-0.038, -0.216, -0.314,  0.458],
                [-0.123, -0.881, -0.204,  0.476]])

We first want to group the axes into relevant sections.

tmp = arr.reshape(2,2,2,2).swapaxes(1,2)
print(tmp)

[[[[ 0.393 -0.428]
   [ 0.439 -0.154]]

  [[-0.546  0.103]
   [ 0.962  0.37 ]]]


 [[[-0.038 -0.216]
   [-0.123 -0.881]]

  [[-0.314  0.458]
   [-0.204  0.476]]]]

Reshape once more to obtain the groups of data we want:

tmp = tmp.reshape(2,2,4)
print(tmp)

[[[ 0.393 -0.428  0.439 -0.154]
  [-0.546  0.103  0.962  0.37 ]]

 [[-0.038 -0.216 -0.123 -0.881]
  [-0.314  0.458 -0.204  0.476]]]

Finally, take the max along the last axis with .max(axis=-1).
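As a minimal sketch of that last step, reusing the 4x4 example values from this answer (the variable names are only illustrative):

```python
import numpy as np

arr = np.array([[ 0.393, -0.428, -0.546,  0.103],
                [ 0.439, -0.154,  0.962,  0.37 ],
                [-0.038, -0.216, -0.314,  0.458],
                [-0.123, -0.881, -0.204,  0.476]])

# After these reshapes, each row of the last axis holds
# the four values of one 2x2 window.
tmp = arr.reshape(2, 2, 2, 2).swapaxes(1, 2).reshape(2, 2, 4)

# Reduce over the last axis to get one maximum per window.
res = tmp.max(axis=-1)
print(res)
```

The printed array should match the 2x2 result from the question.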

This can be generalized, for square matrices, to:

k = arr.shape[0] // 2   #Integer division; the side length must be even
arr.reshape(k,2,k,2).swapaxes(1,2).reshape(k,k,4).max(axis=-1)

Following the comments of Jamie and Dougal we can generalize this further:

n = 2                   #Height of window
m = 2                   #Width of window
k = arr.shape[0] // n   #Must divide evenly
l = arr.shape[1] // m   #Must divide evenly
arr.reshape(k,n,l,m).max(axis=(-1,-3))              #Numpy >= 1.7.1
arr.reshape(k,n,l,m).max(axis=-3).max(axis=-1)      #Numpy <  1.7.1
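The generalized version could be wrapped in a small helper along these lines. This is only a sketch: the name block_max and the explicit divisibility check are assumptions added for illustration, and it relies on a NumPy version that accepts a tuple for axis.

```python
import numpy as np

def block_max(arr, n, m):
    """Maximum over non-overlapping n-by-m windows of a 2-D array."""
    rows, cols = arr.shape
    if rows % n or cols % m:
        raise ValueError("window shape must divide the array shape evenly")
    k, l = rows // n, cols // m
    # Axes after the reshape:
    # (block row, row inside block, block column, column inside block).
    return arr.reshape(k, n, l, m).max(axis=(1, 3))

example = np.arange(24).reshape(4, 6)
print(block_max(example, 2, 3))
```

Reducing over axes 1 and 3 is the same as reducing over axes -3 and -1, so no swapaxes call is needed here.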
answered 2013-09-05T20:18:43.733
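As I mentioned in the comments, consider using Numba. You can leave your double loop as it is, add roughly 10 characters in the form of a decorator, and get C-like performance. It is easy to use out of the box if you use Continuum Analytics' "Anaconda" Python distribution.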

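This is an almost perfect use case for Numba, because this kind of algorithm is expressed more naturally as a double loop. The reshape approach takes advantage of fast array operations, but it is quite unreadable unless you already know what the code is meant to do. It is highly desirable to keep a function like this in its expanded form and get the speed by having something else translate it to a low-level language after the fact.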
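A minimal sketch of that idea follows. The answer does not say which decorator to use, so njit from a current Numba release is an assumption here, and max_pool_2x2 is a hypothetical name. The fallback only keeps the sketch runnable (at plain Python speed) when Numba is not installed.

```python
import numpy as np

try:
    from numba import njit  # assumption: a current Numba release is installed
except ImportError:
    def njit(func):
        # Fallback: run the plain Python loop when Numba is unavailable.
        return func

@njit
def max_pool_2x2(arr):
    # One output cell per non-overlapping 2x2 window.
    rows, cols = arr.shape[0] // 2, arr.shape[1] // 2
    res = np.empty((rows, cols), dtype=arr.dtype)
    for i in range(rows):
        for j in range(cols):
            ii, jj = 2 * i, 2 * j
            res[i, j] = max(arr[ii, jj], arr[ii + 1, jj],
                            arr[ii, jj + 1], arr[ii + 1, jj + 1])
    return res

sample = np.array([[ 0.393, -0.428, -0.546,  0.103],
                   [ 0.439, -0.154,  0.962,  0.37 ],
                   [-0.038, -0.216, -0.314,  0.458],
                   [-0.123, -0.881, -0.204,  0.476]])
print(max_pool_2x2(sample))
```

With Numba present, the first call includes compilation time, so any timing comparison should use a second call.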

answered 2013-09-05T20:29:29.670