python - Nested loops with array indexing in numpy

Question

I would like to know how to do the following:

def GetFlux(self, time):
    bx = self.GetField("bx", time) * self.wpewce
    by = self.GetField("by", time) * self.wpewce
    bz = self.GetField("bz", time) * self.wpewce              

    # order='F' is the supported spelling for column-major (Fortran) layout
    flux = np.zeros((self.ncells[0]+1, self.ncells[1]+1), "float32", order='F')
    flux2 = np.zeros((self.ncells[0]+1, self.ncells[1]+1), "float32", order='F')

    dx = self.dl[0]
    dz = self.dl[1]

    nx = self.ncells[0]
    nz = self.ncells[1]

    j = 0

    for i in np.arange(1, nx):
        flux2[i,0] = flux2[i-1,0] + bz[i-1,0]*dx
    flux[1:,0] = flux[0,0] + np.cumsum(bz[:-1,0]*dx)

    for j in np.arange(1,nz):
        flux2[0,j] = flux2[0,j-1] - bx[0,j-1]*dz
    flux[0,1:] = flux[0,0] - np.cumsum(bx[0,:-1]*dz)

    for i in np.arange(1,nx):
        for j in np.arange(1,nz):
            flux2[i,j] = 0.5*(flux2[i-1,j] + bz[i-1,j]*dx) + 0.5*(flux2[i,j-1] - bx[i,j-1]*dz)

    return flux2

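but without the two nested loops, which take a very long time. bx, bz and flux are arrays of the same size.

I have managed to replace the first two single loops with array indexing and cumsum, but I cannot figure out how to replace the nested loop.

Any ideas?

Thanks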

score 1 · Accepted Answer

Vectorizing the inner loop is fairly straightforward. You have a basic equation that looks like this:

X[n] = a * X[n-1] + b[n]

This equation can be expanded and rewritten so that it no longer depends on X[n-1]:

X[n] = a^n * X[0] + a^(n-1) * b[1] + a^(n-2) * b[2] + ... + a^0 * b[n]
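As a quick sanity check of that expansion, here is a minimal sketch that compares the step-by-step recurrence with the closed form. The function names `recurrence` and `closed_form` are made up for this illustration, and the forcing terms are stored in a zero-based array, so element `k` of the array is the term that enters at step `k+1`.

```python
import numpy as np

def recurrence(a, x0, b):
    """Iterate X[n] = a*X[n-1] + (term for step n), for n = 1..len(b)."""
    x = np.empty(len(b) + 1)
    x[0] = x0
    for n in range(1, len(b) + 1):
        x[n] = a * x[n - 1] + b[n - 1]   # b[n-1] is the term entering at step n
    return x

def closed_form(a, x0, b):
    """Same sequence, computed from the expanded sum without a Python loop."""
    m = len(b)
    n = np.arange(1, m + 1)              # steps 1..m
    k = np.arange(m)                     # zero-based index into b
    expo = n[:, None] - 1 - k[None, :]   # power of a applied to b[k] inside X[n]
    # Terms with a negative exponent belong to future steps and must not contribute.
    coeff = np.where(expo >= 0, a ** np.maximum(expo, 0), 0.0)
    x = np.empty(m + 1)
    x[0] = x0
    x[1:] = a ** n * x0 + coeff @ b
    return x
```

The coefficient matrix is lower triangular with m*m entries, so memory grows quadratically with the length of the row.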

So if the original code looks like this:

for i in np.arange(1,nx+1):
    for j in np.arange(1,nz+1):
        flux2[i,j] = 0.5*(flux2[i-1,j] + bz[i-1,j]*dx) \
                   + 0.5*(flux2[i,j-1] - bx[i,j-1]*dz)

you can get rid of the inner loop like this:

a = 0.5
aexp = np.arange(nz).reshape(nz, 1) - np.arange(nz).reshape(1, nz)
abcoeff = a**aexp
abcoeff[aexp<0] = 0
for i in np.arange(1,nx+1):
    b = 0.5*flux2[i-1, 1:] + 0.5*bz[i-1, 1:]*dx - 0.5*bx[i,:-1]*dz
    bvals = (abcoeff * b.reshape(1, nz)).sum(axis=1)
    n = np.arange(1, nz+1)
    x0 = flux2[i, 0]
    flux2[i, 1:] = a**n * x0 + bvals

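Because of floating-point error the values will not be exactly identical, but they will be close enough. In theory you could apply the same procedure to get rid of both loops, but it becomes quite complicated and, depending on the shape of your arrays, may not give much of a performance benefit.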
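The row recurrence X[n] = a * X[n-1] + b[n] is also a first-order recursive filter, so one possible variation (not part of the answer above) is to let `scipy.signal.lfilter` run it for each row. The sketch below assumes that `flux`, `bx` and `bz` are float64 arrays of the same shape and that the first row and first column of `flux` are already filled in; the function names are invented for the example.

```python
import numpy as np
from scipy.signal import lfilter

def fill_interior_loops(flux, bx, bz, dx, dz):
    """Reference version: the plain double loop from the question."""
    out = flux.copy()
    for i in range(1, out.shape[0]):
        for j in range(1, out.shape[1]):
            out[i, j] = 0.5*(out[i-1, j] + bz[i-1, j]*dx) \
                      + 0.5*(out[i, j-1] - bx[i, j-1]*dz)
    return out

def fill_interior_lfilter(flux, bx, bz, dx, dz, a=0.5):
    """Loop over rows only; each row is solved as y[n] = a*y[n-1] + b[n]."""
    out = flux.copy()
    for i in range(1, out.shape[0]):
        # Everything that does not depend on the current row's earlier values.
        b = a*(out[i-1, 1:] + bz[i-1, 1:]*dx) - a*bx[i, :-1]*dz
        # zi carries the contribution of the known boundary value out[i, 0].
        zi = np.array([a * out[i, 0]])
        y, _ = lfilter([1.0], [1.0, -a], b, zi=zi)
        out[i, 1:] = y
    return out
```

Unlike the explicit coefficient matrix, this does not build an nz-by-nz array or raise `a` to large negative powers.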

score 1 · Accepted Answer

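It is possible to (ab)use scipy.ndimage.convolve for this kind of problem. Using one of the filter methods in scipy might also work and would be better, because it would not rely on scipy.ndimage.convolve working in place (I can imagine that behavior changing in the distant future). (Edit: I first wrote scipy.signal.convolve, which, like numpy.convolve, cannot do this.)

The trick is that this convolve function can be used in place, so the double for loop: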

for i in range(1, flux.shape[0]):
    for j in range(1, flux.shape[1]):
        flux[i,j] = 0.5*(flux[i-1,j] + bz[i-1,j]*dx) + 0.5*(flux[i,j-1] - bx[i,j-1]*dz)

can be replaced with (sorry for needing so many temporary arrays...):

from scipy.ndimage import convolve
_flux = np.zeros((flux.shape[0]+1, flux.shape[1]+1), dtype=flux.dtype)
temp_bx = np.zeros((bx.shape[0]+1, bx.shape[1]+1), dtype=bx.dtype)
temp_bz = np.zeros((bz.shape[0]+1, bz.shape[1]+1), dtype=bz.dtype)

_flux[:-1,:-1] = flux
convolve(_flux[:-1,:-1], [[0, 0.5], [0.5, 0]], _flux[1:,1:])

temp_bz[1:,1:-1] = bz[:,1:]*dx
temp_bx[1:-1,1:] = bx[1:,:]*dz

conv_b = np.array([[0.0, 0.5], [0.5, 0.5]])
convolve(temp_bz[:-1,:-1], [[0.5, 0.5], [0.5, 0.]], temp_bz[1:,1:])
convolve(temp_bx[:-1,:-1], [[-0.5, 0.5], [0.5, 0.]], temp_bx[1:,1:])

flux = _flux[:-1,:-1] + temp_bz[:-1,:-1] + temp_bx[:-1,:-1]

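Unfortunately this means we need to work out how bx and bz enter the final result, but this approach avoids generating large powers of 2 and should be noticeably faster than the previous answer.

(Note that the numpy convolve function does not allow this kind of in-place use.)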
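For comparison, here is a different technique that does not depend on in-place convolution at all; it is not taken from this answer. Each cell (i, j) only needs its neighbors (i-1, j) and (i, j-1), which both lie on the previous anti-diagonal i + j - 1, so all cells on one anti-diagonal can be updated in a single vectorized step. This is a sketch under the assumption that all three arrays share one shape and that the first row and column of `flux` are already set; the function names are hypothetical.

```python
import numpy as np

def fill_interior_loops(flux, bx, bz, dx, dz):
    """Reference version: the plain double loop."""
    out = flux.copy()
    for i in range(1, out.shape[0]):
        for j in range(1, out.shape[1]):
            out[i, j] = 0.5*(out[i-1, j] + bz[i-1, j]*dx) \
                      + 0.5*(out[i, j-1] - bx[i, j-1]*dz)
    return out

def fill_interior_wavefront(flux, bx, bz, dx, dz):
    """One Python loop over anti-diagonals k = i + j instead of two nested loops."""
    out = flux.copy()
    n0, n1 = out.shape
    for k in range(2, n0 + n1 - 1):
        # Interior cells on this diagonal: 1 <= i <= n0-1 and 1 <= j = k-i <= n1-1.
        i = np.arange(max(1, k - (n1 - 1)), min(n0 - 1, k - 1) + 1)
        j = k - i
        # The right-hand side reads only diagonal k-1, which is already final.
        out[i, j] = 0.5*(out[i-1, j] + bz[i-1, j]*dx) \
                  + 0.5*(out[i, j-1] - bx[i, j-1]*dz)
    return out
```

The arithmetic per cell is the same as in the double loop, so no powers of `a` are formed; the number of Python-level iterations drops from roughly nx*nz to nx+nz.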

score 0 · Accepted Answer

The approach using convolve is nice, but how the stencils are constructed is not obvious... If this line

flux[i,j] = 0.5*(flux[i-1,j] + bz[i-1,j]*dx) + 0.5*(flux[i,j-1] - bx[i,j-1]*dz)

is replaced with

flux[i,j] = a*flux[i-1,j] + b*bz[i-1,j] + c*flux[i,j-1] - d*bx[i,j-1]

I think the first convolution (on _flux) would use the stencil [[0, a], [b, 0]]. But, assuming a, b, c and d are scalars, what would the other 2 stencils, for bz and bx, be?
