
I tried searching for an answer but couldn't find what I need. Apologies if this is a duplicate question.

Suppose I have a 2D array of shape (n, n*m). I want to take the outer sum of this array with its transpose, producing an array of shape (n*m, n*m). For example, suppose I have

A = array([[1., 1., 2., 2.],
           [1., 1., 2., 2.]])

I want the outer sum of A and A.T, so that the output is:

>>> array([[2., 2., 3., 3.],
           [2., 2., 3., 3.],
           [3., 3., 4., 4.],
           [3., 3., 4., 4.]])

Note that np.add.outer does not work here, because it treats its inputs as flattened vectors. I can get the result I want with something like

np.tile(A, (2, 1)) + np.tile(A.T, (1, 2))

but this seems unreasonable when n and m are fairly large (n > 100 and m > 1000). Is it possible to write this sum using einsum? I just can't seem to figure einsum out.
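For reference, here is a small sketch of the setup using the example array above. It only illustrates the two points already made: np.add.outer returns a 4-D array rather than an (n*m, n*m) one, and the tile-based expression (written here with m in place of the hard-coded 2) reproduces the desired output.

```python
import numpy as np

# Example array from the question (n = 2, m = 2).
A = np.array([[1., 1., 2., 2.],
              [1., 1., 2., 2.]])
n = A.shape[0]
m = A.shape[1] // n

# np.add.outer pairs every element of A with every element of A.T,
# so the result is 4-D rather than (n*m, n*m).
print(np.add.outer(A, A.T).shape)   # (2, 4, 4, 2)

# The tile-based approach, written for general n and m.
tiled = np.tile(A, (m, 1)) + np.tile(A.T, (1, m))
print(tiled)
```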


2 Answers


To leverage broadcasting, we need to reshape the array to 3D, then permute the axes and add:

n = A.shape[0]
m = A.shape[1]//n
a = A.reshape(n,m,n) # reshape to 3D
out = (a[None,:,:,:] + a.transpose(1,2,0)[:,:,None,:]).reshape(n*m,-1)
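To make the axis bookkeeping easier to follow, the same steps can be wrapped in a function with the intermediate shapes spelled out. This is only a sketch: the function name `outer_sum_with_transpose` is made up for illustration, and A is assumed to have shape (n, n*m).

```python
import numpy as np

def outer_sum_with_transpose(A):
    """Broadcasted equivalent of np.tile(A, (m, 1)) + np.tile(A.T, (1, m)).

    The function name is hypothetical; A is assumed to have shape (n, n*m).
    """
    n = A.shape[0]
    m = A.shape[1] // n
    a = A.reshape(n, m, n)                      # a[q, r, s] == A[q, r*n + s]
    p1 = a[None, :, :, :]                       # shape (1, n, m, n)
    p2 = a.transpose(1, 2, 0)[:, :, None, :]    # shape (m, n, 1, n)
    # Broadcasting gives shape (m, n, m, n); merging axis pairs yields (n*m, n*m).
    return (p1 + p2).reshape(n * m, -1)

# Example array from the question (n = 2, m = 2).
A = np.array([[1., 1., 2., 2.],
              [1., 1., 2., 2.]])
out = outer_sum_with_transpose(A)
print(out.shape)   # (4, 4)
```

Writing the row index as i = p*n + q and the column index as j = r*n + s, each entry is A[q, j] + A[s, i], which is the same thing the tile-based expression computes.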

Sample run for verification:

In [359]: # Setup input array
     ...: np.random.seed(0)
     ...: n,m = 3,4
     ...: A = np.random.randint(1,10,(n,n*m))

In [360]: # Original soln
     ...: out0 = np.tile(A, (m, 1)) + np.tile(A.T, (1, m))

In [361]: # Posted soln
     ...: n = A.shape[0]
     ...: m = A.shape[1]//n
     ...: a = A.reshape(n,m,n)
     ...: out = (a[None,:,:,:] + a.transpose(1,2,0)[:,:,None,:]).reshape(n*m,-1)

In [362]: np.allclose(out0, out)
Out[362]: True

Timings for large n and m:

In [363]: # Setup input array
     ...: np.random.seed(0)
     ...: n,m = 100,100
     ...: A = np.random.randint(1,10,(n,n*m))

In [364]: %timeit np.tile(A, (m, 1)) + np.tile(A.T, (1, m))
1 loop, best of 3: 407 ms per loop

In [365]: %%timeit
     ...: # Posted soln
     ...: n = A.shape[0]
     ...: m = A.shape[1]//n
     ...: a = A.reshape(n,m,n)
     ...: out = (a[None,:,:,:] + a.transpose(1,2,0)[:,:,None,:]).reshape(n*m,-1)
1 loop, best of 3: 219 ms per loop
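The timings above use n = m = 100. For the sizes mentioned in the question (n > 100, m > 1000), it may be worth estimating how large the dense result is before trying any approach that builds it in full. A rough sketch, where the helper name `dense_output_nbytes` is made up and 8 bytes per element is an assumption:

```python
def dense_output_nbytes(n, m, itemsize=8):
    """Bytes needed to hold the full (n*m, n*m) result; 8 bytes per element assumed."""
    side = n * m
    return side * side * itemsize

print(dense_output_nbytes(100, 100) / 1e9)    # 0.8 (GB), the timing setup
print(dense_output_nbytes(100, 1000) / 1e9)   # 80.0 (GB), the sizes in the question
```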

Further performance boost with numexpr

We can leverage the multi-core capable numexpr module for large data to gain memory efficiency and performance:

import numexpr as ne

n = A.shape[0]
m = A.shape[1]//n
a = A.reshape(n,m,n)
p1 = a[None,:,:,:]
p2 = a.transpose(1,2,0)[:,:,None,:]
out = ne.evaluate('p1+p2').reshape(n*m,-1)

Timings on the same large n, m setup:

In [367]: %%timeit
     ...: # Posted soln
     ...: n = A.shape[0]
     ...: m = A.shape[1]//n
     ...: a = A.reshape(n,m,n)
     ...: p1 = a[None,:,:,:]
     ...: p2 = a.transpose(1,2,0)[:,:,None,:]
     ...: out = ne.evaluate('p1+p2').reshape(n*m,-1)
10 loops, best of 3: 152 ms per loop
answered 2018-09-13T21:19:27.507

One way is

(A.reshape(-1,*A.shape).T+A)[:,0,:]

I think this will use a lot of memory for n > 100 and m > 1000.

But isn't this the same as

np.add.outer(A,A)[:,0,:].reshape(4,-1)
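A quick check on small inputs may help settle that. This is only a sketch, and `tile_reference` is a made-up helper that wraps the tile-based formula from the question. On the question's example, where both rows of A are identical, the first expression agrees with the tile-based result but the second does not. The first also stops agreeing once the rows differ, since `[:, 0, :]` keeps only row 0 of A.

```python
import numpy as np

# Reference result via the tile-based formula from the question.
def tile_reference(A):
    n = A.shape[0]
    m = A.shape[1] // n
    return np.tile(A, (m, 1)) + np.tile(A.T, (1, m))

# Example from the question: both rows of A are identical.
A = np.array([[1., 1., 2., 2.],
              [1., 1., 2., 2.]])
first = (A.reshape(-1, *A.shape).T + A)[:, 0, :]
second = np.add.outer(A, A)[:, 0, :].reshape(4, -1)
print(np.array_equal(first, tile_reference(A)))    # True
print(np.array_equal(second, tile_reference(A)))   # False

# An input whose rows differ.
B = np.arange(8.).reshape(2, 4)
first_B = (B.reshape(-1, *B.shape).T + B)[:, 0, :]
print(np.array_equal(first_B, tile_reference(B)))  # False

# Shape of the intermediate before [:, 0, :] is applied: (n*m, n, n*m),
# i.e. n times as many elements as the final (n*m, n*m) result.
print((A.reshape(-1, *A.shape).T + A).shape)       # (4, 2, 4)
```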
answered 2018-09-13T20:42:10.413