In my current theano script, the bottleneck is the following code:

import time

import numpy as np

axis = 0
# prob has a leading singleton axis so it broadcasts against every case.
prob = np.random.random((1, 1000, 50))
cases = np.random.random((1000, 1000, 50))

start = time.time()
for i in range(1000):
    # Weight each case by prob, then sum over axis 1 (1 - axis with axis = 0).
    result = (cases * prob).sum(axis=1 - axis, keepdims=True)
print('3D naive method took {} seconds'.format(time.time() - start))
print(result.shape)

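In the 2D case, I saw that replacing the elementwise multiply plus sum with a dot product gave me a 5x speedup. Is there a matrix operation that can help me in this 3D case?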
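
For reference, the 2D replacement mentioned above can be sketched as follows. The shapes and the names `cases_2d` and `prob_2d` are made up for illustration and are not taken from the actual script; the sketch only shows that the two expressions compute the same values, not how fast either one is.

```python
import numpy as np

rng = np.random.RandomState(0)
cases_2d = rng.random_sample((200, 300))   # hypothetical (n_cases, n_features)
prob_2d = rng.random_sample((1, 300))      # one weight per feature, broadcast over rows

# Elementwise multiply, then reduce: builds a (200, 300) temporary first.
naive_2d = (cases_2d * prob_2d).sum(axis=1, keepdims=True)

# The same contraction as a matrix product: (200, 300) . (300, 1) -> (200, 1).
dot_2d = cases_2d.dot(prob_2d.T)

same_2d = np.allclose(naive_2d, dot_2d)
print(naive_2d.shape, dot_2d.shape, same_2d)
```

The dot product hands the multiply and the reduction to a single BLAS-style call, which is the likely source of the speedup in the 2D case.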

Edit

Divakar gave me an einsum-based version. However, my intent is to port this to theano, and einsum is not supported in theano. So alternatives that can be ported to theano are welcome.

1 Answer

We can use np.einsum:

result = np.einsum('ijk,ijk->ik', prob, cases)[:,None,:]

Another option is np.matmul:

result = np.matmul(prob.transpose(2,0,1), cases.T).T
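
The transposes in the np.matmul version are easier to follow with the shapes written out. Below is a small sketch that uses tiny stand-in sizes (`I`, `J`, `K` and the `_s` names are illustrative, not from the question) and checks the result against the original multiply-and-sum expression.

```python
import numpy as np

rng = np.random.RandomState(0)
I, J, K = 4, 5, 3                        # small stand-ins for 1000, 1000, 50
prob_s = rng.random_sample((1, J, K))
cases_s = rng.random_sample((I, J, K))

lhs = prob_s.transpose(2, 0, 1)          # (1, J, K) -> (K, 1, J)
rhs = cases_s.T                          # (I, J, K) -> (K, J, I)
prod = np.matmul(lhs, rhs)               # K stacked (1, J) . (J, I) products -> (K, 1, I)
out_matmul = prod.T                      # (K, 1, I) -> (I, 1, K)

# Reference: the original elementwise multiply followed by a sum over axis 1.
expected = (cases_s * prob_s).sum(axis=1, keepdims=True)
print(out_matmul.shape, np.allclose(out_matmul, expected))
```

np.matmul treats the leading axis as a stack of matrices, so moving the length-50 axis to the front turns the whole reduction into 50 independent matrix products over the summed axis.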

Runtime test:

In [70]: axis = 0
    ...: prob = np.random.random( ( 1, 1000, 50 ) )
    ...: cases = np.random.random( ( 1000, 1000, 50 ) )
    ...: 

In [71]: out1 = ( cases * prob ).sum( axis=1-axis, keepdims=True )

In [72]: out2 = np.einsum('ijk,ijk->ik', prob, cases)[:,None,:]

In [73]: out3 = np.matmul(prob.transpose(2,0,1), cases.T).T

In [74]: np.allclose(out1, out2)
Out[74]: True

In [75]: np.allclose(out1, out3)
Out[75]: True

In [76]: %timeit ( cases * prob ).sum( axis=1-axis, keepdims=True )
10 loops, best of 3: 101 ms per loop

In [77]: %timeit np.einsum('ijk,ijk->ik', prob, cases)[:,None,:]
10 loops, best of 3: 44.1 ms per loop

In [78]: %timeit np.matmul(prob.transpose(2,0,1), cases.T).T
10 loops, best of 3: 44 ms per loop
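
Since the question asks for something that does not rely on einsum, here is a hedged NumPy sketch that uses only transposes and plain 2-D dot products, one per slice along the last axis. The helper name `weighted_sum_batched` is hypothetical. The Theano mapping is an assumption and is not tested here: `dimshuffle` and `theano.tensor.batched_dot` look like the closest counterparts to the transposes and the per-slice loop.

```python
import numpy as np

def weighted_sum_batched(prob, cases):
    """Hypothetical helper: prob (1, J, K), cases (I, J, K) -> (I, 1, K)."""
    cases_kij = cases.transpose(2, 0, 1)     # (I, J, K) -> (K, I, J)
    prob_kj = prob[0].T                      # (J, K) -> (K, J)
    # One (I, J) . (J,) product per k; a batched dot performs this loop in one call.
    out_ki = np.stack([cases_kij[k].dot(prob_kj[k])
                       for k in range(cases_kij.shape[0])])   # (K, I)
    return out_ki.T[:, None, :]              # (K, I) -> (I, K) -> (I, 1, K)

# Small check against the original expression (sizes are illustrative).
rng = np.random.RandomState(1)
prob_t = rng.random_sample((1, 6, 4))
cases_t = rng.random_sample((7, 6, 4))
ref_t = (cases_t * prob_t).sum(axis=1, keepdims=True)
out_t = weighted_sum_batched(prob_t, cases_t)
print(out_t.shape, np.allclose(out_t, ref_t))
```

The Python-level loop is only there to make the per-slice structure explicit; no timing claim is made for this version.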
Answered 2017-05-05T08:35:56.150