theano - Using Theano.scan with multidimensional arrays

Question

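To speed up my code, I am converting a multidimensional sumproduct function from Python to Theano. My Theano code produces the same result, but it only computes the result for one dimension at a time, so I have to use a Python for loop to get the final result. I assume this makes the code slow, because Theano cannot optimize memory usage and transfers (for the GPU) across multiple function calls. Or is this a wrong assumption?

So how can I change the Theano code so that sumprod is computed in a single function call?

The original Python function: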

def sumprod(a1, a2):
    """Sum the element-wise products of the `a1` and `a2`."""
    result = numpy.zeros_like(a1[0])
    for i, j in zip(a1, a2):
        result += i*j
    return result

For the following input

a1 = ([1, 2, 4], [5, 6, 7])
a2 = ([1, 2, 4], [5, 6, 7])

the output would be: [ 26. 40. 65.], that is 1*1 + 5*5, 2*2 + 6*6 and 4*4 + 7*7

The Theano version of the code:

import theano
import theano.tensor as T
import numpy

a1 = ([1, 2, 4], [5, 6, 7])
a2 = ([1, 2, 4], [5, 6, 7])

# wanted result:  [ 26.  40.  65.]
# that is 1*1 + 5*5, 2*2 + 6*6 and 4*4 + 7*7

Tk = T.iscalar('Tk')
Ta1_shared = theano.shared(numpy.array(a1).T)
Ta2_shared = theano.shared(numpy.array(a2).T)

outputs_info = T.as_tensor_variable(numpy.asarray(0, 'float64'))

Tsumprod_result, updates = theano.scan(fn=lambda Ta1_shared, Ta2_shared, prior_value: 
                                       prior_value + Ta1_shared * Ta2_shared,
                                       outputs_info=outputs_info,
                                       sequences=[Ta1_shared[Tk], Ta2_shared[Tk]])
Tsumprod_result = Tsumprod_result[-1]

Tsumprod = theano.function([Tk], outputs=Tsumprod_result)

result = numpy.zeros_like(a1[0])
for i in range(len(a1[0])):
    result[i] = Tsumprod(i)
print(result)

score 7 · Accepted Answer

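First, more people will answer your questions on the theano mailing list than on stackoverflow. But I'm here :)

Second, your function is not a good fit for the GPU. Even if everything were well optimized, transferring the inputs to the GPU just to multiply them and sum the result would take more time than running the Python version.

Your Python code is slow; here is a version that should be faster: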

def sumprod(a1, a2):
    """Sum the element-wise products of the `a1` and `a2`."""
    a1 = numpy.asarray(a1)
    a2 = numpy.asarray(a2)
    result = (a1 * a2).sum(axis=0)
    return result
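
As a quick sanity check, the sketch below runs a loop version and a vectorized version side by side on the inputs from the question. The names `sumprod_loop` and `sumprod_vectorized` are only illustrative, and the loop here converts each row to an array and accumulates in float64, which is an assumption made to match the float output shown in the question.

```python
import numpy

def sumprod_loop(a1, a2):
    """Loop version: add up the row-by-row products one row at a time."""
    # Assumption: accumulate in float64 to match the question's float output.
    result = numpy.zeros(len(a1[0]), dtype='float64')
    for i, j in zip(a1, a2):
        result += numpy.asarray(i) * numpy.asarray(j)
    return result

def sumprod_vectorized(a1, a2):
    """Vectorized version: one element-wise product, one sum over rows."""
    a1 = numpy.asarray(a1)
    a2 = numpy.asarray(a2)
    return (a1 * a2).sum(axis=0)

a1 = ([1, 2, 4], [5, 6, 7])
a2 = ([1, 2, 4], [5, 6, 7])

loop_result = sumprod_loop(a1, a2)
vec_result = sumprod_vectorized(a1, a2)
print(loop_result)
print(vec_result)
```

Both calls should give 26, 40 and 65, that is 1*1 + 5*5, 2*2 + 6*6 and 4*4 + 7*7.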

For the theano code, here is the equivalent of this faster python version (no scan needed):

m1 = theano.tensor.matrix()
m2 = theano.tensor.matrix()
f = theano.function([m1, m2], (m1 * m2).sum(axis=0))

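The idea to take away from this is that you need to "vectorize" your code. "Vectorize" is the term used in the NumPy context: it means working with numpy.ndarray and using functions that operate on the full tensor at once. This is always faster than doing it with a loop (a python loop or a theano scan). Also, Theano optimizes some of these cases by moving computation outside the scan, but it does not always do so.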
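
To make the vectorization point concrete, here is a rough NumPy-only sketch (no Theano involved) that times a Python loop against the single whole-array expression. The array shape, the seed and the function names `with_loop` and `vectorized` are arbitrary choices for illustration. Timings depend on the machine and the array shape, so none are quoted here.

```python
import timeit
import numpy

# Arbitrary illustrative data: 1000 rows of 50 values each.
rng = numpy.random.RandomState(0)
a1 = rng.rand(1000, 50)
a2 = rng.rand(1000, 50)

def with_loop(a1, a2):
    """Accumulate the products one row at a time in a Python loop."""
    result = numpy.zeros_like(a1[0])
    for i, j in zip(a1, a2):
        result += i * j
    return result

def vectorized(a1, a2):
    """Multiply the full arrays once, then sum over the first axis."""
    return (a1 * a2).sum(axis=0)

# Both approaches compute the same values (up to floating-point rounding).
same = numpy.allclose(with_loop(a1, a2), vectorized(a1, a2))

loop_time = timeit.timeit(lambda: with_loop(a1, a2), number=20)
vec_time = timeit.timeit(lambda: vectorized(a1, a2), number=20)
print("same result: %s" % same)
print("loop: %.4fs  vectorized: %.4fs" % (loop_time, vec_time))
```

The loop makes one small NumPy call per row, while the vectorized form hands the whole computation to NumPy in two calls.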
