python - How to do distributed matrix multiplication in numpy / ipython.parallel?

Question

I came across a tutorial on distributed computing that shows the following:

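def parallel_dot(dview, A, B):
    # Split A by rows across the engines; each engine holds its block as 'A'.
    dview.scatter('A', A)
    # Send the full B to every engine.
    dview['B'] = B
    # Each engine multiplies its block by B (numpy must already be imported there).
    dview.execute('C = numpy.dot(A, B)')
    # Concatenate the per-engine blocks of C back into one result.
    return dview.gather('C')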

np.allclose(parallel_dot(dview, A, B),
            np.dot(A, B))

Why does the tutorial use a direct view? How would this be implemented with a load-balanced view?
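For context on the load-balanced question: as far as I know, `scatter` and `gather` are direct-view operations, while a load-balanced view works through task-style calls such as `map`. A load-balanced version would therefore split A into row blocks by hand and map a multiply over them. The sketch below only illustrates that pattern. It uses `concurrent.futures` from the standard library as a stand-in for the engines and plain lists instead of numpy arrays, and the names `matmul`, `row_chunks`, and `chunked_dot` are hypothetical helpers, not part of ipyparallel.

```python
from concurrent.futures import ThreadPoolExecutor


def matmul(a, b):
    """Plain-Python product of two matrices given as lists of rows."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def row_chunks(a, n_chunks):
    """Split the rows of `a` into at most n_chunks contiguous blocks."""
    size = max(1, -(-len(a) // n_chunks))  # ceiling division, at least 1
    return [a[i:i + size] for i in range(0, len(a), size)]


def chunked_dot(a, b, n_chunks=4, workers=2):
    """Multiply a by b one row block at a time, then stitch the blocks together."""
    chunks = row_chunks(a, n_chunks)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Workers pick up the next chunk as they become free;
        # map still yields the results in input order.
        parts = pool.map(lambda chunk: matmul(chunk, b), chunks)
        return [row for part in parts for row in part]


A = [[1, 2], [3, 4], [5, 6]]
B = [[7, 8], [9, 10]]
C = chunked_dot(A, B)
```

Using more chunks than workers is what gives the scheduler room to balance the load. The trade-off is that B has to travel with every task rather than being pushed to each engine once.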

I ran some benchmarks to get a sense of how well it performs.

import time

import numpy as np
import matplotlib.pyplot as plt

sizes = range(10, 1000, 10)
t1 = []  # parallel timings
t2 = []  # serial timings
for ii in sizes:
    # A has shape (ii, 10000); B has shape (10000, 100)
    A = np.random.rand(10000, ii).astype(np.longdouble).T
    B = np.random.rand(10000, 100).astype(np.longdouble)
    t_ = time.time()
    parallel_dot(dview, A, B).get()
    t1.append(time.time() - t_)
    t_ = time.time()
    np.dot(A, B)
    t2.append(time.time() - t_)
plt.plot(sizes, t1)
plt.plot(sizes, t2)

The results are terrible (blue is parallel, green is serial):

Figure: matrix multiplication benchmark

score 1 · Accepted Answer

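This is hardly a load worth distributing. First, you are doing vector multiplication, not true matrix-to-matrix multiplication. Try, say, 10000x10000 matrices. If you have multiple cores, I think you may start to see some differences.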
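A rough way to see why problem size matters is to compare the arithmetic done with the data that has to move. The sketch below is a back-of-the-envelope model, not a measurement. It assumes A is scattered once, B is sent in full to every engine, and C is gathered back, as in `parallel_dot`. The engine count of 4 and the 16-byte element size (a common storage size for `np.longdouble` on x86-64 Linux) are assumptions, and serialization and scheduling overhead are ignored. `dot_cost` is a hypothetical helper name.

```python
def dot_cost(m, k, n, engines=4, itemsize=16):
    """Estimate work and traffic for an (m, k) x (k, n) product.

    engines and itemsize are assumed values, not taken from the question.
    Returns (flops, bytes_moved, flops_per_byte).
    """
    flops = 2 * m * k * n  # one multiply and one add per term
    # A scattered once, B copied to every engine, C gathered once
    elements_moved = m * k + engines * k * n + m * n
    bytes_moved = elements_moved * itemsize
    return flops, bytes_moved, flops / bytes_moved


# Largest case in the benchmark: A is (990, 10000), B is (10000, 100)
small = dot_cost(990, 10000, 100)
# The suggested test: square 10000x10000 matrices
large = dot_cost(10000, 10000, 10000)

print("benchmark case: %.1f flops per byte moved" % small[2])
print("10000x10000:    %.1f flops per byte moved" % large[2])
```

Under these assumptions the square case does far more arithmetic per byte transferred than the benchmark case, so it has a better chance of amortizing the communication cost.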
