I have a loop that accumulates several sums:
for t in reversed(range(len(inputs))):
    dy = np.copy(ps[t])
    dy[targets[t]] -= 1
    dWhy += np.dot(dy, hs[t].T)
    dby += dy
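For reference, the sums this loop accumulates can also be written as a few array operations. The sketch below is only an illustration under assumed shapes that the post does not state: each `ps[t]` is a column vector of shape `(V, 1)`, each `hs[t]` is a column vector of shape `(H, 1)`, and `targets[t]` is an integer index. The function names `output_grads_loop` and `output_grads_vectorized` are hypothetical.

```python
import numpy as np

def output_grads_loop(ps, hs, targets):
    """The loop from the post, wrapped in a function (shapes are assumptions)."""
    vocab_size, hidden_size = ps[0].shape[0], hs[0].shape[0]
    dWhy = np.zeros((vocab_size, hidden_size))
    dby = np.zeros((vocab_size, 1))
    for t in reversed(range(len(targets))):
        dy = np.copy(ps[t])
        dy[targets[t]] -= 1
        dWhy += np.dot(dy, hs[t].T)
        dby += dy
    return dWhy, dby

def output_grads_vectorized(ps, hs, targets):
    """Same sums, computed with one matrix product and one column sum."""
    P = np.hstack(ps).T            # (T, V): one row per time step
    H = np.hstack(hs).T            # (T, H): one row per time step
    dY = P.copy()
    # Subtract 1 at the target index of every time step.
    dY[np.arange(len(targets)), targets] -= 1
    dWhy = dY.T @ H                # sum over t of dy_t @ hs[t].T
    dby = dY.sum(axis=0)[:, None]  # sum over t of dy_t, kept as a column
    return dWhy, dby
```

If the real arrays have this layout, the two functions should agree up to floating-point rounding, since the order of summation differs.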
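The inputs are very large, so I need to parallelize this. I moved the loop body into a separate function and tried running it with ThreadPoolExecutor, but it is much slower than the sequential version.
Here is my minimal working example: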
import numpy as np
import concurrent.futures
import time, random
from concurrent.futures import ThreadPoolExecutor
import threading

# parameters
dWhy = np.random.sample(300)
dby = np.random.sample(300)

def Func(ps, targets, hs, t):
    # One loop iteration; updates the module-level accumulators in place.
    global dWhy, dby
    dy = np.copy(ps[t])
    dWhy += np.dot(dy, hs[t].T)
    dby += dy
    return dWhy, dby

if __name__ == '__main__':
    ps = np.random.sample(100000)
    targets = np.random.sample(100000)
    hs = np.random.sample(100000)

    # Sequential version
    start = time.time()
    for t in range(100000):
        dy = np.copy(ps[t])
        dWhy += np.dot(dy, hs[t].T)
        dby += dy
    finish = time.time()
    print("One thread: ")
    print(finish-start)

    # Reset the accumulators before the threaded run
    dWhy = np.random.sample(300)
    dby = np.random.sample(300)

    # Threaded version: one task per iteration
    start = time.time()
    with concurrent.futures.ThreadPoolExecutor() as executor:
        args = ((ps, targets, hs, t) for t in range(100000))
        for out1, out2 in executor.map(lambda p: Func(*p), args):
            dWhy, dby = out1, out2
    finish = time.time()
    print("Multithreads time: ")
    print(finish-start)
On my PC the single-threaded version takes about 3 seconds, while the multithreaded version takes about 1 minute.
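The example above submits 100000 very small tasks, so the executor's per-task overhead is likely a large part of the measured time, and every task also updates the same global arrays from different threads. As a rough sketch of a coarser split (not a claim about the best approach), each worker below handles one contiguous chunk and returns its partial sums, which the main thread then combines. It assumes the 1-D `ps` and `hs` arrays from the example; `partial_sums`, `chunked_sums`, and `n_chunks` are hypothetical names.

```python
import numpy as np
from concurrent.futures import ThreadPoolExecutor

def partial_sums(ps, hs, start, stop):
    """Return one chunk's contribution without touching shared state."""
    chunk_ps = ps[start:stop]
    chunk_hs = hs[start:stop]
    # For 1-D inputs, np.dot gives the sum over t of ps[t] * hs[t].
    return np.dot(chunk_ps, chunk_hs), chunk_ps.sum()

def chunked_sums(ps, hs, n_chunks=4):
    """Split the index range into n_chunks tasks and add up the results."""
    bounds = np.linspace(0, len(ps), n_chunks + 1, dtype=int)
    with ThreadPoolExecutor(max_workers=n_chunks) as executor:
        futures = [
            executor.submit(partial_sums, ps, hs, a, b)
            for a, b in zip(bounds[:-1], bounds[1:])
        ]
        results = [f.result() for f in futures]
    total_why = sum(r[0] for r in results)
    total_by = sum(r[1] for r in results)
    return total_why, total_by
```

With the arrays from the example, the accumulators would then be updated once, as in `dWhy += total_why` and `dby += total_by`. Whether this is faster than the plain loop depends on the array sizes and on how much of the work NumPy does outside the interpreter lock, so it is worth timing both.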