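I am building a condensed (upper-right triangle only) distance matrix. Computing each distance takes a while, so I want to parallelise the for loop. The unparallelised loop looks like this: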
spectra_names, condensed_distance_matrix, index_0 = [], [], 0
for index_1, index_2 in itertools.combinations(range(len(clusters)), 2):
    if index_0 == index_1:
        index_0 += 1
        spectra_names.append(clusters[index_1].get_names()[0])
    try:
        distance = 1/float(compare_clusters(clusters[index_1], clusters[index_2], maxiter=50))
    except:
        distance = 10
    condensed_distance_matrix.append(distance)
Here clusters is the list of objects to compare, compare_clusters() is a likelihood function, and 1/compare_clusters() is the distance between two objects.
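To make the condensed layout concrete, here is a minimal sketch, assuming four placeholder items, of the order in which itertools.combinations walks the upper-right triangle and where each pair lands in the flat list. The helper condensed_index is a hypothetical name introduced only for illustration; it is not part of the code above.

```python
import itertools


def condensed_index(i, j, n):
    # Hypothetical helper: position of pair (i, j), with i < j, when the
    # upper-right triangle of an n x n matrix is stored row by row.
    return i * n - i * (i + 1) // 2 + (j - i - 1)


n = 4  # assumed number of clusters, for illustration only
pairs = list(itertools.combinations(range(n), 2))

# Each pair's flat position matches the order combinations() produces it in.
positions = [condensed_index(i, j, n) for i, j in pairs]

for (i, j), pos in zip(pairs, positions):
    print((i, j), pos)
```

Because the pairs come out row by row, appending one distance per pair fills the condensed list in the right order without any explicit index arithmetic.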
I tried to parallelise it by moving the distance function out of the loop, like this:
from multiprocessing import Pool
condensed_distance_matrix = []
spectra_names = []
index_0 = 0
clusters_1 = []
clusters_2 = []
for index_1, index_2 in itertools.combinations(range(len(clusters)), 2):
    if index_0 == index_1:
        index_0 += 1
        spectra_names.append(clusters[index_1].get_names()[0])
    clusters_1.append(clusters[index_1])
    clusters_2.append(clusters[index_2])
pool = Pool()
condensed_distance_matrix_values = pool.map(compare_clusters, clusters_1, clusters_2)
for value in condensed_distance_matrix_values:
    try:
        distance = 1/float(value)
    except:
        distance = 10
    condensed_distance_matrix.append(distance)
Before parallelising, I tried the same code with map() instead of pool.map(), and it worked as I wanted. With pool.map(), however, I get this error:
  File "C:\Python27\lib\multiprocessing\pool.py", line 225, in map
    return self.map_async(func, iterable, chunksize).get()
  File "C:\Python27\lib\multiprocessing\pool.py", line 288, in map_async
    result = MapResult(self._cache, chunksize, len(iterable), callback)
  File "C:\Python27\lib\multiprocessing\pool.py", line 551, in __init__
    self._number_left = length//chunksize + bool(length % chunksize)
TypeError: unsupported operand type(s) for //: 'int' and 'list'
What am I missing here?
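One reading of the traceback, offered as a sketch and not a confirmed diagnosis: the built-in map() accepts several iterables, whereas Pool.map() has the signature map(func, iterable, chunksize=None), so the second list is taken as chunksize and length//chunksize fails with 'int' and 'list'. A possible workaround is to pack each pair into one tuple and unpack it in a module-level wrapper. In the sketch below, compare_clusters_stub, pair_distance, condensed_distances and parallel_condensed_distances are hypothetical names, and the stub merely stands in for the real compare_clusters().

```python
import itertools
from multiprocessing import Pool


def compare_clusters_stub(cluster_1, cluster_2, maxiter=50):
    # Hypothetical stand-in for the real compare_clusters(); the "clusters"
    # here are plain numbers and the "likelihood" is their absolute difference.
    return abs(cluster_1 - cluster_2)


def pair_distance(pair):
    # Pool.map() passes exactly one item per call, so both clusters
    # travel together in a single tuple and are unpacked here.
    cluster_1, cluster_2 = pair
    try:
        return 1 / float(compare_clusters_stub(cluster_1, cluster_2, maxiter=50))
    except Exception:
        return 10  # same fallback value as in the original loop


def condensed_distances(clusters, map_func=map):
    # combinations() yields the pairs in upper-triangle, row-by-row order.
    pairs = list(itertools.combinations(clusters, 2))
    return list(map_func(pair_distance, pairs))


def parallel_condensed_distances(clusters, processes=None):
    # Same computation, but the pairs are distributed over worker processes.
    pool = Pool(processes)
    try:
        return condensed_distances(clusters, pool.map)
    finally:
        pool.close()
        pool.join()
```

Doing the 1/float(...) conversion and the fallback inside the worker also restores the maxiter=50 argument that the pool.map() call above no longer passes. On Windows, parallel_condensed_distances(clusters) would need to be called from under an if __name__ == "__main__": guard, and the wrapper has to live at module level so it can be pickled.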