I have a nested list comprehension that looks something like this:
>>> from math import sqrt
>>> nested = [[1, 2], [3, 4, 5]]
>>> [[sqrt(i) for i in j] for j in nested]
[[1.0, 1.4142135623730951], [1.7320508075688772, 2.0, 2.23606797749979]]
Is it possible to parallelise this using the standard joblib approach for embarrassingly parallel for loops? If so, what is the proper syntax for `delayed`?
As far as I can tell, the documentation doesn't mention or give an example of nested inputs. I have tried a couple of naive implementations, to no avail:
>>> #this syntax fails:
>>> Parallel(n_jobs = 2) (delayed(sqrt)(i for i in j) for j in nested)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Python27\lib\site-packages\joblib\parallel.py", line 660, in __call__
    self.retrieve()
  File "C:\Python27\lib\site-packages\joblib\parallel.py", line 512, in retrieve
    self._output.append(job.get())
  File "C:\Python27\lib\multiprocessing\pool.py", line 558, in get
    raise self._value
pickle.PicklingError: Can't pickle <type 'generator'>: it's not found as __builtin__.generator
>>> #this syntax doesn't fail, but gives the wrong output:
>>> Parallel(n_jobs = 2) (delayed(sqrt)(i) for i in j for j in nested)
[1.7320508075688772, 1.7320508075688772, 2.0, 2.0, 2.23606797749979, 2.23606797749979]
If this is impossible, I can obviously restructure the list before and after passing it to `Parallel`. However, my actual lists are long and each item is large, so doing so isn't ideal.
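For reference, one variant I have been considering (a sketch under my own assumptions, not something from the joblib docs) is to delay a plain module-level helper, here a hypothetical `sqrt_list`, over each inner list, so each worker handles a whole sublist and the nested structure is preserved without flattening:

```python
from math import sqrt
from joblib import Parallel, delayed

def sqrt_list(sublist):
    # Runs entirely inside one worker; returns one inner list of results.
    return [sqrt(i) for i in sublist]

nested = [[1, 2], [3, 4, 5]]

# One delayed call per inner list, preserving the nested structure.
result = Parallel(n_jobs=2)(delayed(sqrt_list)(j) for j in nested)
print(result)
```

This avoids pickling a generator (only the helper and each sublist are sent to the workers), but I'm not sure whether it is the idiomatic joblib way, or whether one delayed call per sublist gives good load balancing when the sublists differ a lot in size.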