I want to run a method of my class in parallel, but it only produces an error that I can't resolve. My code is:

import concurrent.futures as futures

samples = ['asfd', 'zxcv', 'asf', 'qwer']

class test:
    def __init__(self, samples):
        maturedb = {}
        with futures.ProcessPoolExecutor() as exe:
            for samplename, dResult in exe.map(self.make_readdb, samples):
                maturedb[samplename] = dResult
        print(maturedb)

    def make_readdb(self, samplename):
        return samplename, 1

test(samples)

If I run this code on an Ubuntu machine, I get the following error:

Traceback (most recent call last):
  File "/usr/lib/python3.2/multiprocessing/queues.py", line 272, in _feed
    send(obj)
_pickle.PicklingError: Can't pickle <class 'method'>: attribute lookup builtins.method failed

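The make_readdb method is simplified here for the sake of the example, but in the real code it is the bottleneck and I need to parallelize it.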

2 Answers

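From the documentation:

The ProcessPoolExecutor class is an Executor subclass that uses a pool of processes to execute calls asynchronously. ProcessPoolExecutor uses the multiprocessing module, which allows it to side-step the Global Interpreter Lock but also means that only picklable objects can be executed and returned.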
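To make "picklable" concrete, here is a small check using the standard pickle module, which is what multiprocessing relies on to ship work to the worker processes. The helper is_picklable is a hypothetical name introduced for illustration, and exactly which objects pickle varies by Python version (the traceback in the question is from Python 3.2):

```python
import pickle


def make_readdb(samplename):
    # Module-level function: pickle stores it by its importable name.
    return samplename, 1


def is_picklable(obj):
    # Hypothetical helper: True if pickle can serialize obj.
    try:
        pickle.dumps(obj)
        return True
    except Exception:
        return False


if __name__ == "__main__":
    print("module-level function:", is_picklable(make_readdb))
    print("lambda:", is_picklable(lambda s: (s, 1)))
    print("plain tuple:", is_picklable(("asfd", 1)))
```

Running a check like this on the callable you hand to exe.map is a quick way to find out whether a process pool can accept it.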

Try a ThreadPoolExecutor.
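A minimal sketch of that suggestion, reusing the class from the question. Threads share memory with the caller, so nothing has to be pickled and the bound method self.make_readdb can be passed to map directly. Threads do not side-step the Global Interpreter Lock, so this mainly helps if the real make_readdb spends its time waiting on I/O. Storing the result on self.maturedb and the max_workers value are my assumptions for illustration:

```python
import concurrent.futures as futures

samples = ['asfd', 'zxcv', 'asf', 'qwer']


class test:
    def __init__(self, samples):
        self.maturedb = {}
        # max_workers=4 is an arbitrary choice for this sketch.
        with futures.ThreadPoolExecutor(max_workers=4) as exe:
            # Bound methods are fine here: threads need no pickling.
            for samplename, dResult in exe.map(self.make_readdb, samples):
                self.maturedb[samplename] = dResult

    def make_readdb(self, samplename):
        return samplename, 1


if __name__ == "__main__":
    print(test(samples).maturedb)
```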

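I looked at your code again, and the problem is that the function make_readdb is a member of the class test. Can you refactor and pull this function out of the class?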
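One way that refactoring could look, as a sketch: make_readdb becomes a module-level function, which pickle can reference by name, and the class only collects the results. The `if __name__ == "__main__":` guard is there because some platforms start workers by re-importing the main module. Keeping the results on self.maturedb is an assumption of mine, not part of the original code:

```python
import concurrent.futures as futures

samples = ['asfd', 'zxcv', 'asf', 'qwer']


def make_readdb(samplename):
    # Module-level, so worker processes can look it up by name.
    return samplename, 1


class test:
    def __init__(self, samples):
        self.maturedb = {}
        with futures.ProcessPoolExecutor() as exe:
            # Only the function reference and each sample are pickled.
            for samplename, dResult in exe.map(make_readdb, samples):
                self.maturedb[samplename] = dResult


if __name__ == "__main__":
    print(test(samples).maturedb)
```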

answered 2013-07-02T07:36:57.617

self should be passed as an explicit argument, even when working across multiple processes. Like this:

class test:
    def __init__(self, samples):
        maturedb = {}
        with futures.ProcessPoolExecutor() as exe:
            # map() zips its iterables, so self must be repeated once per sample.
            for samplename, dResult in exe.map(test.make_readdb, [self] * len(samples), samples):
                maturedb[samplename] = dResult
        print(maturedb)

    def make_readdb(self, samplename):
        return samplename, 1

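But in practice only one process may actually end up doing the work. So this is probably a better way to write it: don't pass self to the ProcessPoolExecutor from inside the class.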

class test:
    def __init__(self, samples):
        maturedb = {}
        with futures.ProcessPoolExecutor() as exe:
            for samplename, dResult in exe.map(test.make_readdb, samples):
                maturedb[samplename] = dResult
        print(maturedb)

    @staticmethod
    def make_readdb(samplename):
        return samplename, 1

answered 2021-12-20T01:11:07.980