I have some code like the following:

import multiprocessing as mp

connection: module.Connection

def client_id():
    # Generator yielding one id per worker process
    for i in range(mp.cpu_count() * 2):
        yield i

def initproc(host: str, port: int, client_id: int):
    global connection
    connection.connect(host, port, client_id)

def main():
    host = "something"
    port = 12345
    # Passing the generator in initargs is what triggers the pickling error below
    with mp.get_context("spawn").Pool(processes=mp.cpu_count() * 2,
                                      initializer=initproc,
                                      initargs=(host, port, client_id())) as p:
        res = p.starmap(processing_function, arg_list)


processing_function and arg_list are not relevant to the problem.

The problem is that I get this error:

    ForkingPickler(file, protocol).dump(obj)
TypeError: cannot pickle 'generator' object

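Is there a way to set up the pool's initializer so that one of its arguments is the next number in a sequence, different for each worker process?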

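P.S. In the code as written, all connection objects could be initialized outside the initializer function, but that is not possible in my actual case. I need to pass the connection parameters to the initializer.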

1 Answer

For your case, a simple solution is to use the number embedded in Process.name. You can extract it with:

mp.current_process().name.split('-')[1]
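
As a rough sketch of how that could look inside an initializer (the names worker_number, init_worker and report_client_id are made up for illustration, and the connection call is only simulated with a print): this assumes the pool is created directly from the main process, so worker names look like SpawnPoolWorker-3. As far as I can tell the number comes from a counter shared by every process the parent starts, so it begins at 1 only if nothing else was started earlier, and a replaced worker gets a new, higher number.

```python
import multiprocessing as mp


def worker_number(process_name: str) -> int:
    """Return the trailing number of a name such as 'SpawnPoolWorker-3'."""
    return int(process_name.rsplit("-", 1)[1])


def init_worker(host: str, port: int) -> None:
    # Hypothetical initializer: derive the client id from the worker's name
    # and keep it in a module-level global of the worker process.
    global client_id
    client_id = worker_number(mp.current_process().name)
    # A real initializer would call connection.connect(host, port, client_id) here.
    print(f"{mp.current_process().name}: {host}:{port} as client {client_id}")


def report_client_id(_):
    # Runs in a worker and returns the id stored by init_worker.
    return client_id


if __name__ == "__main__":
    ctx = mp.get_context("spawn")
    with ctx.Pool(processes=2,
                  initializer=init_worker,
                  initargs=("something", 12345)) as pool:
        ids = pool.map(report_client_id, range(4))
    print(sorted(set(ids)))
```

Only picklable values (host and port) go through initargs here, so the generator from the question is not needed at all.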

If you need more control over where the sequence starts, you can use a multiprocessing.Value as a counter from which the workers obtain a unique number.

import multiprocessing as mp
import time


def init_p(client_id):
    with client_id.get_lock():
        globals()['client_id'] = client_id.value
        print(f"{mp.current_process().name},"
              f" {mp.current_process().name.split('-')[1]},"  # alternative
              f" client_id:{globals()['client_id']}")
        client_id.value += 1


if __name__ == "__main__":

    ctx = mp.get_context("spawn")
    client_ids = ctx.Value('i', 0)

    with ctx.Pool(
            processes=4,
            initializer=init_p,
            initargs=(client_ids,)
    ) as pool:

        time.sleep(3)

Output:

SpawnPoolWorker-2, 2, client_id:0
SpawnPoolWorker-3, 3, client_id:1
SpawnPoolWorker-1, 1, client_id:2
SpawnPoolWorker-4, 4, client_id:3

Process finished with exit code 0
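
Applied to the initproc signature from the question, the counter approach might look roughly like the sketch below. FakeConnection is a hypothetical stand-in for module.Connection (whose real API is not shown in the question), and take_next_id and whoami are helper names invented here; the starting value 100 is arbitrary and only illustrates choosing where the sequence begins.

```python
import multiprocessing as mp


class FakeConnection:
    """Hypothetical stand-in for the asker's module.Connection."""

    def connect(self, host, port, client_id):
        self.host, self.port, self.client_id = host, port, client_id


connection = FakeConnection()


def take_next_id(counter) -> int:
    """Read the shared counter and advance it by one while holding its lock."""
    with counter.get_lock():
        value = counter.value
        counter.value += 1
    return value


def initproc(host, port, counter):
    # Each worker takes its own id from the shared counter exactly once.
    connection.connect(host, port, take_next_id(counter))


def whoami(_):
    # Runs in a worker and reports the id that worker connected with.
    return connection.client_id


if __name__ == "__main__":
    ctx = mp.get_context("spawn")
    counter = ctx.Value("i", 100)  # the sequence starts at 100
    with ctx.Pool(processes=2,
                  initializer=initproc,
                  initargs=("something", 12345, counter)) as pool:
        print(sorted(set(pool.map(whoami, range(8)))))
```

The shared Value can be handed to the workers through initargs because the pool passes it along when it starts each process, which is not the case for a generator.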
Answered 2020-08-24T16:58:34.047