python - Dynamically changing the number of parallel threads

Question

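I have a program that needs to make a large number of queries over the network, so I am parallelizing the work. It is really I/O bound, and all I am doing is: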

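# Start a fixed number of worker threads that consume from the queue.
for i in range(options.workers):
    w = Worker(queue, output_queue, options.site)
    w.daemon = True  # setDaemon() is deprecated; set the attribute instead
    w.start()

# Enqueue every dataset together with its 1-based index.
for i, dataset_metadata in enumerate(datasets_metadata):
    queue.put((i+1, dataset_metadata))

# Block until every queued item has been marked done.
queue.join()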

options.workers comes from the command line. Now I want to change the number of workers dynamically.

First question: how can I add workers after queue.join()?

Second question: how can I evaluate the optimal number of workers at run time? I think I have to monitor the throughput (tasks per unit of time) and keep increasing the number of workers until that rate stops improving.
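The idea in the second question could be sketched roughly as below. This is only an illustration under assumptions: the names measure_rate and pick_worker_count, the doubling strategy, and the limit and min_gain parameters are all hypothetical and do not come from the original program. It uses ThreadPoolExecutor for timing rather than the Worker class above.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def measure_rate(task, batch, workers):
    """Run one batch with `workers` threads and return tasks per second."""
    started = time.perf_counter()
    with ThreadPoolExecutor(max_workers=workers) as executor:
        list(executor.map(task, batch))  # consume the iterator to wait for all tasks
    elapsed = time.perf_counter() - started
    return len(batch) / elapsed if elapsed > 0 else float("inf")


def pick_worker_count(rate_for, start=1, limit=64, min_gain=0.05):
    """Double the worker count until the rate improves by less than min_gain.

    `rate_for` is any callable mapping a worker count to a measured rate
    (hypothetical interface, chosen so the search can be tested on its own).
    """
    workers = start
    best_rate = rate_for(workers)
    while workers * 2 <= limit:
        candidate = workers * 2
        rate = rate_for(candidate)
        if rate < best_rate * (1 + min_gain):
            break  # throughput has plateaued or dropped
        workers, best_rate = candidate, rate
    return workers
```

In practice rate_for might be something like lambda n: measure_rate(query, sample_batch, n), where query and sample_batch stand in for one network request and a small representative batch. Network timings are noisy, so a single measurement per worker count is likely too optimistic; averaging a few runs would be more robust.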

score 1 · Accepted Answer

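You could probably start and stop your workers yourself, but most of the functionality you need is probably already available:

The multiprocessing.dummy module exports the same API as multiprocessing, only implemented using threads instead of processes.
This means you can use the already implemented Pool of workers, and it would be easy to switch from threads to multiprocessing if that is ever needed.
The concurrent.futures API provides an even higher-level model for concurrency. It is in the standard library in Python 3.2+, and there are backports for earlier versions.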
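A minimal sketch of both options, assuming each network query can be wrapped in a plain function. The query function, the placeholder datasets_metadata list, and the fixed workers value are hypothetical stand-ins for the real program's pieces.

```python
from concurrent.futures import ThreadPoolExecutor
from multiprocessing.dummy import Pool  # thread-backed Pool with the multiprocessing.Pool API


def query(item):
    # Hypothetical stand-in for one network query; replace with the real request.
    index, metadata = item
    return index, metadata.upper()


datasets_metadata = ["a", "b", "c"]  # placeholder data for illustration
items = list(enumerate(datasets_metadata, start=1))
workers = 4  # in the real program this would come from options.workers

# Option 1: multiprocessing.dummy.Pool (threads behind the Pool interface).
with Pool(workers) as pool:
    pool_results = pool.map(query, items)  # blocks until all items are processed

# Option 2: concurrent.futures.ThreadPoolExecutor.
with ThreadPoolExecutor(max_workers=workers) as executor:
    executor_results = list(executor.map(query, items))
```

Both map calls return results in input order. Neither pool resizes itself once created, so changing the number of workers at run time would still mean creating a new pool with a different size for the next batch of work.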
