python - I'm having trouble with ProcessPoolExecutor from concurrent.futures

Question

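I have a large program that takes a while to run its calculations. Only 20% of my processor was being used for the computation, so I decided to learn about multithreading and multiprocessing. Multithreading gave no improvement, so I tried multiprocessing, but whenever I use it I just get a lot of errors, even on very simple code.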

This is the code I tested after my large, calculation-heavy code started having problems:

from concurrent.futures import ProcessPoolExecutor

def func():
    print("done")

def func_():
    print("done")

def main():
    executor = ProcessPoolExecutor(max_workers=3)

    p1 = executor.submit(func)
    p2 = executor.submit(func_)

main()

The error message I get says:

An attempt has been made to start a new process before the
        current process has finished its bootstrapping phase.

        This probably means that you are not using fork to start your
        child processes and you have forgotten to use the proper idiom
        in the main module:

            if __name__ == '__main__':
                freeze_support()
                ...

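This isn't the whole message, because it is very long, but I think this part may be enough to help. Almost everything else in the error message looks like "error at line ... in ...".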

If it helps, the large code is at https://github.com/nobody48sheldor/fuseeinator2.0 (it may not be the latest version).

score 1 · Accepted Answer

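I updated your code to show main being called. This is an issue with operating systems that spawn child processes, such as Windows. To test on my Linux machine, I had to add a bit of code. But this crashes on my machine: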

# Test code to make linux spawn like Windows and generate error. This code
# is not needed on windows.
if __name__ == "__main__":
    import multiprocessing as mp
    mp.freeze_support()
    mp.set_start_method('spawn')

# test script
from concurrent.futures import ProcessPoolExecutor

def func():
    print("done")

def func_():
    print("done")

def main():
    executor = ProcessPoolExecutor(max_workers=3)
    p1 = executor.submit(func)
    p2 = executor.submit(func_)

main()

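On a spawning system, python can't just fork into a new execution context. Instead, it runs a new instance of the python interpreter, imports the module, and pickles/unpickles enough state to build the child's execution environment. This can be a very heavy operation.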
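To check whether your own interpreter spawns or forks, you can ask the multiprocessing module. This is a minimal sketch; start_method_report is a made-up helper name, not part of the question's code. Windows offers only "spawn", while Linux typically lists "fork" as well.

```python
import multiprocessing as mp


def start_method_report():
    """Return the default start method and every method this platform supports."""
    return {
        "default": mp.get_start_method(),
        "available": mp.get_all_start_methods(),
    }


if __name__ == "__main__":
    print(start_method_report())
```

If the default is "spawn" (or you select it with set_start_method as in the test code above), the child re-imports your script, and the problem in the question applies.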

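But your script is not import safe. Since main() is called at module level, the import in the child runs main again. That would create a grandchild process, which runs main again (and so on, until you hang your machine). Python detects this infinite loop and displays the message instead.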

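The top-level script is always named "__main__". Put all of the code that should run only once, at the script level, inside an if that checks for that name. If the module is imported, nothing harmful runs.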

if __name__ == "__main__":
    main()

With that change, the script will work.
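Put together, a complete import-safe version might look like the sketch below. It goes a little beyond the original test script: the workers return strings instead of printing, the pool is used as a context manager, and run_jobs is a hypothetical helper added here so the submit-and-collect step can be reused with any executor.

```python
from concurrent.futures import ProcessPoolExecutor


def func():
    return "done from func"


def func_():
    return "done from func_"


def run_jobs(executor):
    # Submit both jobs, then wait for and collect their return values.
    futures = [executor.submit(func), executor.submit(func_)]
    return [future.result() for future in futures]


def main():
    # The with-block shuts the pool down once the jobs have finished.
    with ProcessPoolExecutor(max_workers=3) as executor:
        for line in run_jobs(executor):
            print(line)


if __name__ == "__main__":
    # Only the top-level script gets here. A spawned child that re-imports
    # this file sees a different __name__ and skips the call.
    main()
```

Calling result() on each future also surfaces any exception raised in a worker, which the original script would have silently dropped.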

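There are code analyzers out there that import modules to extract docstrings or other useful information. Your code shouldn't fire the missiles just because some tool did an import.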
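To see why the guard protects you, it can help to watch what __name__ holds when the same file is loaded under different names. The sketch below uses the standard runpy module to execute a one-line stand-in file. The file name demo.py and the helper name_seen_by_module are invented for illustration, and run_path only approximates a real import.

```python
import pathlib
import runpy
import tempfile

# A one-line stand-in module that records the name it was loaded under.
SOURCE = "seen_name = __name__\n"


def name_seen_by_module(run_name):
    """Execute the stand-in file under run_name and return its __name__."""
    with tempfile.TemporaryDirectory() as tmp:
        path = pathlib.Path(tmp) / "demo.py"
        path.write_text(SOURCE)
        # run_path executes the file and returns its resulting globals.
        module_globals = runpy.run_path(str(path), run_name=run_name)
        return module_globals["seen_name"]


if __name__ == "__main__":
    print(name_seen_by_module("__main__"))  # loaded as the top-level script
    print(name_seen_by_module("whatever"))  # loaded under a module name
```

Only the first case would pass an if __name__ == "__main__": check, so guarded code stays quiet when a tool or a child process loads the file under any other name.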

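Another way to solve the problem is to move everything related to multiprocessing out of the script and into a module. Suppose I have a module with your code in it: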

whatever.py

from concurrent.futures import ProcessPoolExecutor

def func():
    print("done")

def func_():
    print("done")

def main():
    executor = ProcessPoolExecutor(max_workers=3)

    p1 = executor.submit(func)
    p2 = executor.submit(func_)

myscript.py

#!/usr/bin/env python3
import whatever
whatever.main()

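Now, since the pool is already in an imported module that doesn't do this crazy restart-itself thing, no if __name__ == "__main__": is necessary. It's a good idea to put it in myscript.py anyway, but it's not required.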
