6

我有 python 脚本run.py

def do(i):
    # doing something with i, that takes time

start_i = sys.argv[1]
end_i = sys.argv[2]
for i in range(start_i, end_i):
    do(i)

然后我运行这个脚本:

python run.py 0 1000000

30分钟后脚本完成。但是,对我来说太长了。

所以,我创建了 bash 脚本run.sh

python run.py 0 200000 &
python run.py 200000 400000 &
python run.py 400000 600000 &
python run.py 600000 800000 &
python run.py 800000 1000000

然后我运行这个脚本:

bash run.sh

6分钟后脚本完成。相当不错。我很高兴。

但我认为,还有另一种解决问题的方法(不创建 bash 脚本),不是吗?

4

2 回答 2

10

您正在寻找多处理包,尤其是Pool类:

from multiprocessing import Pool
p = Pool(5)  # like in your example, running five separate processes
p.map(do, range(start_i, end_i))

Besides consolidating this into a single command, this has other advantages over your approach of calling python run.py 0 200000 & etc. If some processes take longer than others (and therefore, python run.py 0 200000 might finish before the others), this will make sure all 5 threads keep working until all of them are done.

Note that depending on your computer's architecture, running too many processes at the same time might slow them all down (for starters, it depends on how many cores your processor has, as well as what else you are running at the same time).

于 2012-08-26T00:20:07.023 回答
0

您可以让您的 python 程序创建独立的进程,而不是 bash 这样做,但这并没有太大的不同。您认为您的解决方案有什么不足之处?

于 2012-08-26T00:08:05.387 回答