
I'm new to Python and am trying to execute two tasks concurrently. The tasks just fetch pages from a web server, and one may finish before the other. I want to display the results only once all the requests have been served. This is easy in a Linux shell, but I'm hopeless with Python, and everything I've read so far looks like black magic to a beginner like me. It all seems overly complicated compared to the simplicity of the bash script below.

Here is the bash script I would like to emulate in Python:

# First request (in background). Result stored in file /tmp/p1
wget -q -O /tmp/p1 "http://ursule/test/test.php?p=1&w=5" &
PID_1=$!

# Second request (in background). Result stored in file /tmp/p2
wget -q -O /tmp/p2 "http://ursule/test/test.php?p=2&w=2" &
PID_2=$!

# Wait for the two processes to terminate before displaying the result
wait $PID_1 && wait $PID_2 && cat /tmp/p1 /tmp/p2

The test.php script is simple:

<?php
printf('Process %s (sleep %s) started at %s ', $_GET['p'], $_GET['w'], date("H:i:s"));
sleep($_GET['w']);
printf('finished at %s', date("H:i:s"));
?>

The bash script returns the following:

$ ./multiThread.sh
Process 1 (sleep 5) started at 15:12:59 finished at 15:13:04
Process 2 (sleep 2) started at 15:12:59 finished at 15:13:01

What I have tried so far in Python 3:

#!/usr/bin/python3.2

import urllib.request, threading

def wget (address):
    url = urllib.request.urlopen(address)
    mybytes = url.read()
    mystr = mybytes.decode("latin_1")
    print(mystr)
    url.close()

thread1 = threading.Thread(None, wget, None, ("http://ursule/test/test.php?p=1&w=5",))
thread2 = threading.Thread(None, wget, None, ("http://ursule/test/test.php?p=1&w=2",))

thread1.run()
thread2.run()

This doesn't work as expected, returning:

$ ./c.py 
Process 1 (sleep 5) started at 15:12:58 finished at 15:13:03
Process 1 (sleep 2) started at 15:13:03 finished at 15:13:05

2 Answers


Instead of using threading, use the multiprocessing module, with a separate process per task. You may want to read more about the GIL (http://wiki.python.org/moin/GlobalInterpreterLock).
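A minimal sketch of that idea, with a sleep-based stand-in for the HTTP fetch so it runs without the asker's test server (the function name and the two delays are assumptions, mirroring the w=5/w=2 waits in the question):

```python
import time
from multiprocessing import Process, Queue

def fetch(result_queue, task_id, delay):
    """Stand-in for the wget call: sleep instead of doing a real request."""
    time.sleep(delay)
    result_queue.put("task %d slept %ds" % (task_id, delay))

if __name__ == "__main__":
    queue = Queue()
    # Both "requests" start at once, like the backgrounded wget calls
    procs = [Process(target=fetch, args=(queue, 1, 2)),
             Process(target=fetch, args=(queue, 2, 1))]
    start = time.time()
    for p in procs:
        p.start()
    # Collect both results before displaying anything
    results = [queue.get() for _ in procs]
    for p in procs:
        p.join()
    # Total time is close to the longest delay, not the sum of the two
    print(results)
    print("elapsed: %.1fs" % (time.time() - start))
```

Because each task runs in its own process, the GIL never serialises them; the results only appear once both workers have finished.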

answered 2013-01-01 at 08:16

Following your advice I dug into the documentation pages on multithreading and multiprocessing, and after running a few benchmarks I came to the conclusion that multiprocessing is better suited for the job: it scales better as the number of threads/processes increases. The other problem I faced was how to store the results of all these processes. Using a multiprocessing.Queue does the trick. Here is the solution I came up with:

This snippet sends concurrent HTTP requests to my test rig, which pauses for a second before sending the answer back (see the PHP script above).

import urllib.request
from multiprocessing import Process, Queue

# wget(resultQueue, address): fetch a page and push its body onto the queue
def wget(resultQueue, address):
    url = urllib.request.urlopen(address)
    mybytes = url.read()
    url.close()
    resultQueue.put(mybytes.decode("latin_1"))

numberOfProcesses = 20

# initialisation
proc = []
results = []
resultQueue = Queue()

# create the processes, all sharing the same result queue
for i in range(numberOfProcesses):
    # the URL just passes the process number (p) to my testing web-server
    proc.append(Process(target=wget, args=(resultQueue, "http://ursule/test/test.php?p=" + str(i) + "&w=1")))
    proc[i].start()

# Drain the queue first, then join: joining a process before its queued
# data has been consumed can deadlock if the underlying pipe buffer fills
for i in range(numberOfProcesses):
    results.append(resultQueue.get())
for i in range(numberOfProcesses):
    proc[i].join()

# display results
for result in results:
    print(result)
answered 2013-01-01 at 20:21