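I am using the workerpool module to create some threads that perform HTTP GETs against a server. So far I like the concept: I set up a download job for each URL in a list and then hand the work out to a thread pool. So far, so good...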
import urllib3
import workerpool
from urllib3 import HTTPConnectionPool

headers = {}
headers['user-agent'] = 'Python-httplib2/0.7.4 (gzip)'
headers['accept-encoding'] = 'gzip, deflate'

class Download_Dashlet_Job(workerpool.Job):
    "This is a download dashlet Job object for downloading a given URL."
    def __init__(self, url):
        self.url = url
    def run(self):
        request = tcp_pool.request('GET', self.url, headers=headers)

# Create a pool of TCP connections for communication to the server (this is using urllib3)
tcp_pool = HTTPConnectionPool('M_Server3', port=8080, timeout=None, maxsize=3, block=True)

# Initialize a pool of three dashlet worker threads to be used for downloading from a page
dashlet_thread_worker_pool = workerpool.WorkerPool(size=3)

# urls.txt is just a text file with 5 urls in it
for url in open("urls.txt"):
    job = Download_Dashlet_Job(url.strip())
    dashlet_thread_worker_pool.put(job)

# Send shutdown jobs to all dashlet worker threads
dashlet_thread_worker_pool.shutdown()
dashlet_thread_worker_pool.wait()
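The code above works fine, but it has only one thread pool. So the scenario it models is:
A single user opens Firefox to Yahoo, and the browser starts several threads to download the components of the Yahoo page.
What I want to do now is this:
Ten users open their browsers to Yahoo, and each user has their own set (i.e. pool) of threads for downloading from the Yahoo page.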
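This is where I am stuck. Do I need to create a class for the user, and have that object invoke Download_Dashlet_Job?
I tried to do that, but I made a complete mess of it. It shouldn't be this hard with workerpool, right?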
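For illustration, here is a minimal sketch of the "one object per user, each with its own pool" idea. It is not written against workerpool: it swaps in the standard library's `concurrent.futures.ThreadPoolExecutor` so that it runs without third-party packages. The names `SimulatedUser`, `run_users`, and `fake_fetch` are hypothetical and do not appear in the code above. `fake_fetch` is only a stand-in for the real download; in the actual script it would presumably wrap `tcp_pool.request('GET', url, headers=headers)`.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class SimulatedUser:
    """One simulated browser user that owns a private pool of worker threads."""

    def __init__(self, name, urls, fetch, pool_size=3):
        self.name = name
        self.urls = list(urls)
        self.fetch = fetch          # callable taking a URL, e.g. a wrapper around tcp_pool.request
        self.pool_size = pool_size
        self.results = []

    def browse(self):
        # Each call creates this user's own pool; leaving the "with" block
        # waits for all of this user's downloads and shuts the pool down.
        with ThreadPoolExecutor(max_workers=self.pool_size) as pool:
            # pool.map returns results in the same order as self.urls.
            self.results = list(pool.map(self.fetch, self.urls))
        return self.results


def run_users(user_count, urls, fetch, pool_size=3):
    """Start user_count users at the same time, one driver thread per user."""
    users = [SimulatedUser("user-%d" % i, urls, fetch, pool_size)
             for i in range(user_count)]
    drivers = [threading.Thread(target=user.browse) for user in users]
    for driver in drivers:
        driver.start()
    for driver in drivers:
        driver.join()
    return users


def fake_fetch(url):
    # Hypothetical stand-in for a real HTTP GET; returns the URL and a made-up status.
    return (url, 200)


if __name__ == "__main__":
    for user in run_users(10, ["/a", "/b", "/c", "/d", "/e"], fake_fetch):
        print(user.name, len(user.results))
```

One thing to watch if this is adapted to the original script: a single shared `HTTPConnectionPool` created with `maxsize=3, block=True` allows at most three connections in use at once across all users, so the ten users would not really download independently. Giving each user its own connection pool, or raising `maxsize`, is probably closer to the intended scenario.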