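I'm writing a small multithreaded HTTP file downloader and would like to be able to shrink the number of available threads when the code encounters errors.
The errors in question are the HTTP errors returned when the web server will not allow any more connections.
For example, if I set up a pool of 5 threads, each thread tries to open its own connection and download a chunk of the file. The server may only allow 2 connections and will, I believe, return 503 errors. I want to detect this and shut down a thread, eventually limiting the size of the pool to presumably only the 2 connections the server will allow.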
Can I make a thread stop itself?
Is `self.Thread_stop()` sufficient?
Do I also need to `join()`?
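For context on these three questions: a Python thread ends when its `run()` method returns, so a worker can stop itself simply by returning, and `join()` is only called by another thread that wants to wait for it. A minimal sketch, assuming a hypothetical `Worker` class and `max_items` limit that are not part of my downloader:

```python
import threading

class Worker(threading.Thread):
    """Hypothetical worker that stops itself after max_items iterations."""

    def __init__(self, max_items):
        threading.Thread.__init__(self)
        self.max_items = max_items
        self.processed = 0

    def run(self):
        while True:
            if self.processed >= self.max_items:
                return              # leaving run() ends this thread
            self.processed += 1

w = Worker(3)
w.start()
w.join()                            # the main thread waits; the worker never joins itself
```

No private stop method is involved here; the thread is finished as soon as `run()` returns.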
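This is my worker class, which does the downloading: it grabs work from the queue, downloads it, and dumps the result into resultQ, where the main thread saves it to file.
It is in here that I would like to detect an HTTP 503 and stop/kill/remove a thread from the available pool, and of course re-add the failed chunk to the queue so that the remaining threads will process it.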
    import socket
    import threading
    import urllib2

    class Downloader(threading.Thread):
        def __init__(self, queue, resultQ, file_name):
            threading.Thread.__init__(self)
            self.workQ = queue
            self.resultQ = resultQ
            self.file_name = file_name

        def run(self):
            while True:
                block_num, url, start, length = self.workQ.get()
                print 'Starting Queue #: %s' % block_num
                print start
                print length

                #Download the file
                self.download_file(url, start, length)

                #Tell queue that this task is done
                print 'Queue #: %s finished' % block_num
                self.workQ.task_done()

        def download_file(self, url, start, length):
            # headers is assumed to be a dict defined elsewhere in the script
            request = urllib2.Request(url, None, headers)
            if length == 0:
                return None

            # HTTP byte ranges are inclusive, so the last byte is start + length - 1
            request.add_header('Range', 'bytes=%d-%d' % (start, start + length - 1))

            while 1:
                try:
                    data = urllib2.urlopen(request)
                except urllib2.URLError, u:
                    # urllib2.HTTPError (e.g. a 503) is a subclass of URLError,
                    # so it is caught here as well
                    print "Connection did not start with", u
                else:
                    break

            chunk = ''
            block_size = 1024
            remaining_blocks = length

            while remaining_blocks > 0:
                if remaining_blocks >= block_size:
                    fetch_size = block_size
                else:
                    fetch_size = int(remaining_blocks)

                try:
                    data_block = data.read(fetch_size)
                    if len(data_block) == 0:
                        print "Connection: [TESTING]: 0 sized block" + \
                            " fetched."
                    if len(data_block) != fetch_size:
                        print "Connection: len(data_block) != length" + \
                            ", but continuing anyway."
                        self.run()
                        return

                except socket.timeout, s:
                    print "Connection timed out with", s
                    self.run()
                    return

                remaining_blocks -= fetch_size
                chunk += data_block

            self.resultQ.put([start, chunk])
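To make the behaviour I am after concrete, here is a rough sketch rather than working downloader code. Everything in it is hypothetical: `ServerBusy` stands in for a 503 (in the real code that would presumably be a `urllib2.HTTPError` whose `code` is 503), `FakeServer` simulates a server that refuses more than `limit` simultaneous connections, and `Pool`, `ThrottledWorker` and `download_all` are names made up for the illustration. A worker that hits the error requeues its block and returns from `run()`, unless it is the last worker left.

```python
import threading
import time
try:
    import queue                      # Python 3
except ImportError:
    import Queue as queue             # Python 2

class ServerBusy(Exception):
    """Hypothetical stand-in for an HTTP 503 response."""

class FakeServer(object):
    """Pretend server that refuses more than `limit` simultaneous fetches."""
    def __init__(self, limit):
        self.limit = limit
        self.lock = threading.Lock()
        self.open_connections = 0

    def fetch(self, block):
        with self.lock:
            if self.open_connections >= self.limit:
                raise ServerBusy(block)
            self.open_connections += 1
        try:
            time.sleep(0.01)          # simulate transfer time
            return 'data-%d' % block
        finally:
            with self.lock:
                self.open_connections -= 1

class Pool(object):
    """Shared count of workers that have not retired."""
    def __init__(self, size):
        self.lock = threading.Lock()
        self.active = size

    def try_retire(self):
        # A worker may leave only if at least one other worker stays behind.
        with self.lock:
            if self.active > 1:
                self.active -= 1
                return True
            return False

class ThrottledWorker(threading.Thread):
    def __init__(self, work_q, result_q, server, pool):
        threading.Thread.__init__(self)
        self.work_q, self.result_q = work_q, result_q
        self.server, self.pool = server, pool

    def run(self):
        while True:
            block = self.work_q.get()
            if block is None:             # sentinel: no more work
                return
            try:
                self.result_q.put((block, self.server.fetch(block)))
            except ServerBusy:
                self.work_q.put(block)    # requeue before task_done()
                self.work_q.task_done()
                if self.pool.try_retire():
                    return                # returning from run() ends this thread
                time.sleep(0.01)          # last worker left: back off and retry
                continue
            self.work_q.task_done()

def download_all(blocks, num_threads, limit):
    work_q, result_q = queue.Queue(), queue.Queue()
    for block in blocks:
        work_q.put(block)
    server, pool = FakeServer(limit), Pool(num_threads)
    workers = [ThrottledWorker(work_q, result_q, server, pool)
               for _ in range(num_threads)]
    for w in workers:
        w.start()
    work_q.join()                         # returns once every block has succeeded
    for w in workers:
        work_q.put(None)                  # wake the workers that are still waiting
    for w in workers:
        w.join()
    results = [result_q.get() for _ in range(len(blocks))]
    return sorted(results), pool.active
```

For example, `download_all(list(range(10)), 5, 2)` returns all ten results plus the number of workers that did not retire. Putting the block back before calling `task_done()` keeps the queue's unfinished count above zero, so `work_q.join()` cannot return while a failed block is still outstanding.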
Below is where I initialise the thread pool; further down I put items onto the queue.
    # create a thread pool and give them a queue
    for i in range(num_threads):
        t = Downloader(workQ, resultQ, file_name)
        t.setDaemon(True)
        t.start()
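The code that fills the queue is not shown here. As a sketch only, assuming each item is the `(block_num, url, start, length)` tuple that `run()` unpacks, the file could be split into blocks roughly like this (`make_blocks`, `total_size` and the example URL are hypothetical names for illustration):

```python
def make_blocks(url, total_size, num_blocks):
    """Split total_size bytes into num_blocks contiguous (block_num, url, start, length) items."""
    base, extra = divmod(total_size, num_blocks)
    blocks = []
    start = 0
    for block_num in range(num_blocks):
        # The first `extra` blocks take one additional byte each.
        length = base + (1 if block_num < extra else 0)
        if length > 0:                    # skip empty blocks for very small files
            blocks.append((block_num, url, start, length))
        start += length
    return blocks

blocks = make_blocks('http://example.com/file.bin', 10, 3)
```

Each tuple would then be handed to `workQ.put(...)` for the workers to pick up.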