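I'm using Perl's Thread::Queue module to keep a pool of threads busy downloading URLs for a simple crawler I'm working on. Using Thread::Queue, I enqueue a list of hash references (360, to be exact), where each hash contains information about a single URL: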
# Set up the thread queue
my $THREADS = 30;                    # Number of threads
my $url_q   = Thread::Queue->new();  # Work to do
my $url_arr = urls();
my $count   = 0;

for (@$url_arr) {
    print "ENQUEUEING $_->{'url'}\n";
    $url_q->enqueue($_);
    $count++;
}

print "COUNT $count\n";
print "QUEUE COUNT " . $url_q->pending() . "\n";

threads->create( sub {
    while (my $url_h = $url_q->dequeue()) {
        print "url: $url_h->{'url'}\n\n";
        print "PENDING: " . $url_q->pending() . "\n";
        process_url($url_h);
    }
}) for (1..$THREADS);

$url_q->end;

print "WAITING\n";
$_->join() for threads->list;
print "DONE WAITING\n";
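The problem is that I see all 360 URLs get enqueued, but the pending count only drops to about 260. That suggests only about 100 items are actually processed and the remaining 260 never are. Am I doing something wrong with Thread::Queue? Thanks!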
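
To narrow this down, here is a minimal, self-contained sketch of the same queue-and-pool pattern that counts how many items each worker finishes and how many fail. It rests on an assumption that nothing above confirms: that workers might be exiting early because process_url dies. `process_url_stub`, `run_pool`, the `id` field and the example.com URLs are all hypothetical stand-ins, not part of the real crawler.

```perl
use strict;
use warnings;
use threads;
use Thread::Queue;

# Hypothetical stand-in for process_url(): dies for every 7th id
# to simulate a download or parse failure.
sub process_url_stub {
    my ($url_h) = @_;
    die "simulated failure for $url_h->{url}\n" if $url_h->{id} % 7 == 0;
    return 1;
}

# Run $n_threads workers over the jobs and return (ok, failed) totals.
sub run_pool {
    my ($jobs, $n_threads) = @_;

    my $q = Thread::Queue->new();
    $q->enqueue(@$jobs);
    $q->end();    # no more work: dequeue() returns undef once the queue is empty

    my @workers = map {
        threads->create({ 'context' => 'list' }, sub {
            my ($ok, $failed) = (0, 0);
            while (defined(my $job = $q->dequeue())) {
                # eval keeps one bad item from terminating the whole worker
                if (eval { process_url_stub($job); 1 }) {
                    $ok++;
                }
                else {
                    $failed++;
                }
            }
            return ($ok, $failed);
        });
    } 1 .. $n_threads;

    # Collect the per-thread counts
    my ($ok, $failed) = (0, 0);
    for my $t (@workers) {
        my ($o, $f) = $t->join();
        $ok     += $o;
        $failed += $f;
    }
    return ($ok, $failed);
}

my @jobs = map { +{ id => $_, url => "http://example.com/$_" } } 1 .. 20;
my ($ok, $failed) = run_pool(\@jobs, 4);
print "processed=$ok failed=$failed\n";
```

If the two counts add up to the number of enqueued items here but not in the real script, the difference is more likely in process_url than in how Thread::Queue is used. That is only a way of checking, not a diagnosis.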