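Please help me make my script multithreaded. I have read the documentation for the threads::shared module, but it did not help me understand how to do it.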

use threads;
use threads::shared;
use LWP::UserAgent;
use HTTP::Cookies;

my $NUM_WORKERS = 2;

sub worker {
   my ($i) = @_;
   my ($web, $ck) = browser();
    ($username, $password) = split ':', $acc;
    my $url = 'http://www.site.ru/?tkn'. int(rand(10000));
    my $response = $web->post($url, Content =>
                    [//////]);
    while(1)
    {
        my $url = 'http://www.site.ru/dk?st.page='.$i.'&st.name=%D0%B0';
        my $response  = $web->get($url);
        @list = ($response->content =~ /card_wrp"><div class="photoWrapper"><div><a href="\/(.*?)\?/g);
        @popl = ($response->content =~/<\/div><div class="info">(.*?)<\/div>/g);

        for ($j = 0; $j <= scalar @list - 1; $j++)
        {
            $popl[$j] =~ s/&nbsp;//g;
            open F, ">>gr.txt";
            print F $list[$j].':'.$popl[$j]. "\n";
            close F;
        }
        print "[+] Page $i \n";

    }
}

my $i :shared = 1;
my $last = 79265;
my @workers;
for (1..$NUM_WORKERS) {
   push @workers, async {
      while (1) {
         my $i;
         {
            lock $I;
            return if $i == $last;
            $i = ++$I;
         }
         worker($i);
      }
   };
}

$_->join() for @workers;
sub browser 
{
 my $web = new LWP::UserAgent;
 my $ck = new HTTP::Cookies;
    $web->cookie_jar($ck);
    $web->agent('Opera/9.80 (Windows 7; U; en) Presto/2.9.168 Version/11.50');
    $web->requests_redirectable(0); 

    $web->timeout(5);
 return $web, $ck;
}
sub loadf {
    open (F, "<".$_[0]) or erroropen($_[0]); 
    chomp(my @data = <F>);
    close F;
    return @data;
}

I don't understand which variables I need to share. Many thanks to everyone who helps.

1 Answer

If you didn't have threads, the work loop would look like

for my $i (1..79265) {
   worker($i);
}

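The problem is that the loop variable isn't shareable and that for keeps an internal state that can't be shared, so we need to rewrite the loop as something that doesn't have these problems.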

Option 1:

my @a = 1..79265;
while (@a) {
   worker(shift(@a));
}

Option 2:

my $i = 0;
while (++$i <= 79265) {
   worker($i);
}

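All that is needed to parallelize either version is to make sure that @a / $i cannot change between the moment you check it and the moment you use it. That is done by adding a lock.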

Option 1:

my @a :shared = 1..79265;
while (1) {
   my $i;
   { lock @a; return if !@a; $i = shift(@a); }
   worker($i);
}

Option 2:

my $I :shared = 1;
while (1) {
   my $i;
   { lock $I; return if $I > 79265; $i = $I++; }
   worker($i);
}

Option 1 is the basis of the Thread::Queue solution that follows (although it can be used as is). Option 2 is used as is below.
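
To see what the lock buys you, here is a minimal, self-contained sketch of Option 2. It assumes a stub worker that only records its argument; the %seen hash, $LAST and the small page count are placeholders for illustration and are not part of the original script. Every page number should be handed out exactly once, whichever thread happens to take it.

```perl
use strict;
use warnings;
use threads;
use threads::shared;

my $NUM_WORKERS = 4;
my $LAST        = 1000;      # placeholder; the question uses 79265

my $I    :shared = 1;        # next page number to hand out
my %seen :shared;            # page number => how many times it was processed

# Stub standing in for the real download code.
sub worker {
   my ($i) = @_;
   lock %seen;
   $seen{$i}++;
}

my @workers;
for (1..$NUM_WORKERS) {
   push @workers, async {
      while (1) {
         my $i;
         {
            lock $I;                 # check and increment under one lock
            return if $I > $LAST;    # nothing left: end this thread
            $i = $I++;
         }                           # lock released here, before the slow work
         worker($i);
      }
   };
}

$_->join() for @workers;

my $pages = keys %seen;
my $dupes = grep { $_ != 1 } values %seen;
print "pages processed: $pages, processed more than once: $dupes\n";
```

The lock is held only while the counter is read and bumped, so the slow part (the download) still runs in parallel. On a Perl built with thread support this should report 1000 pages and no duplicates.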


I usually use Thread::Queue or Thread::Queue::Any.

use threads;
use Thread::Queue qw( );

my $NUM_WORKERS = 5;

sub worker {
   my ($i) = @_;
   # ... put your download code here ...
}

my $q = Thread::Queue->new();
my @workers;
for (1..$NUM_WORKERS) {
   push @workers, async {
      while (defined(my $i = $q->dequeue())) {
         worker($i);
      }
   };
}

$q->enqueue($_) for 1..79265;
$q->enqueue(undef) for @workers;
$_->join() for @workers;
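
If your Thread::Queue is version 3.01 or later, end() can replace the one-undef-per-worker trick: once the queue has been ended and drained, dequeue() returns undef to every waiting thread. A minimal sketch under that assumption, with a stub worker, a shared @done array and a small range standing in for the real job (all three are illustrative, not part of the original script):

```perl
use strict;
use warnings;
use threads;
use threads::shared;
use Thread::Queue 3.01;      # end() was added in 3.01

my $NUM_WORKERS = 3;
my @done :shared;            # page numbers the workers have handled

# Stub standing in for the real download code.
sub worker {
   my ($i) = @_;
   lock @done;
   push @done, $i;
}

my $q = Thread::Queue->new();
my @workers;
for (1..$NUM_WORKERS) {
   push @workers, async {
      # dequeue() blocks until an item arrives, and returns undef
      # once the queue has been ended and emptied.
      while (defined(my $i = $q->dequeue())) {
         worker($i);
      }
   };
}

$q->enqueue($_) for 1..100;
$q->end();                   # no more items will be added
$_->join() for @workers;

print "handled ", scalar(@done), " pages\n";
```

With end() the number of sentinels no longer has to match the number of workers, which matters if you change $NUM_WORKERS later.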

But we could just as easily do this with a shared counter:

use threads;
use threads::shared;

my $NUM_WORKERS = 5;

sub worker {
   my ($i) = @_;
   # ... put your download code here ...
}

my $I :shared = 1;
my $last = 79265;
my @workers;
for (1..$NUM_WORKERS) {
   push @workers, async {
      while (1) {
         my $i;
         {
            lock $I;
            return if $I > $last;
            $i = $I++;
         }
         worker($i);
      }
   };
}

$_->join() for @workers;
answered 2012-09-09T20:05:51.163