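I'm new to Perl, so please excuse my ignorance. (I'm using Windows 7.)

I borrowed echicken's threading example script and wanted to use it as the basis for a script that makes many system calls, but I have run into a problem that is beyond my understanding. To illustrate what I'm seeing, the example code below runs a simple ping command.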

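  • $nb_process is the number of threads allowed to run at the same time.
  • $nb_compute is the number of times we want to run the subroutine (i.e. the total number of times the ping command is issued).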

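When I set $nb_compute and $nb_process to the same value, it works perfectly.

However, when I reduce $nb_process (to limit the number of threads running at any one time), it seems to lock up as soon as $nb_process threads have been started.

If I remove the system call (the ping command), it works fine.

I see the same behaviour with other system calls (it isn't just ping).

Can anyone help, please? The script is below.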

#!/opt/local/bin/perl -w  
 use threads;  
 use strict;  
 use warnings;  

 my @a = ();  
 my @b = ();  


 sub sleeping_sub ( $ $ $ ); 

 print "Starting main program\n";  

 my $nb_process = 3;  
 my $nb_compute = 6;  
 my $i=0;  
 my @running = ();  
 my @Threads;  
 while (scalar @Threads < $nb_compute) {  

     @running = threads->list(threads::running);  
     print "LOOP $i\n";  
     print "  - BEGIN LOOP >> NB running threads = ".(scalar @running)."\n";  

     if (scalar @running < $nb_process) {  
         my $thread = threads->new( sub { sleeping_sub($i, \@a, \@b) });  
         push (@Threads, $thread);  
         my $tid = $thread->tid;  
         print "  - starting thread $tid\n";  
     }  
     @running = threads->list(threads::running);  
     print "  - AFTER STARTING >> NB running Threads = ".(scalar @running)."\n";  
     foreach my $thr (@Threads) {  
         if ($thr->is_running()) {  
             my $tid = $thr->tid;  
             print "  - Thread $tid running\n";  
         }  
         elsif ($thr->is_joinable()) {  
             my $tid = $thr->tid;  
             $thr->join;  
             print "  - Results for thread $tid:\n";  
             print "  - Thread $tid has been joined\n";  
         }  
     }  

     @running = threads->list(threads::running);  
     print "  - END LOOP >> NB Threads = ".(scalar @running)."\n";  
     $i++;  
 }  

 print "\nJOINING pending threads\n";  
 while (scalar @running != 0) {  
    foreach my $thr (@Threads) {  
         $thr->join if ($thr->is_joinable());  
     }  
     @running = threads->list(threads::running);  
}  
 print "NB started threads = ".(scalar @Threads)."\n";  
 print "End of main program\n";  


 sub sleeping_sub ( $ $ $ ) { 
    my @res2 = `ping 136.13.221.34`; 
    print "\n@res2";
    sleep(3);  
 } 

1 Answer

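The main problem with your program is that it has a busy loop that keeps testing whether a thread can be joined. That is wasteful. You could also reduce the number of global variables to make the code easier to follow.

Other eyebrow-raisers: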

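  • Never use prototypes unless you know exactly what they mean.
  • sleeping_sub does not use any of its arguments.
  • You use the threads::running list a lot without considering whether it is actually correct.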
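
To illustrate the first point, here is a minimal sketch of how a prototype can silently change what a call means. The names first_arg and @hosts are made up for this example and do not come from the script above.

```perl
use strict;
use warnings;

# Hypothetical helper: the ($) prototype imposes scalar context on its argument.
sub first_arg ($) {
    my ($x) = @_;
    return $x;
}

my @hosts = ('alpha', 'beta', 'gamma');

# Because of the prototype, @hosts is evaluated in scalar context,
# so the sub receives the element count, not the first element.
my $with_prototype = first_arg(@hosts);

# Calling with a leading & bypasses the prototype and passes the list as usual.
my $without_prototype = &first_arg(@hosts);

print "with prototype:    $with_prototype\n";
print "without prototype: $without_prototype\n";
```

The first call yields 3 and the second yields alpha, which is rarely what someone writing ( $ $ $ ) as a "parameter list" expects.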

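It seems you want to run only N workers at a time, but launch M workers in total. Here is a fairly elegant way to implement that. The main idea is to have a queue between the threads, onto which a thread that has just finished can enqueue its own thread ID. The main thread then joins that thread. To limit the number of threads, we use a semaphore: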

use threads; use strict; use warnings;
use feature 'say';  # "say" works like "print", but appends newline.
use Thread::Queue;
use Thread::Semaphore;

my @pieces_of_work = 1..6;
my $num_threads = 3;
my $finished_threads = Thread::Queue->new;
my $semaphore = Thread::Semaphore->new($num_threads);

for my $task (@pieces_of_work) {
  $semaphore->down;  # wait for permission to launch a thread

  say "Starting a new thread...";

  # create a new thread in scalar context
  threads->new({ scalar => 1 }, sub {
    my $result = worker($task);                # run actual task
    $finished_threads->enqueue(threads->tid);  # report as joinable "in a second"
    $semaphore->up;                            # allow another thread to be launched
    return $result;
  });

  # maybe join some threads
  while (defined( my $thr_id = $finished_threads->dequeue_nb )) {
    join_thread($thr_id);
  }
}

# wait for all threads to be finished, by "down"ing the semaphore:
$semaphore->down for 1..$num_threads;
# end the finished thread ID queue:
$finished_threads->enqueue(undef);

# join any threads that are left:
while (defined( my $thr_id = $finished_threads->dequeue )) {
  join_thread($thr_id);
}

With join_thread and worker defined as:

sub worker {
  my ($task) = @_;
  sleep rand 2; # sleep random amount of time
  return $task + rand; # return some number
}

sub join_thread {
  my ($tid) = @_;
  my $thr = threads->object($tid);
  my $result = $thr->join;
  say "Thread #$tid returned $result";
}

we might get this output:

Starting a new thread...
Starting a new thread...
Starting a new thread...
Starting a new thread...
Thread #3 returned 3.05652608754778
Starting a new thread...
Thread #1 returned 1.64777186731541
Thread #2 returned 2.18426146087901
Starting a new thread...
Thread #4 returned 4.59414651998983
Thread #6 returned 6.99852684265667
Thread #5 returned 5.2316971836585

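(The order and the return values are not deterministic.)

Using a queue makes it easy to tell which thread has finished. Semaphores make it easier to protect a resource or to limit the number of things running in parallel.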
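
As a rough sketch of the "protect a resource" use, a semaphore created with a single permit behaves like a mutex. The shared $counter and the name $guard are assumptions made for this illustration only.

```perl
use strict;
use warnings;
use threads;
use threads::shared;
use Thread::Semaphore;

my $counter :shared = 0;                    # the resource being protected
my $guard = Thread::Semaphore->new(1);      # one permit: at most one holder

my @incrementers = map {
    threads->create(sub {
        for (1 .. 1000) {
            $guard->down;                   # take the only permit
            $counter++;                     # read-modify-write, now exclusive
            $guard->up;                     # hand the permit back
        }
    });
} 1 .. 4;

$_->join for @incrementers;
print "counter = $counter\n";
```

With the permit held around every increment, the four threads should leave the counter at 4000. Without it, some updates could be lost.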

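The main benefit of this queue-and-semaphore pattern over the busy loop is that it uses far less CPU. That also shortens the overall execution time.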
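
The difference can be made visible with a small sketch that waits for one thread in both styles. The variable names and the one-second sleep are placeholders chosen for this example.

```perl
use strict;
use warnings;
use threads;
use Thread::Queue;

# Style 1: busy loop. Poll is_joinable and count how often we had to ask.
my $polled = threads->create(sub { sleep 1; return 'polled' });
my $polls = 0;
$polls++ until $polled->is_joinable;        # spins at full speed for about 1s
my $polled_result = $polled->join;

# Style 2: blocking. The thread announces its own ID on a queue,
# and the main thread sleeps inside dequeue until that happens.
my $finished = Thread::Queue->new;
threads->create({ scalar => 1 }, sub {
    sleep 1;
    $finished->enqueue(threads->tid);
    return 'blocked';
});
my $tid = $finished->dequeue;               # no spinning while waiting
my $blocked_result = threads->object($tid)->join;

print "busy loop asked $polls times before joining '$polled_result'\n";
print "blocking wait woke up once before joining '$blocked_result'\n";
```

The poll count is usually very large, and every one of those iterations is CPU time taken away from the worker threads.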

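While this is a big improvement, we can do better! Spawning threads is expensive: it is basically a fork() on Unix systems, but without any of the copy-on-write optimizations. The entire interpreter is copied, including all variables and all state you have already created.

Threads should therefore be used sparingly and spawned as early as possible. I have already introduced queues, which can pass values between threads. We can extend that idea so that a few worker threads constantly pull work from an input queue and return results via an output queue. The difficulty now is to have the last thread to exit finish the output queue.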

use threads; use strict; use warnings;
use feature 'say';
use Thread::Queue;
use Thread::Semaphore;

# define I/O queues
my $input_q  = Thread::Queue->new;
my $output_q = Thread::Queue->new;

# spawn the workers
my $num_threads = 3;
my $all_finished_s = Thread::Semaphore->new(1 - $num_threads); # a negative start value!
my @workers;
for (1 .. $num_threads) {
  push @workers, threads->new( { scalar => 1 }, sub {
    while (defined( my $task = $input_q->dequeue )) {
      my $result = worker($task);
      $output_q->enqueue([$task, $result]);
    }
    # we get here when the input queue is exhausted.
    $all_finished_s->up;
    # end the output queue if we are the last thread (the semaphore is > 0).
    if ($all_finished_s->down_nb) {
      $output_q->enqueue(undef);
    }
  });
}

# fill the input queue with tasks
my @pieces_of_work = 1 .. 6;
$input_q->enqueue($_) for @pieces_of_work;

# finish the input queue
$input_q->enqueue(undef) for 1 .. $num_threads;

# do something with the data
while (defined( my $result = $output_q->dequeue )) {
  my ($task, $answer) = @$result;
  say "Task $task produced $answer";
}

# join the workers:
$_->join for @workers;

With worker defined as before, we get:

Task 1 produced 1.15207098293783
Task 4 produced 4.31247785766295
Task 5 produced 5.96967474718984
Task 6 produced 6.2695013168678
Task 2 produced 2.02545636412421
Task 3 produced 3.22281619053999

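(The three threads are joined only after all output has been printed, so printing anything at that point would be boring.)

This second solution becomes simpler still if we detach the threads: the main thread will not exit before all the work is done, because it is blocked reading $output_q, which is only finished by the last worker thread.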
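
A minimal sketch of that detached variant follows. It assumes the same queue and semaphore setup as the second solution, and it replaces worker with a trivial placeholder computation (doubling the task number) so that the block is self-contained.

```perl
use strict;
use warnings;
use threads;
use Thread::Queue;
use Thread::Semaphore;

my $input_q  = Thread::Queue->new;
my $output_q = Thread::Queue->new;

my $num_threads    = 3;
my $all_finished_s = Thread::Semaphore->new(1 - $num_threads);

for (1 .. $num_threads) {
    threads->create(sub {
        while (defined(my $task = $input_q->dequeue)) {
            $output_q->enqueue([$task, $task * 2]);   # placeholder for worker($task)
        }
        $all_finished_s->up;
        # only the last worker to finish sees a positive count
        $output_q->enqueue(undef) if $all_finished_s->down_nb;
    })->detach;                                       # nothing to join later
}

$input_q->enqueue($_)    for 1 .. 6;
$input_q->enqueue(undef) for 1 .. $num_threads;      # one terminator per worker

# The main thread blocks here until the last worker ends the output queue.
my %answer_for;
while (defined(my $result = $output_q->dequeue)) {
    my ($task, $answer) = @$result;
    $answer_for{$task} = $answer;
}

print "Task $_ produced $answer_for{$_}\n" for sort { $a <=> $b } keys %answer_for;
```

The @workers array and the final join loop disappear. One caveat: a detached thread may still be shutting down when the main thread leaves, in which case Perl may print a "Perl exited with active threads" warning even though all results have been collected.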

answered 2013-08-16T13:04:40.443