
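I want to generate random numbers with the following steps:

  1. Read data from a file (<DATA>)
  2. Generate as many random numbers as there are input lines
  3. No random number should be generated twice; for example, if the number generated in iteration x has already been created earlier, generate a new one.

This is the code I have, and it ends up in an infinite loop. What is wrong with my logic, and how can I fix it?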

#!/usr/bin/perl -w
use strict;
my %chrsize = ('chr1' =>249250621);

# For example case I have created the
# repository where a value has been inserted.
my %done =("chr1    182881372" => 1);

while ( <DATA> ) {
 chomp;
 next if (/^\#/);

 my ($chr,$pos) = split(/\s+/,$_);
 # this number has been generated before
 # with this: int(rand($chrsize{$chr}));
 # hence have to create other than this one
 my $newst =182881372;

 my $newpos = $chr ."\t".$newst;


 # recreate random number
 for (0...10){
     if ( $done{$newpos} ) {

            # INFINITE LOOP
            $newst = int(rand($chrsize{$chr}));
            redo;
     }
 }

 $done{$newpos}=1;
print "$newpos\n";

}


__DATA__
# In reality there are 20M of such lines
# name  positions
chr1    157705682
chr1    19492676
chr1    169660680
chr1    226586538
chr1    182881372
chr1    11246753
chr1    69961084
chr1    180227256
chr1    141449512

3 Answers


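You have a few errors:

  1. You set $newst to the same value on every pass of the outer loop, and $newpos is built only once, before the retry loop, so $newpos never takes on a new value.
  2. Your inner for loop does not make sense, because you never actually change $newpos before checking the condition again.
  3. redo; acts on the inner for loop, not on the outer while loop.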
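
A minimal sketch of points 1 and 2, using made-up values: the hash key is a string built once, so changing $newst afterwards does not change $newpos, and the lookup in %done keeps seeing the old key until the key is rebuilt.

```perl
use strict;
use warnings;

my %done   = ("chr1\t182881372" => 1);
my $chr    = 'chr1';
my $newst  = 182881372;
my $newpos = $chr . "\t" . $newst;        # key is built once, here

$newst = 42;                              # stand-in for a fresh rand() result
my $stale_hit = $done{$newpos} ? 1 : 0;   # still looks up the old key

$newpos = $chr . "\t" . $newst;           # rebuild the key from the new number
my $fresh_hit = $done{$newpos} ? 1 : 0;

print "stale key seen: $stale_hit, rebuilt key seen: $fresh_hit\n";
```

This prints "stale key seen: 1, rebuilt key seen: 0", which is why the retry has to recompute both $newst and $newpos.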

Here is a corrected version that avoids redo altogether:

Update: I edited the algorithm slightly to make it simpler.

#!/usr/bin/perl -w
use strict;
my $chr1size = 249250621;

my %done;
my $newst;

while ( <DATA> ) {
    chomp;
    next if (/^\#/);
    my ($chr,$pos) = split(/\s+/,$_);

    my $newpos;
    #This will always run at least once
    do {
        $newst = int(rand($chr1size));
        $newpos = $chr ."\t".$newst;
    } while ( $done{$newpos} );

    $done{$newpos}=1;
    print "$newpos\n";
}
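
The same retry loop can be wrapped in a helper that looks up the size per chromosome, the way the question's %chrsize hash does. This is only a sketch: the name unique_pos and the 1000-call usage are made up for illustration.

```perl
use strict;
use warnings;

my %chrsize = ('chr1' => 249250621);
my %done;

# Hypothetical helper: returns an unused "chr\tposition" key and records it.
sub unique_pos {
    my ($chr) = @_;
    my $size = $chrsize{$chr} or die "Unknown chromosome: $chr";
    my $newpos;
    do {
        # Rebuild the whole key on every attempt, then test it.
        $newpos = $chr . "\t" . int(rand($size));
    } while ( $done{$newpos} );
    $done{$newpos} = 1;
    return $newpos;
}

my @picked = map { unique_pos('chr1') } 1 .. 1000;
print scalar(@picked), " positions, ", scalar(keys %done), " distinct\n";
```

Note that a loop like this never terminates once every position of a chromosome has been used, so it assumes far fewer lines than available positions.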

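Update 2: While the above algorithm works, it will get really slow on 20,000,000 lines. Here is an alternative approach that should be faster (there is a pattern to the random numbers it generates, but that is probably fine for most situations).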

#!/usr/bin/perl -w
use strict;
my $newst;

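# Make sure you have enough. This is good if you have < 100,000,000 lines.
# The list must hold exactly 10000 values (0..9999) to match the
# multiplier and modulus below; 0..10000 would allow duplicate results.
use List::Util qw/shuffle/;
my @rand_pieces = shuffle (0..9999);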

my $pos1   = 0;
my $offset = 1;
while ( <DATA> ) {
    chomp;
    next if (/^\#/);
    my ($chr,$pos) = split(/\s+/,$_);

    $newst = $rand_pieces[$pos1] * 10000 + $rand_pieces[($pos1+$offset)%10000];
    my $newpos = $chr ."\t".$newst;

    $pos1++;
    if ($pos1 > $#rand_pieces) 
    {
        $pos1 = 0;
        $offset = ($offset + 1) % 10000;
        if ($offset == 1) { die "Out of random numbers!"; } 
    }

    print "$newpos\n";
}
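
Why the combined numbers do not repeat: each value pairs two entries of one permutation and reads them as two base-N digits, and every offset pairs each index with a different partner. The sketch below checks this exhaustively, assuming N = 100 instead of 10000 purely to keep the check fast.

```perl
use strict;
use warnings;
use List::Util qw/shuffle/;

my $n = 100;                         # scaled down from 10000 for a quick check
my @rand_pieces = shuffle(0 .. $n - 1);

my %seen;
my $dupes = 0;
# Offsets 1 .. n-1 pair every index with every other index exactly once.
for my $offset (1 .. $n - 1) {
    for my $pos1 (0 .. $n - 1) {
        my $newst = $rand_pieces[$pos1] * $n
                  + $rand_pieces[($pos1 + $offset) % $n];
        $dupes++ if $seen{$newst}++;
    }
}

print scalar(keys %seen), " values, $dupes duplicates\n";
```

With N = 100 this yields 9900 values and 0 duplicates. The check only holds when the shuffled list has exactly N entries, matching the multiplier and the modulus.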
answered 2012-10-19T08:16:08.600


Add a counter to your loop, like this:

my $counter = 0;
# recreate
for (0...10){
  if ( $done{$newpos} ) {
    $newst = int(rand($chr1size));
    $newpos = $chr ."\t".$newst; # rebuild the key, or the test never changes
    redo if ++$counter < 100; # Safety counter
    # It will continue here if the above doesn't match and run out
    # eventually
  }
}
answered 2012-10-19T07:38:04.243


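To get rid of the infinite loop, replace redo with next.

http://www.tizag.com/perlT/perlwhile.php: "Redo will execute the same iteration over again."

Then you may need to fix the rest of your logic ;).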
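
A small sketch of the difference, counting how often each loop body runs. The cap on redo (stop after every fourth run) is an arbitrary choice made only so the demo terminates.

```perl
use strict;
use warnings;

my ($next_runs, $redo_runs) = (0, 0);

for my $i (0 .. 2) {
    $next_runs++;
    next;                 # moves on to the next value of $i
}

for my $i (0 .. 2) {
    $redo_runs++;
    # redo repeats the body for the same $i; the cap keeps this demo finite
    redo if $redo_runs % 4;
}

print "next: $next_runs body runs, redo: $redo_runs body runs\n";
```

This prints "next: 3 body runs, redo: 12 body runs": next advances the loop, while redo keeps repeating the same iteration until its condition turns false.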

answered 2012-10-19T07:57:58.867