regex - Perl: split one array into multiple arrays

Question

Can someone help me use the split function in Perl correctly?

This is my input list, named @input_lines:

google.com/test
yahoo.com/test
##############
somethingelse.com/test
##############
12345

my(@first_array,@second_array,@rand_no) = split(/^\#+/, @input_lines);

score 2 · Accepted Answer

I'll take a guess at what you really mean:

Originally, you probably had a text input file, input.txt, with the following content:

 google.com/test
 yahoo.com/test
 ##############
 somethingelse.com/test
 ##############
 12345

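Now you are trying to separate the records in the file, which are delimited by lines of 14 '#' characters. So you can read the file with ############## as the input record separator and be done: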

 ...
 my $fn = 'input.txt';             # set the file name
 open my $fh, '<', $fn or die $!;  # open the file
 $/="\n##############\n";          # set the input record separator
 my @parts = <$fh>;                # read the file record-wise
 chomp @parts;                     # remove the record separator from data
 close $fh;                        # close the file
 ...

Now the elements of @parts contain the following:

 $parts[0]
     google.com/test
     yahoo.com/test

 $parts[1]
     somethingelse.com/test

 $parts[2]
     12345

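If you need to handle # separators of varying length, you can do so in a very similar way by slurping the file in a single read operation and then splitting it at the # separators: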

 ...
 my $fn = 'input.txt';
 open my $fh, '<', $fn or die $!;
 undef $/;                           # remove the input record separator
 my @parts = split /\n#+\n/, <$fh>;  # read file as a block and split 
 close $fh;
 ...

The result is the same.

Regards
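Both snippets above assign to $/ directly, which changes it for the rest of the program. A minimal sketch of the first variant with the change confined to a block via local; the in-memory filehandle and the names $text and @records are assumptions for illustration, standing in for input.txt:

```perl
use strict;
use warnings;

# Hypothetical stand-in for the contents of input.txt.
my $text = join "\n",
    'google.com/test', 'yahoo.com/test', '##############',
    'somethingelse.com/test', '##############', '12345';
$text .= "\n";

my @records;
{
    # local restores the previous value of $/ when the block exits.
    local $/ = "\n##############\n";
    open my $fh, '<', \$text or die "Cannot open in-memory file: $!";
    @records = <$fh>;    # one element per record
    chomp @records;      # strips the separator where it is present
    close $fh;
}

print scalar(@records), " records\n";
print "separator restored\n" if $/ eq "\n";
```

Note that the last record is not followed by a separator line, so chomp leaves its ordinary trailing newline in place.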

rbo

score 1 · Accepted Answer

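split operates on a string, not on an array. Also, you cannot assign to several arrays in a single assignment: the list on the right-hand side is flattened, so the first array takes everything.

Update: This code works, but: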

my (@first, @second, @rand);

for my $array (\@first, \@second, \@rand) {
    my $line;
    do {
        push @$array, $line = shift @input_lines
    } until $line =~ /^#+/ or ! @input_lines;
    pop @$array if @input_lines;                 # Remove the separators
}
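The flattening described above is easy to see in isolation. A small sketch (the sub name split_at_separators and its variables are made up for illustration) that shows the first array swallowing the whole list, and one common way around it, returning array references, which also leaves @input_lines untouched:

```perl
use strict;
use warnings;

# List assignment flattens the right-hand side: @x takes everything.
my (@x, @y) = (1, 2, 3);
print scalar(@x), " ", scalar(@y), "\n";    # prints "3 0"

# Hypothetical helper: group lines into array refs, starting a new
# group at every line that consists only of '#' characters.
sub split_at_separators {
    my @groups = ([]);
    for my $line (@_) {
        if ($line =~ /^#+$/) {
            push @groups, [];              # separator: start the next group
        } else {
            push @{ $groups[-1] }, $line;  # ordinary line: add to current group
        }
    }
    return @groups;
}

my @input_lines = (
    'google.com/test', 'yahoo.com/test', '##############',
    'somethingelse.com/test', '##############', '12345',
);

# Array refs are scalars, so they can be assigned side by side.
my ($first, $second, $rand) = split_at_separators(@input_lines);
print "first: @$first\n";
print "second: @$second\n";
print "rand: @$rand\n";
```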

score 1 · Accepted Answer

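If the strings in @input_lines all have the same format, you can similarly join them all together and then split the result by section. Note that using /^#+/ with split is wrong in your case.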

my $line = join ',', @input_lines;
my ($first_part, $second_part, $third_part) = split /,?\#+,?/, $line;  # also consume the commas around each separator

my @first_array  = split ',', $first_part;
my @second_array = split ',', $second_part;
my @third_array  = split ',', $third_part;

score 1 · Accepted Answer

You could do something like this. $output is an array ref, and each of its elements represents one of your arrays.

use strict; use warnings;
use Data::Dumper;

my @input_lines = (
  'google.com/test',
  'yahoo.com/test',
  '##############',
  'somethingelse.com/test',
  '##############',
  '12345',
);

my $output = []; # array ref
my $rand_no;
my $i = 0;
foreach my $line (@input_lines) {
  if ($line =~ m/^#+$/) {
    # if it's the # we move to the next index
    $i++;
    next;
  } 
  elsif ($line =~ m/^\d+$/) {
    # this is the random number
    $rand_no = $line;
  } else {
    # everything else goes into the current index
    push @{ $output->[$i] }, $line;
  }
} 

print Dumper $output, $rand_no;

Output:

$VAR1 = [
          [
            'google.com/test',
            'yahoo.com/test'
          ],
          [
            'somethingelse.com/test'
          ]
        ];
$VAR2 = '12345';

score 1 · Accepted Answer

Assuming your input lines are in $string (otherwise use join "\n", @input_lines), you can use split like this:

my ($first, $second, $rand_no) = split /\n#+\n/m, $string;

print "`", $_, "`\n" for ($first, $second, $rand_no);
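The split above yields multi-line strings rather than arrays. If arrays are what you want, a second split on newlines gets you there. A minimal sketch, assuming the sample data from the question; note that here $first, $second and $rand_no hold array references, not strings:

```perl
use strict;
use warnings;

my @input_lines = (
    'google.com/test', 'yahoo.com/test', '##############',
    'somethingelse.com/test', '##############', '12345',
);

my $string = join "\n", @input_lines;

# First split at the separator lines, then split each chunk into lines.
my ($first, $second, $rand_no) =
    map { [ split /\n/ ] } split /\n#+\n/, $string;

print scalar(@$first), " lines in first: @$first\n";
print "second: @$second\n";
print "rand_no: $rand_no->[0]\n";
```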

score 1 · Accepted Answer

See the two scripts below - one of them should suit you...

Script:

my @input_lines = <main::DATA>;
my $input_string = join '', @input_lines;  # lines read from DATA already end in "\n"
my @split_lines = split(/\s*[#\n\r]+\s*/, $input_string);
print "$_\n" for @split_lines;

__DATA__
google.com/test 
yahoo.com/test 
############## 
somethingelse.com/test 
############## 
12345

Output:

google.com/test
yahoo.com/test
somethingelse.com/test
12345

View and test the code here.

Script:

 use Data::Dumper;

 my @input_lines = <main::DATA>;
 my $input_string = join '', @input_lines;  # lines read from DATA already end in "\n"
 my @blocks = split(/\s*#+\s*/, $input_string);
 my @matches = ();
 push @matches, [ split(/\s*[\n\r]+\s*/, $_) ] for @blocks;

 print Dumper(@matches);

 __DATA__
 google.com/test 
 yahoo.com/test 
 ############## 
 somethingelse.com/test 
 ############## 
 12345

Output:

 $VAR1 = [
           'google.com/test',
            'yahoo.com/test'
          ];
 $VAR2 = [
            'somethingelse.com/test'
         ];
 $VAR3 = [
           '12345'
         ];

View and test the code here.
