regex - Perl: convert newlines to commas between two occurrences of a repeated line

Question

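Below is some output: a single column with a repeating line, which could be used as part of a regex, split, etc.

I want to convert each group in the column to comma-separated format. Can someone help me with this?

Before: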

An instance of HostInfo
1=?  
2=?   
3=?    
4=?  
5=?
An instance of HostInfo
1=?
2=?
3=?
4=?
5=?

After:

1, 1=?, 2=?, 3=?, 4=?, 5=?

2, 1=?, 2=?, 3=?, 4=?, 5=?

score 3 · Accepted Answer

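It helps to remember that line processing in Perl is just one instance of record processing. You can set the record separator to whatever suits your data.

Assuming the file really does contain the string "An instance of HostInfo", you can do the following.

First, save the old record separator and set a new one: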

use English qw<$RS>;
my $old_rs = $RS;
local $RS = "An instance of HostInfo\n";

Then you can read the file in those chunks.

while ( <$input> ) { 
    chomp; # removes record separator
    next unless $_;
    ...
}

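Then you can split each record into lines and rejoin them with commas. So the ... placeholder is:

# say needs "use feature 'say';" (Perl 5.10 or later)
say join( ', ', split $old_rs );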
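
Putting the pieces of this answer together, a minimal self-contained sketch might look like the following. The sub name records_to_csv, the in-memory filehandle, the running counter, and the trailing-whitespace trim are assumptions added for illustration; they are not part of the original answer.

```perl
use strict;
use warnings;

# Hypothetical helper: takes the whole input as one string and returns one
# comma-separated line per "An instance of HostInfo" block.
sub records_to_csv {
    my ($text) = @_;
    my @out;
    my $n = 0;

    # Read the string through an in-memory filehandle so the sketch needs no file.
    open my $input, '<', \$text or die "Cannot open in-memory handle: $!";

    # The header line acts as the record separator, for this sub only.
    local $/ = "An instance of HostInfo\n";

    while ( my $record = <$input> ) {
        chomp $record;                  # removes the record separator, if present
        next unless length $record;     # skip the empty record before the first header
        my @fields = split /\n/, $record;
        s/\s+\z// for @fields;          # trim trailing spaces such as "1=?  "
        push @out, join ', ', ++$n, @fields;
    }
    close $input;
    return @out;
}

# Sample input modelled on the question (shortened to three fields).
my $sample = join '', map { "An instance of HostInfo\n1=?  \n2=?\n3=?\n" } 1 .. 2;
print "$_\n" for records_to_csv($sample);
```

Using local keeps the changed separator from leaking into any other code that reads files line by line.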

score 0 · Accepted Answer

Try this:

use strict; use warnings;

my ($count, $hash);

# magic diamond operator to read INPUT
while (<>) {
    # removes newlines
    chomp;
    # if the line contains /An instance/
    # incrementing $count and skipping this line
    do{ $count++; next } if /An instance/;
    # else add current line in a reference to an array
    push @{ $hash->{$count} }, $_;
}

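# iterating over "instances" in numeric order (a plain sort would put 10 before 2)
foreach my $first_level (sort { $a <=> $b } keys %$hash) {
    # print the counter, then this instance's lines joined with commas;
    # the parentheses keep "\n" out of the list passed to join
    print "$first_level, ", join(", ", @{ $hash->{$first_level} }), "\n";
}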

Usage:

perl script.pl < input_file.txt
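
As a variation on the same grouping idea, the sketch below collects the groups in an array of array references instead of a hash, so the instances stay in input order and no sorting is needed. The sub name group_by_header, the whitespace trimming, and the sample data are assumptions made for illustration, not part of the answer above.

```perl
use strict;
use warnings;

# Hypothetical helper: groups lines under each "An instance" header line and
# returns one comma-separated string per group, numbered in input order.
sub group_by_header {
    my (@input_lines) = @_;
    my @groups;    # one array ref per header seen

    for my $line (@input_lines) {
        $line =~ s/\s+\z//;             # strips the newline and any trailing spaces
        if ( $line =~ /An instance/ ) {
            push @groups, [];           # a header starts a new group
            next;
        }
        next unless @groups;            # ignore anything before the first header
        next unless length $line;       # ignore blank lines
        push @{ $groups[-1] }, $line;
    }

    my $n = 0;
    return map { join ', ', ++$n, @$_ } @groups;
}

# Sample lines modelled on the question (shortened to two fields per instance).
my @sample = (
    "An instance of HostInfo\n", "1=?  \n", "2=?\n",
    "An instance of HostInfo\n", "1=?\n",   "2=?\n",
);
print "$_\n" for group_by_header(@sample);
```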

score 0 · Accepted Answer

Would something like this work?

use strict;
use warnings;

undef $/;

my $output = <DATA>;

my @parts = split /An instance of HostInfo/m, $output;

my $ctr = 1;
for my $part (@parts) {
  my @lines = split "\n", $part;
  @lines = grep { /\S/ } @lines;  # keep non-blank lines (grep {$_} would also drop a line that is just "0")
  next unless @lines;
  s/^\s+//g for @lines;
  s/\s+$//g for @lines;
  print $ctr++, ', ', join(", ", @lines),"\n";
}

__DATA__
An instance of HostInfo
1=?  
2=?   
3=?    
4=?  
5=?
An instance of HostInfo
1=?
2=?
3=?
4=?
5=?

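This reads your sample output into a single string, splits it on "An instance of HostInfo", then loops over each segment, splitting it into lines, trimming them, and finally joining them back together.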
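
The steps just described can also be wrapped in a sub that takes the text as an argument, which makes them easier to reuse and check. This is only a sketch: the name split_instances is hypothetical, and it trims before filtering so that whitespace-only lines are dropped as well.

```perl
use strict;
use warnings;

# Hypothetical helper: slurp-and-split approach, taking the text as an argument.
sub split_instances {
    my ($text) = @_;
    my @rows;

    for my $part ( split /An instance of HostInfo/, $text ) {
        my @lines = split /\n/, $part;
        s/^\s+|\s+$//g for @lines;           # trim both ends of every line
        @lines = grep { length } @lines;     # drop lines that are now empty
        next unless @lines;                  # skip the empty segment before the first header
        push @rows, join ', ', scalar(@rows) + 1, @lines;
    }
    return @rows;
}

# Sample input modelled on the question (shortened to two fields per instance).
my $sample = "An instance of HostInfo\n1=?  \n2=?   \nAn instance of HostInfo\n1=?\n2=?\n";
print "$_\n" for split_instances($sample);
```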
