0

I have a text file with many entries like this:

[...]
Wind: 83,476,224
Solution: (category,runs)~
0.235,6.52312667,~
0.98962,14.33858333,~
sdasd,cccc,~
0.996052905,sdsd
EnterValues: 656,136,1
Speed: 48,32
State: 2,102,83,476,224
[...]

From the part above, I would like to extract:

Solution: (category,runs)~
0.235,6.52312667,~
0.98962,14.33858333,~
sdasd,cccc,~
0.996052905,sdsd

It would be simple if EnterValues: came after every Solution:, but unfortunately it doesn't. Sometimes it is Speed:, sometimes something different. I don't know how to construct the end of the regex (I assume it should be something like this: Solution:.*?(?<!~)\n).

My file uses \n as the newline delimiter.

4

3 Answers

1

As far as I can see, you first read the whole file into memory, which is not good practice. Try using the flip-flop operator:

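while ( <$fh> ) {
   # true from the "Solution:" line through the first line not ending in "~"
   if ( /Solution:/ ... !/~$/ ) {
      print;   # $_ still carries its newline, so do not append another
   }
}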

I can't test it right now, but I think it should work.
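A self-contained sketch of the same flip-flop idea, in case it helps to try it without a real file. It assumes the sample from the question is held in a string and read through an in-memory filehandle; the names `$sample` and `@block` are illustrative and not from the question.

```perl
use strict;
use warnings;

# Sample data from the question, held in a string for illustration.
my $sample = <<'END';
Wind: 83,476,224
Solution: (category,runs)~
0.235,6.52312667,~
0.98962,14.33858333,~
sdasd,cccc,~
0.996052905,sdsd
EnterValues: 656,136,1
Speed: 48,32
END

# Open the string as an in-memory filehandle and read it line by line.
open my $fh, '<', \$sample or die "Cannot open in-memory file: $!";

my @block;
while ( my $line = <$fh> ) {
    # True from the "Solution:" line through the first later line
    # that does not end in "~" (that line is included).
    if ( $line =~ /^Solution:/ ... $line !~ /~$/ ) {
        push @block, $line;
    }
}
close $fh;

print @block;
```

If the file contains several Solution: blocks, this sketch collects all of them into `@block` one after another; splitting them apart would need an extra check for the `Solution:` line.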

answered 2013-10-18T16:10:00.383
1

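What you need is a "record separator" with regex capabilities. Unfortunately, you cannot use $/ for this, because it cannot be a regex. You can, however, read the whole file into a single string and split that string with a regex: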

use strict;
use warnings;
use Data::Dumper;

my $str = do { 
    local $/;   # disable input record separator
    <DATA>;     # slurp the file
};
my @lines = split /^(?=\pL+:)/m, $str;  # lines begin with letters + colon
print Dumper \@lines;

__DATA__
Wind: 83,476,224
Solution: (category,runs)~
0.235,6.52312667,~
0.98962,14.33858333,~
sdasd,cccc,~
0.996052905,sdsd
EnterValues: 656,136,1
Speed: 48,32
State: 2,102,83,476,224

Output:

$VAR1 = [
          'Wind: 83,476,224
',
          'Solution: (category,runs)~
0.235,6.52312667,~
0.98962,14.33858333,~
sdasd,cccc,~
0.996052905,sdsd
',
          'EnterValues: 656,136,1
',
          'Speed: 48,32
',
          'State: 2,102,83,476,224
'
        ];

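I assume you will do some kind of post-processing on these variables, but I'll leave that to you. One way to go from here is to split the values on newlines.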
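As a rough sketch of that post-processing step, the Solution record can be split on newlines and each row broken into fields. This assumes the trailing "~" is only a continuation marker and that rows are comma-separated, which the question does not state explicitly; `$record` and `@parsed` are illustrative names.

```perl
use strict;
use warnings;

# One record as produced by the split above (copied from the sample output).
my $record = "Solution: (category,runs)~\n"
           . "0.235,6.52312667,~\n"
           . "0.98962,14.33858333,~\n"
           . "sdasd,cccc,~\n"
           . "0.996052905,sdsd\n";

# Split the record into lines; the first line is the header.
my ( $header, @rows ) = split /\n/, $record;

# Assumption: a trailing "~" (and the comma before it) is a continuation
# marker, so strip it before splitting each row on commas.
my @parsed;
for my $row (@rows) {
    $row =~ s/,?~$//;
    push @parsed, [ split /,/, $row ];
}

# Show the fields of each row, separated by " | ".
print join( ' | ', @$_ ), "\n" for @parsed;
```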

answered 2013-10-18T16:10:23.330
0

You can match from Solution: up to the next word followed by a colon:

my ($solution) = $text =~ /(Solution:.*?) \w+: /xs;
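Since the file has many such entries, a variant that collects every block may be useful. This is a sketch under the same assumption as above (a block ends at the next line that starts with a word followed by a colon); the two-block input and the `@solutions` name are made up for illustration.

```perl
use strict;
use warnings;

# Illustrative input with two Solution blocks followed by different keys.
my $text = <<'END';
Wind: 83,476,224
Solution: (category,runs)~
0.235,6.52312667,~
0.996052905,sdsd
EnterValues: 656,136,1
Solution: (category,runs)~
1.5,2.5
Speed: 48,32
END

# /m lets ^ match at every line start, /s lets . cross newlines,
# /g returns every block. The lookahead stops at the next "Word:" line
# (or at end of input) without consuming it.
my @solutions = $text =~ /^(Solution:.*?)(?=^\w+:|\z)/msg;

print "--- block ---\n$_" for @solutions;
```

Anchoring the terminator with `^` means a colon in the middle of a data line does not end the block early, although a data line that itself starts with `word:` still would.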
answered 2013-10-18T16:56:09.387