I have the following code:

#!/usr/bin/perl
use warnings;
use strict;

my $SourceStr='Foo - Name: Rob  Time: 11/2/2011 13:47:30  State: Prelim 3  Optional: Some stuff here';
#my $SourceStr='Foo - Name: Rob  Time: 11/2/2011 13:47:30  State: Prelim 3';

my $RegEx = qr/Name: (.+)  Time: (.+)  State: (.+)  Optional: (.+?)(  |$)/;

if ($SourceStr =~ m/$RegEx/) {
   print "1=[$1]\n";
   print "2=[$2]\n";
   print "3=[$3]\n";
   print "4=[$4]\n";
}

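With the first $SourceStr it works as expected. For the second one (commented out), however, nothing matches. Is there a way to have $4 filled with an empty string instead?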

Result for the first string:

1=[Rob]
2=[11/2/2011 13:47:30]
3=[Prelim 3]
4=[Some stuff here]

Result for the second string: no match

Wanted:

1=[Rob]
2=[11/2/2011 13:47:30]
3=[Prelim 3]
4=[]

5 Answers

You can use a more specific regular expression:

#!/usr/bin/perl
use warnings;
use strict;

my @SourceStrA=('Foo - Name: Rob  Time: 11/2/2011 13:47:30  State: 3  Optional: Some stuff here',
                'Foo - Name: Rob  Time: 11/2/2011 13:47:30  State: 3');

my $RegEx = qr!Name:\s*(\w+)\s*Time:\s*([\d/]*\s*[\d:]*)\s*State:\s*(\d+)\s*(?:Optional:\s*(.*))?!;

for my $SourceStr (@SourceStrA) {
  print "$SourceStr\n";
  if ($SourceStr =~ m/$RegEx/) {
    print "1=[$1]\n";
    print "2=[$2]\n";
    print "3=[$3]\n";
    print "4=[$4]\n" if defined $4; 
  }
}

Output:

Foo - Name: Rob  Time: 11/2/2011 13:47:30  State: 3  Optional: Some stuff here
1=[Rob]
2=[11/2/2011 13:47:30]
3=[3]
4=[Some stuff here]
Foo - Name: Rob  Time: 11/2/2011 13:47:30  State: 3
1=[Rob]
2=[11/2/2011 13:47:30]
3=[3]
answered 2013-01-04T22:19:04.800

Optional matches may be easier to handle with named captures than with numbered ones.

#!/usr/bin/env perl

use warnings;
use strict;

my @SourceStr = (
  'Foo - Name: Rob  Time: 11/2/2011 13:47:30  State: 3  Optional: Some stuff here',
  'Foo - Name: Rob  Time: 11/2/2011 13:47:30  State: 3',
);

my $RegEx = qr/Name: (?<name>.+?)  Time: (?<time>.+?)  State: (?<state>.+?)(?:  Optional: (?<optional>.+?))?(  |$)/;

foreach (@SourceStr) {
  print "Input '$_'\n";
  if ( /$RegEx/ ) {
     print "Name = '$+{name}'\n";
     print "Time = '$+{time}'\n";
     print "State = '$+{state}'\n";
     print "Optional = '$+{optional}'\n" if defined $+{optional};  # defined, so a value of "0" still prints
  }
  print "\n";
}

In fact, it becomes so simple that it is almost easier to just dump the %+ hash:

#!/usr/bin/env perl

use warnings;
use strict;

my @SourceStr = (
  'Foo - Name: Rob  Time: 11/2/2011 13:47:30  State: 3  Optional: Some stuff here',
  'Foo - Name: Rob  Time: 11/2/2011 13:47:30  State: 3',
);

my $RegEx = qr/Name: (?<name>.+?)  Time: (?<time>.+?)  State: (?<state>.+?)(?:  Optional: (?<optional>.+?))?(  |$)/;

use Data::Dumper;
foreach (@SourceStr) {
  print "Input '$_'\n";
  print Dumper \%+ if /$RegEx/;
}
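
To get the empty string the question asks for, the named captures can be copied into an ordinary hash and a default applied to the missing key. This is only a minimal sketch, assuming Perl 5.10 or later (for named captures and `//=`); the helper name `parse_record` is made up for illustration, and the trailing group is written as non-capturing.

```perl
#!/usr/bin/env perl
use warnings;
use strict;

my $RegEx = qr/Name: (?<name>.+?)  Time: (?<time>.+?)  State: (?<state>.+?)(?:  Optional: (?<optional>.+?))?(?:  |$)/;

# parse_record is a hypothetical helper, not part of the answer above.
sub parse_record {
    my ($line) = @_;
    return unless $line =~ $RegEx;
    my %rec = %+;            # copy now; %+ follows the last successful match
    $rec{optional} //= '';   # an absent Optional field becomes an empty string
    return \%rec;
}

for my $line (
    'Foo - Name: Rob  Time: 11/2/2011 13:47:30  State: 3  Optional: Some stuff here',
    'Foo - Name: Rob  Time: 11/2/2011 13:47:30  State: 3',
) {
    my $rec = parse_record($line) or next;
    print "Name = '$rec->{name}', Optional = '$rec->{optional}'\n";
}
```

Copying `%+` only picks up the names that actually captured something, which is why the default has to be applied afterwards.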
answered 2013-01-05T01:54:58.450

Here is an option that produces the result you want:

#!/usr/bin/perl
use warnings;
use strict;

my $SourceStr = 'Foo - Name: Rob  Time: 11/2/2011 13:47:30  State: 3  Optional: Some stuff here';

#my $SourceStr = 'Foo - Name: Rob  Time: 11/2/2011 13:47:30  State: 3';

my $RegEx = qr/Name: (.+)  Time: (.+)  State: (.+?)(?:\s+Optional: (.+))?$/;

if ( $SourceStr =~ $RegEx ) {
    print "1=[$1]\n";
    print "2=[$2]\n";
    print "3=[$3]\n";
    print '4=[' . ( $4 // '' ) . "]\n";
}
answered 2013-01-04T23:07:46.743

Maybe you should use a hash or something similar.

#!/usr/bin/perl
use warnings;
use strict;

#my $SourceStr='Foo - Name: Rob  Time: 11/2/2011 13:47:30  State: 3  Optional: Some stuff here';
my $SourceStr='Foo - Name: Rob  Time: 11/2/2011 13:47:30  State: 3';

my %Values;

while ($SourceStr =~ m/(\w+): (.+?)(?:  |$)/g) {
    $Values{$1} = $2;
}

if ($Values{Name} && $Values{Time} && $Values{State}) {
    print "1=$Values{Name}\n";
    print "2=$Values{Time}\n";
    print "3=$Values{State}\n";

    if (defined $Values{Optional}) {
        print "4=$Values{Optional}\n";
    } else {
        print "4=\n";
    }
}
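
The same key/value scan could also be wrapped in a function that starts from a table of defaults, so the Optional entry is always present. This is a rough sketch rather than a definitive version: `parse_fields` and `%Defaults` are hypothetical names, and it assumes, as the loop above does, that fields are separated by exactly two spaces and that values never contain two consecutive spaces.

```perl
#!/usr/bin/perl
use warnings;
use strict;

# %Defaults and parse_fields are hypothetical names used for illustration.
my %Defaults = ( Optional => '' );

sub parse_fields {
    my ($line) = @_;
    my %values = %Defaults;    # start from the defaults
    while ( $line =~ m/(\w+): (.+?)(?:  |$)/g ) {
        $values{$1} = $2;      # parsed fields override the defaults
    }
    # Reject the line if a required key is missing. This tests existence,
    # not truthiness, so a value such as "0" is still accepted.
    for my $key (qw(Name Time State)) {
        return unless exists $values{$key};
    }
    return \%values;
}

my $values = parse_fields('Foo - Name: Rob  Time: 11/2/2011 13:47:30  State: 3');
if ($values) {
    print "1=[$values->{Name}]\n";
    print "2=[$values->{Time}]\n";
    print "3=[$values->{State}]\n";
    print "4=[$values->{Optional}]\n";
}
```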
answered 2013-01-04T22:34:03.703

The request looks odd, but here is a solution:

#!/usr/bin/perl
use warnings;
use strict;

my $SourceStr='Foo - Name: Rob  Time: 11/2/2011 13:47:30  State: 3  Optional: Some stuff here';
#my $SourceStr='Foo - Name: Rob  Time: 11/2/2011 13:47:30  State: 3';

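# "(?:  Optional: |$)" makes the lazy State capture run up to the Optional label or the end of the string, so a multi-character state is not split between $3 and $4.
my $RegEx = qr/Name: (.+)  Time: (.+)  State: (.+?)(?:  Optional: |$)(.*)/;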

if ($SourceStr =~ m/$RegEx/) {
   print "1=[$1]\n";
   print "2=[$2]\n";
   print "3=[$3]\n";
   print "4=[$4]\n";
}

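The trick, of course, is to use the (?: ) syntax to add an extra group without shifting the position of $4. Using (?:  Optional: (.*))? instead would be more logical and robust, but it does not meet the requirement: $4 would be undefined when the Optional part is absent (and you need it to be an empty string), and the use warnings pragma would print a disturbing "Use of uninitialized value..." message.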
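
The difference between an undefined capture and an empty one can be checked with a small experiment. The sketch below is only illustrative: `warnings_from` is a hypothetical helper that collects warnings through `$SIG{__WARN__}`, the pattern is cut down to the State and Optional parts, and the `//` operator assumes Perl 5.10 or later.

```perl
#!/usr/bin/perl
use warnings;
use strict;

# Run a code block and return the warnings it emitted (hypothetical helper).
sub warnings_from {
    my ($code) = @_;
    my @caught;
    local $SIG{__WARN__} = sub { push @caught, @_ };
    $code->();
    return @caught;
}

my $SourceStr = 'Foo - Name: Rob  Time: 11/2/2011 13:47:30  State: 3';

# The "more logical" pattern: the whole Optional part, capture included, is optional.
my ( $state, $optional ) = $SourceStr =~ /State: (.+?)(?:  Optional: (.*))?$/;

# The group took no part in the match, so its capture is undef, not ''.
print "optional is ", ( defined $optional ? "defined" : "undef" ), "\n";

# Interpolating the undef value triggers the warning ...
my @w = warnings_from( sub { my $line = "4=[$optional]" } );
print "warnings when interpolating: ", scalar(@w), "\n";

# ... while defaulting it with // does not.
@w = warnings_from( sub { my $line = '4=[' . ( $optional // '' ) . ']' } );
print "warnings with default: ", scalar(@w), "\n";
```

Whether to shape the regex so that $4 is always defined or to default the value after the match is largely a matter of taste; the second keeps the pattern closer to the structure of the data.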

In any case, these requirements look more like an exercise than a real-life problem, don't they?

answered 2013-01-04T22:48:00.080