regex - Why doesn't this regex remove trailing whitespace from Pod::Usage text?

Question

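I am developing a module that relies on Pod::Usage to parse the calling script's POD and then send the usage, help, and man text to a scalar variable. I need to remove trailing whitespace from that text, so I used a simple regex that I thought would work. It does... but only intermittently.

Here is a demonstration of the problem. Any insight would be appreciated.

On my Solaris machine with Perl 5.10.1, the unexpected behavior (i.e., the regex failing to remove the trailing newlines) occurs consistently. On Windows with Perl 5.12.1, the behavior is erratic (output provided below).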

use strict;
use warnings;

use Pod::Usage qw(pod2usage);
use Test::More;

# Baseline test to show that the regex works.
my $exp                      = "foo\nbar\n...";
my $with_trailing_whitespace = $exp . "   \n\n";
$with_trailing_whitespace    =~ s!\s+\Z!!;
my $ords = get_ords_of_final_chars($with_trailing_whitespace);
is_deeply $ords, [46, 46, 46]; # String ends with 3 periods (not whitespace).

# Run a similar test, using text from Pod::Usage.
for (1 .. 2){
    my $pod = get_pod_text();
    $ords = get_ords_of_final_chars($pod);
    is_deeply $ords, [46, 46, 46];
}

done_testing();

sub get_ords_of_final_chars {
    # Takes a string. Return array ref of the ord() of last 3 characters.
    my $s = shift;
    return [ map ord(substr $s, - $_, 1), 1 .. 3 ];
}

sub get_pod_text {
    # Call pod2usage(), sending output to a scalar.
    open(my $fh, '>', \my $txt) or die $!;
    pod2usage(-verbose => 2, -exitval => 'NOEXIT', -output  => $fh);
    close $fh;   # This doesn't help.

    # Here's the same regex as above.
    # 
    # If I use chomp(), the newlines are consistently removed:
    #     1 while chomp($txt);
    $txt =~ s!\s+\Z!!;
    return $txt; 
}

__END__

=head1 NAME

sample - Some script...

=head1 SYNOPSIS

foo.pl ARGS...

=head1 DESCRIPTION

This program will read the given input file(s) and do something
useful with the contents thereof...

=cut

Output on my Windows box:

$ perl  demo.pl
ok 1
not ok 2
#   Failed test at demo.pl line 18.
#     Structures begin differing at:
#          $got->[0] = '10'
#     $expected->[0] = '46'
not ok 3
#   Failed test at demo.pl line 18.
#     Structures begin differing at:
#          $got->[0] = '10'
#     $expected->[0] = '46'
1..3
# Looks like you failed 2 tests of 3.

$ perl  demo.pl
ok 1
ok 2
ok 3
1..3

score 4 · Accepted Answer

Well, quoting perlre:

\Z Match only at end of string, or before newline at the end
\z Match only at end of string

So you should use $txt =~ s!\s+\z!!; (lowercase z).

That said, because \s+ is greedy, I would have expected it to work anyway. Maybe this is a Perl bug.
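
To illustrate the difference quoted from perlre, here is a minimal sketch on plain strings. The variable names are made up for illustration, and it does not reproduce the intermittent Pod::Usage failure from the question; it only shows what each anchor is documented to do.

```perl
use strict;
use warnings;

# \Z may match just before a final newline; \z matches only at the very end.
my $line      = "abc\n";
my $Z_matches = $line =~ /c\Z/ ? 1 : 0;   # 1: \Z tolerates one trailing newline
my $z_matches = $line =~ /c\z/ ? 1 : 0;   # 0: a newline still follows "c"

# Strip all trailing whitespace using the stricter anchor.
(my $stripped = "foo\nbar\n...   \n\n") =~ s/\s+\z//;

print "Z=$Z_matches z=$z_matches last three chars: '",
    substr($stripped, -3), "'\n";
```

With \z there is no "or before a newline" alternative for the regex engine to consider, so the substitution can only succeed by consuming whitespace all the way to the end of the string.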

score 1 · Accepted Answer

While the other posters are correct about \z, \Z, and $, I don't get any failures on win32:

$ perl -d:Modlist demo.pl
ok 1
ok 2
ok 3
1..3
Carp                   1.17
Config
Encode                 2.43
Encode::Alias          2.14
Encode::Config         2.05
Encode::Encoding       2.05
Exporter             5.64_01
Exporter::Heavy      5.64_01
File::Spec             3.33
File::Spec::Unix       3.33
File::Spec::Win32      3.33
PerlIO                 1.06
PerlIO::scalar         0.08
Pod::Escapes           1.04
Pod::InputObjects      1.31
Pod::Parser            1.37
Pod::Select            1.36
Pod::Simple            3.16
Pod::Simple::BlackBox   3.16
Pod::Simple::LinkSection   3.16
Pod::Text              3.15
Pod::Usage             1.36
Test::Builder          0.98
Test::Builder::Module   0.98
Test::More             0.98
XSLoader               0.15
base                   2.15
bytes                  1.04
integer                1.00
overload               1.10
vars                   1.01
warnings               1.09
warnings::register     1.01
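
If Devel::Modlist is not installed, a rough way to compare environments is to print the interpreter version and the versions of a few of the modules from the list above. This is only a sketch; the choice of modules is my own guess at which ones might be relevant, not a diagnosis.

```perl
use strict;
use warnings;

# Load a handful of the modules listed above and record their versions.
my @modules = qw(Pod::Usage Pod::Text PerlIO::scalar Test::More);
my %version;
for my $module (@modules) {
    (my $file = "$module.pm") =~ s{::}{/}g;   # Pod::Usage -> Pod/Usage.pm
    require $file;
    $version{$module} = $module->VERSION;
}

# $] holds the numeric version of the running perl.
printf "%-20s %s\n", 'perl', $];
printf "%-20s %s\n", $_, $version{$_} // 'unknown' for @modules;
```

Running this on both the failing and the working machine gives a quick side-by-side of the versions involved.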
