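For the string "aa\nbb\ncc", I want to match from the last character before the first newline ("a") through to the end of the multi-line string. I expected that
"aa\nbb\ncc" =~ qr/( . $ .+ )/xms
matches a\nbb\ncc
and that
"aa\nbb\ncc\n" =~ qr/( . $ .+ )/xms
matches a\nbb\ncc\n
.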
Instead, I get no match for "aa\nbb\ncc" =~ qr/( . $ .+ )/xms
and a match of c\n
for "aa\nbb\ncc\n" =~ qr/( . $ .+ )/xms
.
Using qr/( . $ ..+ )/xms
I get the expected results (see the example code).
Perl version: 5.14.2.
Can anyone explain this behaviour?
perldoc perlre:
m Treat string as multiple lines. That is, change "^" and "$"
from matching the start or end of the string to matching the start
or end of any line anywhere within the string.
s Treat string as single line. That is, change "." to match any character
whatsoever, even a newline, which normally it would not match.
Used together, as "/ms", they let the "." match any character whatsoever,
while still allowing "^" and "$" to match, respectively, just after and
just before newlines within the string.
\z Match only at end of string
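To make the quoted rules concrete, here is a minimal sketch that checks each modifier on its own, using short patterns that deliberately avoid the `$ .+` combination in question. The variable names are only illustrative and are not part of the original script.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

my $str = "aa\nbb\ncc";

# Without /m, "$" matches only at the end of the string
# (or just before a final trailing newline).
my $dollar_plain = ($str =~ /a$/)  ? 1 : 0;   # "a" is not at the string end
my $dollar_m     = ($str =~ /a$/m) ? 1 : 0;   # "a" ends the first line

# Without /s, "." does not match a newline.
my $dot_plain = ($str =~ /a.b/)  ? 1 : 0;
my $dot_s     = ($str =~ /a.b/s) ? 1 : 0;

# "\z" matches only at the absolute end; "$" also matches before a final newline.
my $end_dollar = ("cc\n" =~ /c$/)  ? 1 : 0;
my $end_z      = ("cc\n" =~ /c\z/) ? 1 : 0;

printf "a\$ without /m : %d, with /m : %d\n", $dollar_plain, $dollar_m;
printf "a.b without /s : %d, with /s : %d\n", $dot_plain, $dot_s;
printf "c\$ on \"cc\\n\" : %d, c\\z : %d\n", $end_dollar, $end_z;
```

Each pair should print 0 for the first value and 1 for the second, except the last line, where `$` matches (1) and `\z` does not (0).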
Running the following example code:
#!/usr/bin/env perl
use strict;
use warnings;
print "Multiline string : ", '"aa\nbb\ncc"', "\n\n";
my $str = "aa\nbb\ncc";
print_match($str, qr/( . $ )/xms); # matches "a"
print_match($str, qr/( . $ . )/xms); # matches "a\n"
print_match($str, qr/( . $ .. )/xms); # matches "a\nb"
print_match($str, qr/( . $ ..+ )/xms); # matches "a\nbb\ncc"
print_match($str, qr/( . $ .+ )/xms); # NO MATCH ! Why ???
print_match($str, qr/( . $ .+ \z )/xms); # NO MATCH ! Why ???
print "\nMultiline string now with terminating newline : ", '"aa\nbb\ncc\n"', "\n\n";
$str = "aa\nbb\ncc\n";
print_match($str, qr/( . $ )/xms); # matches "a"
print_match($str, qr/( . $ . )/xms); # matches "a\n"
print_match($str, qr/( . $ .. )/xms); # matches "a\nb"
print_match($str, qr/( . $ ..+ )/xms); # matches "a\nbb\ncc\n"
print_match($str, qr/( . $ .+ )/xms); # MATCHES "c\n" ! Why ???
print_match($str, qr/( . $ .+ \z)/xms); # MATCHES "c\n" ! Why ???
sub print_match {
    my ($str, $regex) = @_;
    # Test the match result itself: $1 alone is false for a capture such as "0".
    if ( $str =~ $regex ) {
        printf "--> %-20s matched : >%s< \n", $regex, $1;
    }
    else {
        printf "--> %-20s : no match !\n", $regex;
    }
}
The output is:
Multiline string : "aa\nbb\ncc"
--> (?^msx:( . $ )) matched : >a<
--> (?^msx:( . $ . )) matched : >a
<
--> (?^msx:( . $ .. )) matched : >a
b<
--> (?^msx:( . $ ..+ )) matched : >a
bb
cc<
--> (?^msx:( . $ .+ )) : no match !
Multiline string now with terminating newline : "aa\nbb\ncc\n"
--> (?^msx:( . $ )) matched : >a<
--> (?^msx:( . $ . )) matched : >a
<
--> (?^msx:( . $ .. )) matched : >a
b<
--> (?^msx:( . $ ..+ )) matched : >a
bb
cc
<
--> (?^msx:( . $ .+ )) matched : >c
<
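As a cross-check, the sketch below writes the line end as an explicit `\n` instead of `$`, so the result no longer depends on how `$` is handled in front of `.+`. It assumes the intended match is "last character of the first line through to the end of the string", and it does not explain the behaviour above. The helper `first_capture` is hypothetical and not part of the original script.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical helper: returns the first capture group,
# or undef when the pattern does not match at all.
sub first_capture {
    my ($str, $regex) = @_;
    return $str =~ $regex ? $1 : undef;
}

# Same shape as qr/( . $ .+ )/xms, but with the line end spelled out as "\n".
my $explicit_newline = qr/( . \n .+ )/xms;

for my $str ("aa\nbb\ncc", "aa\nbb\ncc\n") {
    my $match = first_capture($str, $explicit_newline);
    # Show newlines as a literal backslash-n so each result fits on one line.
    (my $shown = defined $match ? $match : '(no match)') =~ s/\n/\\n/g;
    print "$shown\n";
}
# prints:
# a\nbb\ncc
# a\nbb\ncc\n
```

The difference from the original pattern is that `\n` consumes the newline itself, whereas `$` is a zero-width assertion and leaves the newline for the following `.` to match.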