My code looks like this:

s/(["\'])(?:\\?+.)*?\1/(my $x = $&) =~ s|^(["\'])(.*src=)([\'"])\/|$1$2$3$1.\\$baseUrl.$1\/|g;$x/ge

Ignoring the last part (and keeping only the piece that causes the problem), the code becomes:

s/(["\'])(?:\\?+.)*?\1/replace-text-here/g

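I have tried both, but I keep hitting the same problem: even with the g modifier, this regex matches and replaces only the first occurrence. I don't know whether this is a Perl bug, but the regex I'm using matches everything between two quotes and also handles escaped quotes; I'm following this blog post. As I understand it, the regex should match everything between two quotes, replace it, and then, because of the g modifier, look for another instance of the pattern.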

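For some background: I'm not using a use VERSION declaration, strict and warnings are turned on, and no warnings show up. My script reads the entire file into a scalar (newlines included), and the regex then runs directly on that scalar. It does seem to work on each line individually, just not multiple times on one line. Perl version 5.14.2, running on Cygwin 64-bit. It could be that Cygwin (or the Perl port) is messing things up, but I doubt it.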

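I have also tried another example from that blog post, with the atomic groups and possessive quantifiers replaced by equivalent code that doesn't use those features, but the problem still plagues me.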

Example:

<?php echo ($watched_dir->getExistsFlag())?"":"<span class='ui-icon-alert'><img src='/css/images/warning-icon.png'></span>"?>
Should become (with the shortened regex):
<?php echo ($watched_dir->getExistsFlag())?replace-text-here:replace-text-here?>
Yet it only becomes:
<?php echo ($watched_dir->getExistsFlag())?replace-text-here:"<span class='ui-icon-alert'><img src='/css/images/warning-icon.png'></span>"?>

<?php echo ($sub->getTarget() != "")?"target=\"".$sub->getTarget()."\"":""; ?>
Should become:
<?php echo ($sub->getTarget() != replace-text-here)?replace-text-here.$sub->getTarget().replace-text-here:replace-text-here; ?>
And as above, only the first occurrence is changed.

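(Yes, I realize this invites some kind of "don't use regex to parse HTML/PHP" response. But in this case I think regex is the more appropriate tool, because I'm not looking for context; I'm looking for a string (anything within quotes) and performing an operation on that string, which is what regex is for.)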

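Just a note: these regexes are run inside an eval function, and the actual regex is encoded in a single-quoted string (which is why the single quotes are escaped). I will try any proposed solution directly, though, to rule out my own bad programming.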

EDIT: As requested, a short script that demonstrates the problem:

#!/usr/bin/perl -w

use strict;

my $data = "this is the first line, where nothing much happens
but on the second line \"we suddenly have some double quotes\"
and on the third line there are 'single quotes'
but the fourth line has \"double quotes\" AND 'single quotes', but also another \"double quote\"
the fifth line has the interesting one - \"double quoted string 'with embedded singles' AND \\\"escaped doubles\\\"\"
and the sixth is just to say - we need a new line at the end to simulate a properly structured file
";
my $regex = 's/(["\'])(?:\\?+.)*?\1/replaced!/g';
my $regex2 = 's/([\'"]).*?\1/replaced2!/g';

print $data."\n";
$_ = $data; # to make the regex operate on $_, as per the original script
eval($regex);
print $_."\n";
$_ = $data;
eval($regex2);
print $_; # just an example of an eval, but without the fancy possessive quantifiers

This produces the following output for me:

this is the first line, where nothing much happens
but on the second line "we suddenly have some double quotes"
and on the third line there are 'single quotes'
but the fourth line has "double quotes" AND 'single quotes', but also another "double quote"
the fifth line has the interesting one - "double quoted string 'with embedded singles' AND \"escaped doubles\""
and the sixth is just to say - we need a new line at the end to simulate a properly structured file

this is the first line, where nothing much happens
but on the second line "we suddenly have some double quotes"
and on the third line there are 'single quotes'
but the fourth line has "double quotes" AND 'single quotes', but also another "double quote"
the fifth line has the interesting one - "double quoted string 'with embedded singles' AND \"escaped doubles\replaced!
and the sixth is just to say - we need a new line at the end to simulate a properly structured file

this is the first line, where nothing much happens
but on the second line replaced2!
and on the third line there are replaced2!
but the fourth line has replaced2! AND replaced2!, but also another replaced2!
the fifth line has the interesting one - replaced2!escaped doubles\replaced2!
and the sixth is just to say - we need a new line at the end to simulate a properly structured file

2 Answers


Update: this:

my $regex = 's/(["\'])(?:\\?+.)*?\1/replaced!/g';

should be:

my $regex = 's/(["\'])(?:\\\\?+.)*?\1/replaced!/g';

because the single quotes in the assignment turn \\ into \, and you want the regex to end up with \\.
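One way to see this is to print both string literals and compare what Perl actually stores. This is only a small sketch; the variable names $as_posted and $corrected are made up for the illustration.

```perl
use strict;
use warnings;

# In a single-quoted Perl string, \\ collapses to \ and \' to ';
# any other backslash sequence (such as \1) is kept as written.
my $as_posted = 's/(["\'])(?:\\?+.)*?\1/replaced!/g';
my $corrected = 's/(["\'])(?:\\\\?+.)*?\1/replaced!/g';

# Interpolating the variables does not reinterpret their contents.
print "as posted: $as_posted\n";
print "corrected: $corrected\n";
```

The first line printed contains \?+ (a literal question mark), while the second contains \\?+ (an optional literal backslash), which is what the pattern was meant to say.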

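Please boil your problem down to a short script that demonstrates it (including the input, the wrong output, the eval and all). Taking what you've shown and trying it: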

use strict;
use warnings;
my $input = <<'END';
<?php echo ($watched_dir->getExistsFlag())?"":"<span class='ui-icon-alert'><img src='/css/images/warning-icon.png'></span>"?>
END

(my $output = $input) =~ s/(["\'])(?:\\?+.)*?\1/replace-text-here/g;
print $input,"becomes\n",$output;

produces for me:

<?php echo ($watched_dir->getExistsFlag())?"":"<span class='ui-icon-alert'><img src='/css/images/warning-icon.png'></span>"?>
becomes
<?php echo ($watched_dir->getExistsFlag())?replace-text-here:replace-text-here?>

as I would expect. What does it do for you?

answered 2012-11-04T18:05:25.570

Even inside single quotes, \\ gets processed as \, so this:

my $regex = 's/(["\'])(?:\\?+.)*?\1/replaced!/g';

sets $regex to:

s/(["'])(?:\?+.)*?\1/replaced!/g

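which requires every character in the quoted string to be preceded by one or more literal question marks (\?+). Since you don't have many question marks, this effectively means you're requiring the string to be empty, either "" or ''.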
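To make the effect concrete, here is a small sketch that runs both patterns directly, with no eval and no string quoting in between. The sample line and the variable names are invented for the illustration.

```perl
use strict;
use warnings;

my $line = q{say "" or "hello" or 'x'};

# Pattern the posted string actually contains: each character inside
# the quotes must be preceded by one or more literal question marks.
(my $broken = $line) =~ s/(["'])(?:\?+.)*?\1/R/g;

# Intended pattern: an optional backslash (possessive) before each character.
(my $intended = $line) =~ s/(["'])(?:\\?+.)*?\1/R/g;

print "$broken\n";    # say R or "hello" or 'x'
print "$intended\n";  # say R or R or R
```

Only the empty "" is replaced by the first substitution, which matches the symptom in the question: the g modifier is working, but there is nothing else for that pattern to match.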

The minimal fix is to add more backslashes:

my $regex = 's/(["\'])(?:\\\\?+.)*?\\1/replaced!/g';

But you really might want to rethink your approach. Do you really need to save the whole regex-replacement command as a string and run it via eval?
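As one possible alternative, and only a sketch under the assumption that the pattern can be known when the script is written, the regex could be compiled with qr// and used in an ordinary s///. The names $quoted, $text and $out are hypothetical.

```perl
use strict;
use warnings;

# Compile the pattern once with qr//. There is no string-quoting layer here,
# so each backslash means exactly what it says in the regex.
my $quoted = qr/(["'])(?:\\?+.)*?\1/;

# Sample input containing plain, single-quoted and escaped-quote strings.
my $text = qq{a "one" b 'two' c "with \\"escaped\\" quotes"\n};

(my $out = $text) =~ s/$quoted/replaced!/g;
print $out;    # a replaced! b replaced! c replaced!
```

Note that \1 counts groups in the final, combined pattern, so this works as written because $quoted is the entire pattern; embedding it after other capture groups would shift the numbering.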

answered 2012-11-04T18:30:01.300