
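I am having trouble quoting the subpattern placeholder "$1" correctly when I pass it to the substitution operator "s///" in a variable. Can someone shed some light on this and tell me what I am doing wrong?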

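I am exporting a set of MS Word documents to HTML files. This more or less works, except that the files contain many cross-references, and these need to be fixed so that they keep working. The exported references have the form 'href="../../somefilename.docx"' and need to be changed to 'href="somefilename.htm"' so that they point to the exported HTML files instead of the original Word files.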

A sample file test.htm might look like this:

<html>
<body>
<a href="../../filename1.docx" />
<a href="../../filename2.docx" />
<a href="../../filename3.docx" />
<a href="../../filename4.docx" />
</body>
</html>

Running the program on it should then produce:

<html>
<body>
<a href="filename1.htm" />
<a href="filename2.htm" />
<a href="filename3.htm" />
<a href="filename4.htm" />
</body>
</html>

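I wrote a small Perl program, "ReplaceURLs", to do this job. It works fine if I hard-code the pattern and the replacement expression (that is, if I put them directly into the s/.../.../g statement); see variant 1. To make it more flexible, though, I would like to allow these expressions to be passed as arguments (that is, s/$pattern/$subst/g), and I cannot get that to work. I can pass the pattern in a variable (see variant 2), but not a replacement value that contains the subpattern reference $1. In variant 3, for some reason, the $1 in the replacement value is not recognized as a subpattern marker but is treated as a literal "$1".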

#!/usr/bin/perl

$debug = TRUE;
$tgtfilename = $ARGV[0] || die("usage: ReplaceURLs.pl <filename> <url-pattern> <url-substvalue>");
$urlpattern  = $ARGV[1] || "href=\"\.\./\.\./(.*)\.docx\"";  # href="../../(filename).docx';
$urlsubstval = $ARGV[2] || "href=\"\$1.htm\"";  # href="$1.htm" --> href="(filename).htm";

print "replacing all occurrences of pattern '$urlpattern' in file '$tgtfilename' with '$urlsubstval':\n";

# open & read $tgtfilename
open($ifh, '<', $tgtfilename) || die "unable to open $tgtfilename for reading: $!";
@slurp = <$ifh>; 
$oldstring = "@slurp";
close($ifh)  || die "can't close file $tgtfilename: $!";
if ($debug) { print $oldstring,"\n"; }

# look for $urlpattern and replace it with $urlsubstval:

# variant 1: works
#($newstring = $oldstring) =~ s!href=\"\.\./\.\./(.*)\.docx\"!href=\"$1.htm\"!g;

# variant 2: works
#($newstring = $oldstring) =~ s!$urlpattern!href=\"$1.htm\"!g; 

# variant 3: does not work - why?
($newstring = $oldstring) =~ s/$urlpattern/$urlsubstval/g; 

# save file
#open($ofh, '>', $tgtfilename) || die "unable to re-open $tgtfilename for writing";
#print $ofh $newstring,"\n";
#close($ofh) || die "can't close file $tgtfilename: $!";

# done
if ($debug) { print "result of replacement:","\n", $newstring,"\n"; } else { print "done."; }
__END__

If I run it with "perl ReplaceURLs.pl test.htm", I always get:

<html>
 <body>
 <a href="$1.htm" />
 <a href="$1.htm" />
 <a href="$1.htm" />
 <a href="$1.htm" />
 </body>
 </html>

instead of the desired result. How do I need to quote or escape the "$1" in $urlsubstval to make this work?

M.


2 Answers


From perlop:

Options are as with m// with the addition of the following replacement specific options:

     e   Evaluate the right side as an expression.
     ee  Evaluate the right side as a string then eval the result.
     r   Return substitution and leave the original string untouched.

So, rather obscurely,

$ ls -1 | perl -pE '$str = q{"--$1--"}; s/(hah)/$str/ee;'
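With /ee, the replacement string is itself evaluated as Perl code, so it has to be a valid Perl expression. A replacement such as href="$1.htm" is not one until it is wrapped in quotes. The following is a minimal sketch of one way to do that, assuming the pattern and replacement from the question; the variable names ($quoted, $expr, $fixed) are made up for illustration. Note that /ee will still interpolate any other variable named in the replacement, so it should only be used with trusted input.

```perl
use strict;
use warnings;

# Pattern and replacement as they might arrive from the command line.
my $pattern = 'href="\.\./\.\./(.*)\.docx"';
my $replace = 'href="$1.htm"';

# Escape backslashes, double quotes and @ so the template can be wrapped
# in "..." and still be a valid double-quoted Perl string. The $ is left
# alone on purpose so that $1 is interpolated by the second evaluation.
(my $quoted = $replace) =~ s/([\\"\@])/\\$1/g;
my $expr = qq{"$quoted"};    # Perl source text: "href=\"$1.htm\""

my $html = qq{<a href="../../filename1.docx" />\n}
         . qq{<a href="../../filename2.docx" />\n};

# First e: evaluate $expr, yielding the source text above.
# Second e: evaluate that source text, interpolating $1.
(my $fixed = $html) =~ s/$pattern/$expr/eeg;

print $fixed;
```

The extra quoting step is what the one-liner above does by hand with q{"--$1--"}: the inner double quotes are part of the string that the second evaluation sees.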
answered 2012-11-13T10:05:01.510

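bobbogo's solution only works if $str contains nothing that interferes with Perl syntax. But my replacement happens to look like a Perl assignment, namely 'href="$1.htm"'. This produces the warning 'Unquoted string "href" may clash with future reserved word ...' and the error 'Use of uninitialized value in substitution iterator at ...', and the substitution then fails.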

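So the solution that finally worked for me is to build the whole substitution command as a string, with the pattern and replacement pasted into it, and then to eval(...) the constructed command: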

#!/usr/bin/perl

$debug = 1;
$tgtfilename = $ARGV[0] || die("usage: ReplaceURLs.pl <filename> [ <url-pattern> [ <url-substvalue> ] ]");
$urlpattern  = $ARGV[1] || 'href="\.\./\.\./(.*)\.docx"';  # href="../../<filename>.docx" in regexp format
$urlreplace  = $ARGV[2] || 'href="$1.htm"';  # href="$1.htm" --> href="<filename>.htm"

print "replacing all occurrences of pattern '$urlpattern' in file '$tgtfilename' with '$urlreplace':\n";

# open & read $tgtfilename
open($ifh, '<', $tgtfilename) || die "unable to open $tgtfilename for reading: $!";
@slurp = <$ifh>;
$oldstring = join('', @slurp);  # "@slurp" would insert a space between the lines
close($ifh)  || die "can't close file $tgtfilename: $!";
if ($debug) { print $oldstring,"\n"; }

# construct command to look for $urlpattern and replace it with $urlreplace:
$newstring = $oldstring;
$cmd = '$newstring =~ s!'.$urlpattern.'!'.$urlreplace.'!g';
# execute it:
if ($debug) { print "cmd=", $cmd, "\n"; }
eval($cmd);
die "substitution failed: $@" if $@;  # report syntax errors in the constructed command

# done
if ($debug) { 
    print "result of replacement:","\n", $newstring,"\n"; 
} else { 
    # save to file:
    open($ofh, '>', $tgtfilename) || die "unable to re-open $tgtfilename for writing";
    print $ofh $newstring,"\n";
    close($ofh) || die "can't close file $tgtfilename: $!";
    print "done."; 
}
__END__
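Because eval runs whatever ends up in the constructed command, a pattern or replacement containing "!" or arbitrary Perl code would break it or be executed. As a possible alternative, here is a minimal sketch that avoids eval altogether by expanding $1, $2, ... in the replacement template by hand. The helper name replace_all and its behaviour (only numbered placeholders $1 and up are supported, unmatched groups expand to the empty string) are assumptions made for this example, not part of the script above.

```perl
use strict;
use warnings;

# Hypothetical helper: replace every match of $pattern in $text with
# $template, expanding $1, $2, ... to the captured groups. No eval.
sub replace_all {
    my ($text, $pattern, $template) = @_;
    my $re  = qr/$pattern/;
    my $out = '';
    my $pos = 0;

    while ($text =~ /$re/g) {
        # Copy the match offsets before running any other regex.
        my @start = @-;
        my @end   = @+;

        # $groups[0] is the whole match, $groups[1] is $1, and so on.
        my @groups = map {
            defined $start[$_]
                ? substr($text, $start[$_], $end[$_] - $start[$_])
                : ''
        } 0 .. $#end;

        # Text between the previous match and this one is kept as is.
        $out .= substr($text, $pos, $start[0] - $pos);

        # Expand the numbered placeholders in a copy of the template.
        (my $piece = $template) =~
            s/\$([1-9]\d*)/ defined $groups[$1] ? $groups[$1] : '' /ge;
        $out .= $piece;

        $pos = $end[0];
    }

    # Keep whatever follows the last match.
    return $out . substr($text, $pos);
}

my $html = qq{<a href="../../filename1.docx" />\n}
         . qq{<a href="../../filename2.docx" />\n};

my $result = replace_all($html, 'href="\.\./\.\./(.*)\.docx"', 'href="$1.htm"');
print $result;
```

The pattern is still compiled from user input, so it can match anything the caller asks for, but the replacement is treated purely as data.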
answered 2012-11-13T23:47:23.067