I have a string (for testing only), and I want to replace every <p> instance under the div <div id="text">. How can I do that?

I tested with the m and s modifiers, but to no avail (only the first one is replaced). My Perl code is below:

#!/usr/bin/perl
use strict;
use warnings;

my $string = <<STRING;
<div id="main">
    hellohello
    <div id="text">
        nokay.
        <p>This is p1, SHUD B replaced</p>
        Alright
        <p>This is p2, SHUD B replaced</p>
        Yes 2
        <p>this is P3, SHUD B replaced</p>
        Okay done
        bye
    </div>
    bye
    <p>this is not under the div whose id is text and SHUDN'T b replaced</p>
</div>

STRING

my $str_bak = $string;
print "String is : \n$string\n\n";

$string =~ s/(<div id="text">.*?)<p>(.*)(<\/p>.*?<\/div>)/$1<p style="text-align:left;">$2 $3/sig;

print "String now is : \n$string\n\n";

4 Answers

Using XML::XSH2:

open :F html 1.html ;
for //div[@id="text"]/p
    set @style "text-align:left;" ;
save :b ;
answered 2012-05-28T09:35:57.627

Try this:

(?is)<p>.+?</p>(?=.*?</div>)

Code

$subject =~ s!(?is)<p>.+?</p>(?=.*?</div>)!!g;

Explanation

"
(?is)        # Match the remainder of the regex with the options: case insensitive (i); dot matches newline (s)
<p>          # Match the characters “<p>” literally
.            # Match any single character
   +?           # Between one and unlimited times, as few times as possible, expanding as needed (lazy)
</p>         # Match the characters “</p>” literally
(?=          # Assert that the regex below can be matched, starting at this position (positive lookahead)
   .            # Match any single character
      *?           # Between zero and unlimited times, as few times as possible, expanding as needed (lazy)
   </div>       # Match the characters “</div>” literally
)
"

Update

Change your code as follows:

#!/usr/bin/perl
use strict;
use warnings;

my $string = <<STRING;
<div id="main">
    hellohello
    <div id="text">
        nokay.
        <p>This is p1, SHUD B replaced</p>
        Alright
        <p>This is p2, SHUD B replaced</p>
        Yes 2
        <p>this is P3, SHUD B replaced</p>
        Okay done
        bye
    </div>
    bye
    <p>this is not under the div whose id is text and SHUDN'T b replaced</p>
</div>

STRING

my $str_bak = $string;
print "String is : \n$string\n\n";

$string =~ s!(?is)<p>.+?</p>(?=.*?</div>)!!g;

print "String now is : \n$string\n\n";

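This script does what it was built to do: it prints everything except the <p> elements. Be aware that the lookahead only requires a </div> somewhere later in the string, so the <p> after <div id="text"> is removed too, because the outer </div> still follows it.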
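
To see how far the lookahead reaches, the sketch below counts the matches on a shortened copy of the sample, then repeats the count on the text of <div id="text"> only. The shortened sample and the names $html, $re, $all, $inner and $scoped are assumptions made for illustration, and the scoped count assumes the target div contains no nested div.

```perl
use strict;
use warnings;

# Shortened copy of the sample from the question (illustrative).
my $html = <<'HTML';
<div id="main">
    <div id="text">
        <p>p1</p>
        <p>p2</p>
        <p>p3</p>
    </div>
    <p>outside</p>
</div>
HTML

# Pattern from the answer: a <p>...</p> followed, anywhere later, by </div>.
my $re = qr{(?is)<p>.+?</p>(?=.*?</div>)};

# Count matches over the whole string.
my $all = () = $html =~ /$re/g;

# Count matches inside the target div only (assumes no nested div inside it).
my ($inner) = $html =~ m{<div id="text">(.*?)</div>}is;
my $scoped = () = $inner =~ /<p>.+?<\/p>/gis;

print "whole string: $all, inside div: $scoped\n";
# prints "whole string: 4, inside div: 3"
```

Extracting the div first and matching inside it is one way to keep a regex from touching paragraphs elsewhere in the page; for real HTML a parser is usually the more robust choice.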

answered 2012-05-28T08:15:55.560

First, I should say that I used the trick explained in this post: "Passing a regex substitution as a variable in Perl?"

#!/usr/bin/perl
use strict;
use warnings;

my $string = <<STRING;
<div id="main">
    hellohello
    <div id="text">
        nokay.
        <p>This is p1, SHUD B replaced</p>
        Alright
        <p>This is p2, SHUD B replaced</p>
        Yes 2
        <p>this is P3, SHUD B replaced</p>
        Okay done
        bye
    </div>
    bye
    <p>this is not under the div whose id is text and SHUDN'T b replaced</p>
</div>

STRING

my $str_bak = $string;
print "String is : \n$string\n\n";

# Apply a code reference to a copy of the text and return the modified copy.
sub modify
{
  my($text, $code) = @_;
  $code->($text);
  return $text;
}

my $new_text = modify($string, sub {
    my $div = '(<div id="text">.*?</div>)';
    # Isolate the target div (this assumes it contains no nested div).
    $_[0] =~ m#$div#is or return;
    my $found = $1;
    print "found : \n$found\n\n";
    my $repl = modify ($found, sub {
         $_[0] =~ s/<p>/<p style="text-align:left;">/g
    }) ;
    # \Q...\E makes the matched text literal, so characters such as "." are not treated as regex syntax.
    $_[0] =~ s/\Q$found\E/$repl/
});

print "Result : \n$new_text\n\n";

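The trick is to use the modify sub to allow higher-order processing of the text. We can then isolate <div id="text">...</div> and apply the <p> substitution to it alone.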
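
The same isolate-then-substitute idea can also be sketched as a single statement with the /e modifier, which evaluates the replacement as Perl code. This is an alternative formulation, not the code from this answer; it assumes the target div contains no nested div, and the sample string and variable names are illustrative.

```perl
use strict;
use warnings;

# Small illustrative sample: two paragraphs inside the div, one outside.
my $html = <<'HTML';
<div id="text">
    <p>inside 1</p>
    <p>inside 2</p>
</div>
<p>outside</p>
HTML

$html =~ s{(<div id="text">)(.*?)(</div>)}{
    # Copy the captures first: the inner substitution resets $1..$3.
    my ($open, $body, $close) = ($1, $2, $3);
    $body =~ s/<p>/<p style="text-align:left;">/gi;
    $open . $body . $close;
}seg;

print $html;
```

Only the text captured between the opening and closing tags is rewritten, so the paragraph after </div> keeps its plain <p>.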

answered 2012-05-28T09:19:19.403

Thanks, everyone, for the help.

I couldn't find a single regex that does it, so I used a "workaround". Here is how:

while( $val =~ s/(<div id="article">.*?)<p>/$1<p style="text-align:left;">/sig )
{  }

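So basically the regex only handles the first match on each pass, which is why we repeat it in an empty while loop (the loop exits when there is nothing left to replace).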
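
One caveat with this loop: the lazy .*? can run past the closing </div>, so the loop keeps going until no bare <p> remains anywhere after the opening tag, including paragraphs outside the div. A minimal sketch of one possible tightening follows, assuming the target div contains no nested div. A negative lookahead stops the skipped text from crossing </div>. The sample string and the $passes counter are illustrative.

```perl
use strict;
use warnings;

# Illustrative sample: two paragraphs inside the div, one after it.
my $val = <<'HTML';
<div id="article">
    <p>one</p>
    <p>two</p>
</div>
<p>outside</p>
HTML

my $passes = 0;
# (?:(?!</div>).)*? advances one character at a time and refuses to step
# onto a closing </div>, so the match cannot leave the target div.
$passes++ while $val =~ s{(<div id="article">(?:(?!</div>).)*?)<p>}
                         {$1<p style="text-align:left;">}si;

print "substitutions: $passes\n";   # prints "substitutions: 2"
print $val;
```

Each pass still restyles one paragraph, as in the original loop, but the loop now ends once the div itself has no bare <p> left.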

answered 2012-05-31T05:02:17.087