I have an example:

<a href="http://test.html" class="watermark" target="_blank">
   <img width="399" height="4652" src="http://test.html/uploads/2013/10/10.jpg" class="aligncenter size-full wp-image-78360">
</a>

I am using preg_replace to change the class of the a tag and the src of the img tag:

$content = preg_replace('#<a(.*?)href="([^"]*/)?(([^"/]*)\.[^"]*)"([^>]*?)><img(.*?)src="([^"]*/)?(([^"/]*)\.[^"]*)"([^>]*?)></a>#', '<a href=$2$3 class="fancybox"><img$1src="http://test.html/uploads/2013/10/10_new.jpg"></a>', $content); 

How can I get this result?

<a href="http://test.html" class="fancybox" target="_blank">
    <img width="399" height="4652" src="http://test.html/uploads/2013/10/10_new.jpg" class="aligncenter size-full wp-image-78360">
</a>

2 Answers

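Regular expressions, as is mentioned many times a day on SO, are not the best tool for HTML manipulation. Fortunately, we have the DOMDocument object!

If you are only given that string, you can make the change like this: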

$orig = '   <a href="http://test.html" class="watermark" target="_blank">
                <img width="399" height="4652" src="http://test.html/uploads/2013/10/10.jpg" class="aligncenter size-full wp-image-78360">
        </a>';
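$doc = new DOMDocument();
// loadHTML() wraps the fragment in <html><body>, so fetch the anchor by tag name
$doc->loadHTML($orig);
$anchor = $doc->getElementsByTagName('a')->item(0);
if ($anchor !== null && $anchor->getAttribute('class') === 'watermark')
{
    $anchor->setAttribute('class', 'fancybox');
    $img = $anchor->getElementsByTagName('img')->item(0);
    if ($img !== null)
    {
        // Insert "_new" before the file extension of the current src
        $currSrc = $img->getAttribute('src');
        $img->setAttribute('src', preg_replace('/(\.[^\.]+)$/', '_new$1', $currSrc));
    }
}
// Passing a node to saveHTML() serializes only that node (PHP 5.3.6+)
$newStr = $doc->saveHTML($anchor);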
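
The src rewrite in the snippet above comes down to one small regex that inserts `_new` before the file extension. Here is a minimal sketch of that step in isolation. The function name `addSuffix` is hypothetical, and the pattern is a slight variation on the answer's: it also excludes `/` from the extension, so a URL whose last path segment has no extension is left unchanged instead of matching the dot in the host name.

```php
<?php
// Hypothetical helper for illustration; not part of the original answer.
function addSuffix(string $url, string $suffix = '_new'): string
{
    // Match the final ".ext" at the end of the string (no dots or slashes
    // inside the extension) and re-emit it with the suffix in front.
    return preg_replace('/(\.[^.\/]+)$/', $suffix . '$1', $url);
}

// prints http://test.html/uploads/2013/10/10_new.jpg
echo addSuffix('http://test.html/uploads/2013/10/10.jpg'), "\n";
```

This sketch assumes the URL carries no query string or fragment; if it may, the path would need to be isolated first (for example with parse_url).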

Otherwise, if you are working with the full HTML source of a document:

$orig = '<!DOCTYPE html>
<html>
<head>
    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
    <title></title>
</head>
<body>
    <a href="http://test.html" class="watermark" target="_blank">
        <img width="399" height="4652" src="http://test.html/uploads/2013/10/10.jpg" class="aligncenter size-full wp-image-78360">
    </a>
    <span>random</span>
    <a href="http://test.html" class="watermark" target="_blank">
        <img width="399" height="4652" src="http://test.html/uploads/2013/10/10.jpg" class="aligncenter size-full wp-image-78360">
    </a>
    <a href="#foobar" class="gary">
        <img src="/imgs/yay.png" />
    </a>
</body>
</html>';
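$doc = new DOMDocument();
$doc->loadHTML($orig);
$anchors = $doc->getElementsByTagName('a');
foreach ($anchors as $anchor)
{
    // Only touch anchors whose class attribute is exactly "watermark"
    if ($anchor->getAttribute('class') === 'watermark')
    {
        $anchor->setAttribute('class', 'fancybox');
        $img = $anchor->getElementsByTagName('img')->item(0);
        if ($img !== null)
        {
            // Insert "_new" before the file extension of the current src
            $currSrc = $img->getAttribute('src');
            $img->setAttribute('src', preg_replace('/(\.[^\.]+)$/', '_new$1', $currSrc));
        }
    }
}
// Without an argument, saveHTML() serializes the whole document
$newStr = $doc->saveHTML();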

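Still, as a brain exercise, here is a regex solution, since that was the original question, and DOMDocument can sometimes be overkill in terms of the amount of code (although it is still preferable):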

$newStr = preg_replace('#<a(.+?)class="watermark"(.+?)<img(.+?)src="(.+?)(\.[^.]+?)"(.*?>.*?</a>)#s','<a$1class="fancybox"$2<img$3src="$4_new$5"$6',$orig);
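
To check what that pattern does, here is a small sketch that runs it against the single-anchor sample from the question. The variable names `$pattern`, `$replacement` and `$count` are introduced here for illustration, and the replacement writes the fourth group as `${4}` purely for readability; it behaves the same as `$4_new` in the line above.

```php
<?php
// Sample input taken from the question.
$orig = '<a href="http://test.html" class="watermark" target="_blank">
    <img width="399" height="4652" src="http://test.html/uploads/2013/10/10.jpg" class="aligncenter size-full wp-image-78360">
</a>';

// The "s" modifier lets "." match newlines, so the pattern can span
// the line break between the <a> and <img> tags.
$pattern = '#<a(.+?)class="watermark"(.+?)<img(.+?)src="(.+?)(\.[^.]+?)"(.*?>.*?</a>)#s';
$replacement = '<a$1class="fancybox"$2<img$3src="${4}_new$5"$6';

// The fifth argument receives the number of replacements performed.
$newStr = preg_replace($pattern, $replacement, $orig, -1, $count);

echo $count, "\n";
echo $newStr, "\n";
```

The pattern depends on the attribute order (class before the image, src inside the image tag) and on double-quoted attribute values, which is one reason the DOM approach is more robust.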
answered 2013-10-10T08:45:03.330

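Don't use regular expressions to parse HTML.

Find all links in the HTML that have the watermark class, change the class to fancybox, and update the src of the first image inside each link: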

$dom = new DOMDocument;
@$dom->loadHTML($html);
$xpath = new DOMXPath($dom);
foreach ($xpath->query('//a[contains(@class, "watermark")]') as $a) {
    $a->setAttribute('class', 'fancybox');

    $img = $xpath->query('descendant::img', $a)->item(0);
    if ($img === null) {
        continue; // anchor without an image
    }
    // Old value: $img->getAttribute('src')
    $img->setAttribute('src', 'new_value'); // placeholder for the new src
}
echo $dom->saveHTML();
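
One caveat with `contains(@class, "watermark")` is that it is a substring test, so it would also match a class such as `watermarked`. The following sketch shows a common XPath 1.0 idiom for matching a whole class token, combined with the `_new` src rewrite from the question. The sample `$html` and the variable names `$query` and `$result` are assumptions made for illustration.

```php
<?php
// Sample input (assumed): one real "watermark" link and one look-alike.
$html = '<a href="http://test.html" class="watermark" target="_blank">'
      . '<img src="http://test.html/uploads/2013/10/10.jpg"></a>'
      . '<a href="#x" class="watermarked"><img src="/imgs/yay.png"></a>';

$dom = new DOMDocument;
libxml_use_internal_errors(true);   // collect parser warnings instead of using @
$dom->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($dom);
// Pad the class list with spaces so only the whole token "watermark" matches.
$query = '//a[contains(concat(" ", normalize-space(@class), " "), " watermark ")]';

foreach ($xpath->query($query) as $a) {
    // As in the answer above, this replaces the entire class attribute.
    $a->setAttribute('class', 'fancybox');

    $img = $xpath->query('descendant::img', $a)->item(0);
    if ($img === null) {
        continue; // link without an image
    }
    // Insert "_new" before the file extension of the existing src.
    $src = $img->getAttribute('src');
    $img->setAttribute('src', preg_replace('/(\.[^.\/]+)$/', '_new$1', $src));
}

$result = $dom->saveHTML();
echo $result;
```

In this sketch only the first link is rewritten; the `watermarked` link and its image are left untouched.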
answered 2013-10-10T08:49:38.367