php - Remove links whose href is not allowed

Question

I have some links like this:

<a href="http://illegallink.com"><img src="something.jpg" /><a href="http://legallink.com">legal</a></a>

I want to remove every link that does not point to "legallink.com", while keeping the link's content. So the input above should produce:

<img src="something.jpg" /><a href="http://legallink.com">legal</a>

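It should work recursively, so that links nested inside other links are handled too.

I found this regex that removes all links: /<\\/?a(\\s+.*?>|>)/, but I want it to keep the links whose href is legallink.com.

Can this be done with a regex, or should I use a DOM parser?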

score 1 · Accepted Answer

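error_reporting(E_ALL);
ini_set('display_errors', '1');

$code = '<a href="http://illegallink.com"><img src="something.jpg" /><a href="http://legallink.com">legal</a></a>';

// Collect parser warnings (e.g. about the nested <a> tags) instead of printing them.
libxml_use_internal_errors(true);
$document = new DOMDocument();
$document->loadHTML($code);
$parser = new DOMXPath($document);

// Take a snapshot of the matched links before changing the tree.
foreach (iterator_to_array($parser->query("//a")) as $node)
{
  if (!preg_match("/^http:\/\/legallink\.com/i", $node->getAttribute("href")))
  {
    // replaceChild() needs a DOMNode, and nodeValue would drop child elements
    // such as <img>, so move the children up and then remove the empty link.
    while ($node->firstChild)
    {
      $node->parentNode->insertBefore($node->firstChild, $node);
    }
    $node->parentNode->removeChild($node);
  }
}
// Prints the whole document, including the <html> and <body> wrappers added by loadHTML().
echo $document->saveHTML();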
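
The question asks for just the cleaned fragment, without the document wrappers that loadHTML() adds. A minimal sketch of one way to get that is below. It is an illustration and not part of the original answer: the function name stripDisallowedLinks, the $allowedHosts parameter and the wrapper id fragment-root are all hypothetical. It compares the host returned by parse_url() instead of matching a regex, and it treats links with no href as disallowed.

```php
<?php
// Hypothetical helper (not from the original answer): unwraps every <a>
// whose href host is not listed in $allowedHosts, keeping the link's children.
function stripDisallowedLinks(string $html, array $allowedHosts): string
{
    libxml_use_internal_errors(true); // collect parser warnings silently
    $document = new DOMDocument();
    // The wrapper <div> gives the fragment a single known root to read back.
    $document->loadHTML('<div id="fragment-root">' . $html . '</div>');
    libxml_clear_errors();

    $xpath = new DOMXPath($document);
    // Snapshot the matches before changing the tree.
    foreach (iterator_to_array($xpath->query('//a')) as $link) {
        $host = parse_url($link->getAttribute('href'), PHP_URL_HOST);
        if (is_string($host) && in_array(strtolower($host), $allowedHosts, true)) {
            continue; // allowed link: leave it untouched
        }
        // Move the children up in front of the link, then drop the link itself.
        while ($link->firstChild) {
            $link->parentNode->insertBefore($link->firstChild, $link);
        }
        $link->parentNode->removeChild($link);
    }

    // Serialize only the children of the wrapper, not the whole document.
    $root = $xpath->query('//div[@id="fragment-root"]')->item(0);
    $out = '';
    foreach ($root->childNodes as $child) {
        $out .= $document->saveHTML($child);
    }
    return $out;
}

echo stripDisallowedLinks(
    '<a href="http://illegallink.com"><img src="something.jpg" /><a href="http://legallink.com">legal</a></a>',
    ['legallink.com']
);
```

Because every disallowed link is unwrapped in place, the result does not depend on whether the HTML parser keeps the nested links nested or splits them into siblings. Note that loadHTML() assumes a single-byte encoding unless the markup declares one, so input with non-ASCII text may need an explicit charset declaration.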
