php - PHP regular expression to remove tags from an HTML document

Question

Say I have the following text:

..(content).............
<A HREF="http://foo.com/content" >blah blah blah </A>
...(continue content)...

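I want to remove the link, that is, strip the tag while keeping its text. How can I do this with a regular expression (since the URLs will all be different)?

Many thanks.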

score 17 · Accepted Answer

This removes all tags:

$string = preg_replace('/<.*?>/', '', $string);

This removes only <a> tags (the i flag makes it match <A HREF=...> as well):

$string = preg_replace('/<\/?a(\s+.*?>|>)/i', '', $string);
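To see what the two patterns do, here is a minimal sketch run against a string modelled on the question. The surrounding words and the <b> element are made up for illustration, and the `i` and `s` flags are assumptions added so that uppercase tags and tags spanning several lines are matched too.

```php
<?php
// Sample input modelled on the question; the words around the link are hypothetical.
$string = 'before <A HREF="http://foo.com/content" >blah blah blah </A> after <b>bold</b>';

// Remove every tag and keep only the text between them.
$noTags = preg_replace('/<.*?>/s', '', $string);

// Remove only opening and closing <a> tags; other tags such as <b> are left alone.
$noAnchors = preg_replace('/<\/?a(\s+.*?>|>)/is', '', $string);

echo $noTags, PHP_EOL;
echo $noAnchors, PHP_EOL;
```

The second call leaves the link text and the <b> element in place, which is the behaviour the question asks for.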

score 16 · Accepted Answer

Avoid regular expressions where you can, especially when processing XML. In this case you can use strip_tags() or simplexml, depending on your string.
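A short sketch of the strip_tags() route, assuming a sample string wrapped in a <p> element (the wrapper is made up for illustration). Note that the second argument is a list of tags to keep, not a list of tags to remove, so stripping only <a> means listing every other tag you want to survive.

```php
<?php
// Hypothetical sample input based on the question's link.
$string = '<p>before <A HREF="http://foo.com/content" >blah blah blah </A> after</p>';

// Strip every tag and keep only the text.
$plain = strip_tags($string);

// Keep <p> (tags named in the second argument survive) and strip everything else, including <a>.
$keepParagraphs = strip_tags($string, '<p>');

echo $plain, PHP_EOL;
echo $keepParagraphs, PHP_EOL;
```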

score 4 · Accepted Answer

<?php
// Example: print the inner text of every anchor in a string
include('simple_html_dom.php');

$html = str_get_html('<A HREF="http://foo.com/content" >blah blah blah </A><A HREF="http://foo.com/content" >blah blah blah </A>');

// Print the text of each anchor (the property is spelled innertext in this library)
foreach ($html->find('a') as $e) {
    echo $e->innertext;
}
?>

See the PHP Simple HTML DOM Parser.
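If adding a third-party library is not an option, the same parser-based idea can be sketched with PHP's built-in DOM extension instead. This is a different technique from the one in the answer, shown under the assumption that the input is a small ASCII HTML fragment; the <p> wrapper in the sample is made up. It unwraps each anchor so the text stays in the document rather than only being printed.

```php
<?php
// Hypothetical sample fragment based on the question's link.
$html = '<p>before <A HREF="http://foo.com/content" >blah blah blah </A> after</p>';

$doc = new DOMDocument();
// Wrap the fragment in a <div> so it can be found again after parsing.
// The @ silences libxml warnings about imperfect markup.
@$doc->loadHTML('<div>' . $html . '</div>');

// getElementsByTagName() returns a live list, so copy it before changing the tree.
foreach (iterator_to_array($doc->getElementsByTagName('a')) as $anchor) {
    // Move the anchor's children in front of it, then drop the now-empty anchor.
    while ($anchor->firstChild !== null) {
        $anchor->parentNode->insertBefore($anchor->firstChild, $anchor);
    }
    $anchor->parentNode->removeChild($anchor);
}

// Serialize the children of the wrapper <div> back to a string.
$wrapper = $doc->getElementsByTagName('div')->item(0);
$result = '';
foreach ($wrapper->childNodes as $child) {
    $result .= $doc->saveHTML($child);
}

echo $result, PHP_EOL;
```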

score 3 · Accepted Answer

Not pretty, but it gets the job done (case-insensitive, so <A HREF=...> and </A> are handled too):

$data = str_ireplace('</a>', '', $data);
$data = preg_replace('/<a[^>]+href[^>]+>/i', '', $data);

score 1 · Accepted Answer

strip_tags() can also be used.

See the example here.

Answered 2012-07-09T07:11:30.927

score 0 · Accepted Answer

I use this to replace anchors with a text string...

function replaceAnchorsWithText($data) {
    // When invoked as the callback, $data is the array of matches for one anchor
    if (is_array($data)) {
        // This is what will replace the link (modify to your liking)
        return "{$data['name']}({$data['link']})";
    }

    $regex  = '/(<a\s*'; // Start of anchor tag
    $regex .= '(.*?)\s*'; // Any attributes or spaces that may or may not exist
    $regex .= 'href=[\'"]+?\s*(?P<link>\S+)\s*[\'"]+?'; // Grab the link
    $regex .= '\s*(.*?)\s*>\s*'; // Any attributes or spaces before the end of the opening tag
    $regex .= '(?P<name>.*?)'; // Grab the name (may contain spaces)
    $regex .= '\s*<\/a>)/is'; // Optional spaces before the closing anchor tag (case insensitive)

    // __FUNCTION__ works for a standalone function; inside a class, pass a method callable instead
    return preg_replace_callback($regex, __FUNCTION__, $data);
}

score 0 · Accepted Answer

$pattern = '/href="([^"]*)"/';
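The answer gives only a pattern, with no usage. One possible reading, shown here as a sketch: the capture group collects the URL of every href attribute, which can help when inspecting links but does not remove any tags by itself. The sample string and its second URL are made up, and the `i` flag is an assumption added so that HREF="..." matches as well.

```php
<?php
// Hypothetical input with two links; only the first URL comes from the question.
$html = '<a href="http://foo.com/content">blah</a> and <A HREF="http://bar.com/">more</A>';

// The answer's pattern with the i flag added (an assumption).
$pattern = '/href="([^"]*)"/i';

// $matches[1] receives the text captured by the first group, one entry per href attribute.
preg_match_all($pattern, $html, $matches);

print_r($matches[1]);
```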

Answered 2013-07-13T00:52:31.897

score -2 · Accepted Answer

Use str_replace.

Answered 2009-09-01T22:42:14.283
