php - Substring in HTML

Question

<p style="color:red;font-size:12px;">This economy car is great value for money and with the added benefit of air conditioning is ideal for couples and small families. A ?500 excess applies which can be waived to NIL for only <b>5.00</b> per day</p>

I have tried the following two methods:

substr($mytext,0,25);

and

 $s = html_entity_decode($mytext);
 $sub = substr($s, 0, 50);

I need to get the first 50 characters of the text. Can anyone please help?

Thanks

score 2 · Accepted Answer

You need an HTML parser to find and read out the plain text, and then take the substring from that. Here is an example with DOMXPath:

# loadHTML() is an instance method; calling it statically fails on PHP 8
$doc = new DOMDocument();
$doc->loadHTML($html);
$xp = new DOMXPath($doc);
$chars50 = $xp->evaluate('substring(normalize-space(//body),1,50)');

Demo:

string(50) "This economy car is great value for money and with"
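
For reference, the DOMXPath approach can be wrapped into a small self-contained script. This is only a sketch: the helper name `htmlTextPrefix` and the shortened sample markup are assumptions for illustration, and it relies on `loadHTML()` adding the implicit `<html>` and `<body>` elements around a fragment.

```php
<?php
// Sample input based on the question (shortened); $html is an assumed variable name.
$html = '<p style="color:red;font-size:12px;">This economy car is great value for money '
      . 'and with the added benefit of air conditioning is ideal for couples and small families.</p>';

// Hypothetical helper: returns the first $length characters of the visible text in $html.
function htmlTextPrefix(string $html, int $length): string
{
    $doc = new DOMDocument();
    // "@" silences the warnings libxml emits for imperfect markup.
    @$doc->loadHTML($html);

    $xp = new DOMXPath($doc);
    // XPath substring() is 1-based; normalize-space() collapses runs of whitespace.
    return $xp->evaluate("substring(normalize-space(//body), 1, $length)");
}

var_dump(htmlTextPrefix($html, 50));
```

If the input contains non-ASCII characters, the encoding that `loadHTML()` assumes for the input probably needs attention as well; that part is not covered by this sketch.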

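Note that you will get a UTF-8 encoded string here. You can also do this yourself with a regular expression (which may help if you want to cut at words), for example: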

# load text from HTML (loadHTML() is an instance method and returns a bool, so it cannot be chained)
$doc = new DOMDocument();
$doc->loadHTML($html);
$text = $doc->getElementsByTagName('body')->item(0)->nodeValue;

# normalize HTML whitespace
$text = trim(preg_replace('/\s{1,}/u', ' ', $text));

# obtain the substring (here: UTF-8 safe operation, see as well mb_substr)
$chars50 = preg_replace('/^(.{0,50}).*$/u', '$1', $text);

Demo
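
The `mb_substr` alternative and the idea of cutting at words can be sketched as follows. The sample text, the limit of 30 characters, and the variable names are assumptions chosen so that the two cuts differ; the sketch also assumes the mbstring extension is available.

```php
<?php
// Assumed input: plain text already extracted from the HTML, whitespace normalized.
$text = 'This economy car is great value for money and with the added benefit of air conditioning';

// Character-based cut: mb_substr counts characters rather than bytes.
$chars30 = mb_substr($text, 0, 30, 'UTF-8');

// Word-aware cut: at most 30 characters, ending right before whitespace or the end of the text.
// When no such cut point exists (first word longer than the limit), fall back to an empty string.
preg_match('/^.{0,30}(?=\s|$)/u', $text, $m);
$words30 = $m[0] ?? '';

var_dump($chars30, $words30);
```

The character-based cut may end in the middle of a word, while the word-aware cut backs off to the last complete word that fits within the limit.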

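If you use strip_tags instead of an HTML parser, you need to handle the different encodings yourself. Since the original string already contains a question mark standing in for the Unicode replacement character, I would say you are already dealing with borked data, so it is better to use a library that re-renders the markup, like DOMDocument, than strip_tags, which is not safe to use (see the warning on the PHP manual page).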
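
To illustrate what handling the encoding yourself could look like with strip_tags, here is a minimal sketch. It assumes the input is valid UTF-8 and that mbstring is available; the `&pound;` entity is a made-up stand-in for the currency symbol and is not the asker's actual data.

```php
<?php
// Sample input; "&pound;" is an assumed placeholder for the currency symbol.
$mytext = '<p>A &pound;500 excess applies which can be waived to NIL for only <b>5.00</b> per day</p>';

// Remove the tags first, then decode entities so "&pound;" becomes a single character.
$plain = html_entity_decode(strip_tags($mytext), ENT_QUOTES | ENT_HTML5, 'UTF-8');

// Collapse whitespace, then cut by characters instead of bytes.
$plain = trim(preg_replace('/\s+/u', ' ', $plain));
$first13 = mb_substr($plain, 0, 13, 'UTF-8');

var_dump($first13);
```

A plain `substr` on the decoded string counts bytes, so it could split a multi-byte character such as the pound sign; `mb_substr` avoids that as long as the stated encoding matches the data.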

score 1 · Accepted Answer

Hope this works... give it a try:

echo (substr(strip_tags($mytext), 0, 25));

http://www.ideone.com/6TgJX
