mb_convert_encoding was a good idea, but it does not work as expected because DOMDocument seems to be a little bit buggy when it comes to encoding.

Moving the mb_convert_encoding call to the point where the node value is output did the trick.

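$html_dom = new DOMDocument();
$html_dom->resolveExternals = TRUE;
// @ suppresses the warnings loadHTML() emits for malformed markup
@$html_dom->loadHTML($html_doc);
$xpath = new DOMXPath($html_dom);

$query   = '//div[@class="foo"]/div/p';
$my_foos = $xpath->query($query);
foreach ($my_foos as $my_foo)
{
    // nodeValue is returned as UTF-8; convert it to HTML entities at output time
    // (note: the 'HTML-ENTITIES' target is deprecated as of PHP 8.2)
    echo mb_convert_encoding($my_foo->nodeValue, 'HTML-ENTITIES', 'UTF-8');
    die; // stop after the first match
}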
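
For anyone who wants to try this in isolation, here is a minimal self-contained sketch of the same idea: leave the input alone and encode only the node output. The sample document and the helper name foo_texts_as_entities are made up for illustration. It also swaps in mb_encode_numericentity for the 'HTML-ENTITIES' conversion (so it emits numeric entities rather than named ones) and libxml_use_internal_errors() for the @ operator; neither substitution is part of the original fix.

```php
<?php
// Hypothetical sample input. The charset declaration lets libxml read the
// bytes as UTF-8; "\xC3\xA9" is the UTF-8 encoding of "é".
$html_doc = '<html><head>'
    . '<meta http-equiv="Content-Type" content="text/html; charset=utf-8">'
    . '</head><body><div class="foo"><div>'
    . "<p>caf\xC3\xA9</p>"
    . '</div></div></body></html>';

// Hypothetical helper: returns the text of every node matched by $query,
// with all non-ASCII characters replaced by numeric HTML entities.
function foo_texts_as_entities(string $html, string $query): array
{
    $dom = new DOMDocument();

    // Collect parser warnings internally instead of silencing them with @.
    $previous = libxml_use_internal_errors(true);
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($dom);
    $texts = [];
    foreach ($xpath->query($query) as $node) {
        // nodeValue is UTF-8; the conversion map covers U+0080..U+10FFFF,
        // so plain ASCII is left untouched.
        $texts[] = mb_encode_numericentity(
            $node->nodeValue,
            [0x80, 0x10FFFF, 0, 0x1FFFFF],
            'UTF-8'
        );
    }
    return $texts;
}

foreach (foo_texts_as_entities($html_doc, '//div[@class="foo"]/div/p') as $text) {
    echo $text, PHP_EOL;
}
```

Returning the converted strings instead of echoing inside the loop keeps the encoding step in one place, so the die from the snippet above is no longer needed to limit the output.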
answered 2013-10-07T21:44:38.503