php - PHP HTML encoding

Question

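I'm trying to parse an HTML page, but the encoding is garbling my results. After some research I found a very popular solution that uses utf8_encode() and utf8_decode(), but it didn't change anything. My code and its output are shown below.

Code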

$str_html = $this->curlHelper->file_get_contents_curl($page);
$str_html = utf8_encode($str_html);

$dom = new DOMDocument();
$dom->resolveExternals = true;
$dom->substituteEntities = false;
@$dom->loadHTML($str_html);
$xpath = new DomXpath($dom);

(...)
$profile = array();
for ($index = 0; $index < $table_lines->length; $index++) {
    $desc = utf8_decode($table_lines->item($index)->firstChild->nodeValue);
}

Output

Testar Ã© bom

Expected

Testar é bom

What I tried

htmlentities():

htmlentities($table_lines->item($index)->lastChild->nodeValue, ENT_NOQUOTES, ini_get('ISO-8859-1'), false);

htmlspecialchars():

htmlspecialchars($table_lines->item($index)->lastChild->nodeValue, ENT_NOQUOTES, 'ISO- 8859-1', false);

Changing my file's charset as described here.

More information

Website encoding: <meta http-equiv="content-type" content="text/html; charset=ISO-8859-1" />

Thanks in advance!

score 3 · Accepted Answer

Try the following, without calling utf8_decode() first:

mb_convert_encoding($str, 'ISO-8859-1', 'UTF-8');
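To see what that call does at the byte level, here is a minimal sketch. It assumes the mbstring extension is enabled, and the sample string is written with explicit byte escapes so it does not depend on how this file itself is saved. The variable names are only illustrative.

```php
<?php
// "Testar é bom" as UTF-8: é is the two-byte sequence C3 A9.
$utf8 = "Testar \xC3\xA9 bom";

// Convert UTF-8 -> ISO-8859-1 (arguments: string, target encoding, source encoding).
// In ISO-8859-1, é is the single byte E9.
$latin1 = mb_convert_encoding($utf8, 'ISO-8859-1', 'UTF-8');

echo strlen($utf8), "\n";    // 13 bytes
echo strlen($latin1), "\n";  // 12 bytes
echo bin2hex($latin1), "\n"; // 54657374617220e920626f6d
```

If a page declared as ISO-8859-1 shows "Ã©", the string being printed is most likely still UTF-8, and this conversion (applied exactly once) is what turns it into the single byte the browser expects.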

Alternatively, don't use utf8_decode() at all and try changing your site's meta tag to:

<meta http-equiv="content-type" content="text/html; charset=UTF-8" />
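The reason this second option can work is that PHP's DOM extension hands strings back as UTF-8, whatever the source page's encoding was. A rough sketch, assuming the fetched page declares ISO-8859-1 in its meta tag as in the question (the inline $html string below is a made-up stand-in for the cURL result):

```php
<?php
// Stand-in for the downloaded page: ISO-8859-1 bytes, where é is the single byte E9.
$html = "<html><head>"
      . "<meta http-equiv=\"content-type\" content=\"text/html; charset=ISO-8859-1\" />"
      . "</head><body><p>Testar \xE9 bom</p></body></html>";

// Collect parser warnings instead of suppressing them with @.
libxml_use_internal_errors(true);

$dom = new DOMDocument();
$dom->loadHTML($html); // no utf8_encode() beforehand; the meta tag tells the parser the charset
libxml_clear_errors();

// nodeValue comes back as UTF-8 (é is now C3 A9).
$text = $dom->getElementsByTagName('p')->item(0)->nodeValue;

// Option 1: the output page is declared as UTF-8, so print $text unchanged.
// Option 2: the output page stays ISO-8859-1, so convert once before printing.
$latin1 = mb_convert_encoding($text, 'ISO-8859-1', 'UTF-8');
```

Either way the conversion happens at most once, on output. Calling utf8_encode() on the input and utf8_decode() on the result, as in the question, is easy to get out of step with what the parser already did. Both functions are also deprecated as of PHP 8.2.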

mb_convert_encoding()
