php - htmlentities() double-encoding entities in a string

Question

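I want to convert only the unencoded characters to HTML entities, without affecting entities that are already present. I have a string that already contains encoded entities, for example: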

gaIUSHIUGhj>&hyphen; hjb&times;jkn.jhuh>hh> &hellip;

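When I use htmlentities(), the & at the start of each existing entity gets encoded again. This means the & in &hyphen; and the other entities is encoded as &amp;, for example: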

&amp;times;

I tried decoding the whole string and then encoding it again, but that doesn't seem to work properly. This is the code I tried:

header('Content-Type: text/html; charset=iso-8859-1');
...

$b = 'gaIUSHIUGhj>&hyphen; hjb&times;jkn.jhuh>hh> &hellip;';
$b = html_entity_decode($b, ENT_QUOTES, 'UTF-8');
$b = iconv("UTF-8", "ISO-8859-1//TRANSLIT", $b);
$b = htmlentities($b, ENT_QUOTES, 'UTF-8');

But it doesn't seem to work the right way. Is there a way to prevent this from happening?

score 6 · Accepted Answer

Set the optional $double_encode parameter to false. See the documentation for more information.

Your resulting code should look like this:

$b = htmlentities($b, ENT_QUOTES, 'UTF-8', false);
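To make the difference concrete, here is a minimal sketch comparing the two settings. The helper name encode_once() and the sample string are made up for illustration. The sample deliberately uses only &times; and &hellip;, which are HTML 4.01 entities; whether other names (such as &hyphen;) are recognized as existing entities may depend on the document-type flag you pass, so test with your own data.

```php
<?php
// Hypothetical helper: wraps the call shown above.
function encode_once(string $text): string
{
    // Fourth argument ($double_encode) = false: entities that are already
    // present are left alone; everything else is encoded as usual.
    return htmlentities($text, ENT_QUOTES, 'UTF-8', false);
}

$input = 'a > b &times; c &hellip; "quoted"';

// Existing entities survive: a &gt; b &times; c &hellip; &quot;quoted&quot;
echo encode_once($input), "\n";

// Default behaviour re-encodes the leading &:
// a &gt; b &amp;times; c &amp;hellip; &quot;quoted&quot;
echo htmlentities($input, ENT_QUOTES, 'UTF-8'), "\n";
```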

score 5 · Accepted Answer

You did well to look at the documentation, but you missed the best part. It can be hard to decipher sometimes:

//     >    >    >    >    >    >    Scroll    >>>    >    >    >    >    >     Keep going.    >    >    >    >>>>>>  See below.  <<<<<<
string htmlentities ( string $string [, int $flags = ENT_COMPAT | ENT_HTML401 [, string $encoding = 'UTF-8' [, bool $double_encode = true ]]] )

^ Look at the end.

I know. It's confusing. I usually ignore the signature line and go straight to the next block (Parameters) for a summary of each parameter.

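So you want to use the $double_encode parameter at the end to tell htmlentities() not to re-encode existing entities (you probably want to stick with UTF-8 unless you have a specific reason not to):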

$str = "gaIUSHIUGhj>&hyphen; hjb&times;jkn.jhuh>hh> &hellip;";

// Double-encoded!
echo htmlentities($str, ENT_COMPAT, 'utf-8', true) . "\n";

// Not double-encoded!
echo htmlentities($str, ENT_COMPAT, 'utf-8', false);

https://ignite.io/code/513ab23bec221e4837000000
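If scrolling to the fourth positional parameter is the annoying part, named arguments are one way around it. This is only a sketch and assumes PHP 8.0 or later, where built-in functions accept named arguments; the sample string is made up. It also checks that a second pass changes nothing, since after the first pass every & already starts an entity.

```php
<?php
// Assumes PHP 8.0+ (named arguments); illustrative sample string.
$str = 'x < y &times; z';

// Jump straight to the last parameter; $flags and $encoding keep their defaults.
$once = htmlentities($str, double_encode: false);

// Running it again leaves the result unchanged.
$twice = htmlentities($once, double_encode: false);

echo $once, "\n";           // x &lt; y &times; z
var_dump($once === $twice); // bool(true)
```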
