php - How to convert HTML codes to their corresponding Unicode characters

Question

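I have googled a lot and searched this forum as well, but this is my second day on it and I still can't find a solution.

My problem is that I want to convert the HTML codes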

&#1576;&#1575;&#1582;

into their equivalent Unicode characters

خ ا ب

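I don't actually want to convert every HTML symbol to a Unicode character. I only want to convert the Arabic/Urdu HTML codes. These characters range from &#1563; (؛) to &#1785; (۹). If there is no PHP function for this, how can I replace these codes with their equivalent Unicode characters in one go?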

score 4 · Accepted Answer

I think you are looking for:

html_entity_decode('&#1576;&#1575;&#1582;', ENT_QUOTES, 'UTF-8');

When you go from &#1576; to ب, that is called decoding. Doing the opposite is called encoding.
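To make the two directions concrete, here is a minimal sketch of a round trip. It assumes the mbstring extension is available for the encoding half (`mb_encode_numericentity()` is not part of the answer above), and the variable names are only illustrative.

```php
<?php
// Decoding: numeric character references -> UTF-8 characters.
$decoded = html_entity_decode('&#1576;&#1575;&#1582;', ENT_QUOTES, 'UTF-8');

// Encoding (the reverse): UTF-8 characters -> numeric references.
// Assumption: mbstring is loaded. The map is [start, end, offset, mask];
// this one covers code points 1563..1785 (U+061B..U+06F9).
$map = array(0x061B, 0x06F9, 0, 0xFFFF);
$encoded = mb_encode_numericentity($decoded, $map, 'UTF-8');

echo $decoded, "\n"; // the three Arabic letters
echo $encoded, "\n"; // the numeric references again
```

Characters outside the mapped range are left untouched by the encoding step, so plain ASCII markup in the same string is not affected.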

As for replacing only the characters from &#1563; to &#1785;, maybe try something like this.

<?php

// Random set of entities, two are outside the 1563 - 1785 range.
$entities = '&#1563;&#1564;&#60;&#1604;&#241;&#1784;&#1785;';

// Matches entities from 1500 to 1799, not perfect, I know.
preg_match_all('/&#1[5-7][0-9]{2};/', $entities, $matches);

$entityRegex = array(); // Will hold the entity code regular expression.
$decodedCharacters = array(); // Will hold the decoded characters.

foreach ($matches[0] as $entity)
{
    // Convert the entity to human-readable character.
    $unicodeCharacter = html_entity_decode($entity, ENT_QUOTES, 'UTF-8');

    array_push($entityRegex, "/$entity/");
    array_push($decodedCharacters, $unicodeCharacter);
}

// Replace all of the matched entities with the human-readable character.
$replaced = preg_replace($entityRegex, $decodedCharacters, $entities);

?>

This is the closest I could get. Hope it helps. It's 5:00 AM here, so I'm off to bed! :)
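The regex above only approximates the 1563 to 1785 range. One way to make the check exact is to compare the number itself inside a callback. This is a minimal sketch, not the answerer's code: `decodeArabicEntities` is a hypothetical helper name, and it only handles decimal references of the form `&#NNNN;`.

```php
<?php
// Hypothetical helper; the name is not part of PHP or of the answer above.
function decodeArabicEntities(string $html): string
{
    return preg_replace_callback(
        '/&#(\d+);/',
        function (array $m): string {
            $codePoint = (int) $m[1];
            // Decode only references inside 1563..1785; leave the rest as-is.
            if ($codePoint >= 1563 && $codePoint <= 1785) {
                return html_entity_decode($m[0], ENT_QUOTES, 'UTF-8');
            }
            return $m[0];
        },
        $html
    );
}

// The Arabic references are decoded, &#60; and &#62; stay encoded.
echo decodeArabicEntities('&#1576;&#1575;&#1582; &#60;b&#62;'), "\n";
```

Doing the comparison on the integer avoids having to express the range as a character-class pattern, which is where the 1500 to 1799 approximation came from.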

score 0 · Accepted Answer

Have you tried setting UTF-8 encoding in the HTML head?

<meta http-equiv="Content-type" content="text/html; charset=utf-8" />

score 0 · Accepted Answer

Try this

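<?php
// Build a map from entity (e.g. "&eacute;") to its UTF-8 character.
// The table is requested in UTF-8, so utf8_encode() (deprecated since
// PHP 8.2) is not needed and would double-encode the characters.
$trans_tbl = get_html_translation_table(HTML_ENTITIES, ENT_QUOTES, 'UTF-8');
$ttr = array();
foreach ($trans_tbl as $k => $v)
{
    $ttr[$v] = $k;
}
// Caveat: the table holds named entities; numeric references such as
// &#1576; are not in it, so strtr() leaves them unchanged.
$text = '&#1576;&#1576;....;&#1582';
$text = strtr($text, $ttr);
echo $text;
?>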

For the MySQL side, you can set the connection character set like this:

$mysqli = new mysqli($host, $user, $pass, $db);

if (!$mysqli->set_charset("utf8")) {
    die("error");
}
