Is there a function in PHP that converts accented characters (such as those used in French) into HTML-encoded characters?
3 Answers
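The only thing you need to do is make sure the characters are valid UTF-8 and send an appropriate Content-Type header (text/html; charset=utf-8).
There is no reason to use HTML entities for these characters nowadays.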
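The advice above can be sketched roughly as follows. This is only an illustration: the helper name toUtf8 is hypothetical, the mbstring extension is assumed to be available, and Windows-1252 is an assumed source encoding for input that turns out not to be valid UTF-8.

```php
<?php
// Hypothetical helper: returns $text as valid UTF-8. Input that is not
// valid UTF-8 is ASSUMED to be Windows-1252; adjust to your real source encoding.
function toUtf8(string $text, string $assumedEncoding = 'Windows-1252'): string
{
    if (mb_check_encoding($text, 'UTF-8')) {
        return $text; // already valid, leave untouched
    }
    return mb_convert_encoding($text, 'UTF-8', $assumedEncoding);
}

// The header has to be sent before any output.
if (!headers_sent()) {
    header('Content-Type: text/html; charset=utf-8');
}

// Escape only the characters that are special in HTML (&, <, >, quotes);
// accented letters pass through unchanged as UTF-8.
echo htmlspecialchars(toUtf8("Cr\xE8me br\xFBl\xE9e"), ENT_QUOTES, 'UTF-8'), "\n";
// prints: Crème brûlée
```

With the charset declared in the header, the browser decodes the accented letters directly, so no entity conversion is involved.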
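You write "HTML-encoded characters", so I assume you want to convert characters to HTML entity references. These were introduced in HTML 2 (ISO Latin 1 Character Entity Set), HTML 3.2 (Character Entities for ISO Latin-1) and HTML 4 (Character entity references in HTML 4).
You have not said which HTML version you are using, so I suggest looking at the List of XML and HTML character entity references on Wikipedia to find the ones you want to replace.
The PHP function related to these is called htmlentities (Docs).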
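Thanks to the Content-Type header in HTTP and its equivalent in HTML, there is no need to encode these characters as entities, because you can tell the browser which character set you are using. You only need entities when a character is not part of the encoding you use for the output/response.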
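For those cases the htmlentities (Docs) or strtr (Docs) functions can be used. Since you have not specified the encodings involved for the data and for the target, no concrete code example can be given, only a general one:
// Replace the placeholder with the actual encoding of $string, e.g. 'UTF-8'
echo htmlentities($string, ENT_HTML401, 'YOUR STRING ENCODING');
The ENT_HTML401 translation table (Docs) will convert more characters than you probably asked for.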
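As a concrete illustration of the general call, here is a minimal sketch that assumes the input is UTF-8 in the first case and ISO-8859-1 in the second. The sample strings are made up for the example. Note that the markup characters are converted too, which is what "more than you asked for" looks like in practice.

```php
<?php
// Sample data, assumed to be UTF-8 (this file must be saved as UTF-8 too).
$utf8 = 'Crème brûlée <b>à</b> 5 €';

// Every character that has a named HTML 4.01 entity is converted,
// including the markup characters < and >.
echo htmlentities($utf8, ENT_QUOTES | ENT_HTML401, 'UTF-8'), "\n";
// prints: Cr&egrave;me br&ucirc;l&eacute;e &lt;b&gt;&agrave;&lt;/b&gt; 5 &euro;

// The same call for a Latin-1 byte string: the third argument must name
// the encoding the bytes are actually in.
$latin1 = "caf\xE9"; // "café" in ISO-8859-1
echo htmlentities($latin1, ENT_QUOTES | ENT_HTML401, 'ISO-8859-1'), "\n";
// prints: caf&eacute;
```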
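Instead of using the built-in translation table, you can also create your own and apply it with the strtr (Docs) function. This is also necessary if htmlentities does not support the encoding of your data, for example the Adobe Symbol font (see: How to convert Symbol font to standard utf8 HTML entity), or if you simply want to run your own conversion (see: How to substitute non SGML characters in String using PHP?).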
/*
* mappings of Windows-1252 (cp1252) 128 (0x80) - 159 (0x9F) characters:
* @link http://en.wikipedia.org/wiki/Windows-1252
* @link http://www.w3.org/TR/html4/sgml/entities.html
*/
$cp1252HTML401Entities = array(
    "\x80" => '&euro;',   # 128 -> euro sign, U+20AC NEW
    "\x82" => '&sbquo;',  # 130 -> single low-9 quotation mark, U+201A NEW
    "\x83" => '&fnof;',   # 131 -> latin small f with hook = function = florin, U+0192 ISOtech
    "\x84" => '&bdquo;',  # 132 -> double low-9 quotation mark, U+201E NEW
    "\x85" => '&hellip;', # 133 -> horizontal ellipsis = three dot leader, U+2026 ISOpub
    "\x86" => '&dagger;', # 134 -> dagger, U+2020 ISOpub
    "\x87" => '&Dagger;', # 135 -> double dagger, U+2021 ISOpub
    "\x88" => '&circ;',   # 136 -> modifier letter circumflex accent, U+02C6 ISOpub
    "\x89" => '&permil;', # 137 -> per mille sign, U+2030 ISOtech
    "\x8A" => '&Scaron;', # 138 -> latin capital letter S with caron, U+0160 ISOlat2
    "\x8B" => '&lsaquo;', # 139 -> single left-pointing angle quotation mark, U+2039 ISO proposed
    "\x8C" => '&OElig;',  # 140 -> latin capital ligature OE, U+0152 ISOlat2
    "\x8E" => '&#381;',   # 142 -> U+017D
    "\x91" => '&lsquo;',  # 145 -> left single quotation mark, U+2018 ISOnum
    "\x92" => '&rsquo;',  # 146 -> right single quotation mark, U+2019 ISOnum
    "\x93" => '&ldquo;',  # 147 -> left double quotation mark, U+201C ISOnum
    "\x94" => '&rdquo;',  # 148 -> right double quotation mark, U+201D ISOnum
    "\x95" => '&bull;',   # 149 -> bullet = black small circle, U+2022 ISOpub
    "\x96" => '&ndash;',  # 150 -> en dash, U+2013 ISOpub
    "\x97" => '&mdash;',  # 151 -> em dash, U+2014 ISOpub
    "\x98" => '&tilde;',  # 152 -> small tilde, U+02DC ISOdia
    "\x99" => '&trade;',  # 153 -> trade mark sign, U+2122 ISOnum
    "\x9A" => '&scaron;', # 154 -> latin small letter s with caron, U+0161 ISOlat2
    "\x9B" => '&rsaquo;', # 155 -> single right-pointing angle quotation mark, U+203A ISO proposed
    "\x9C" => '&oelig;',  # 156 -> latin small ligature oe, U+0153 ISOlat2
    "\x9E" => '&#382;',   # 158 -> U+017E
    "\x9F" => '&Yuml;',   # 159 -> latin capital letter Y with diaeresis, U+0178 ISOlat2
);
$outputWithEntities = strtr($output, $cp1252HTML401Entities);
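A custom table does not have to be typed out by hand. As a rough sketch, assuming UTF-8 data and that the goal is to convert accented and other non-markup characters while leaving existing HTML tags alone, the built-in table can be filtered and then passed to strtr. The variable names are made up for the illustration.

```php
<?php
$flags = ENT_NOQUOTES | ENT_HTML401;

// Full HTML 4.01 table (character => entity) for UTF-8 input.
$all = get_html_translation_table(HTML_ENTITIES, $flags, 'UTF-8');

// The markup characters (&, <, >) that should stay as they are.
$markup = get_html_translation_table(HTML_SPECIALCHARS, $flags, 'UTF-8');

// Custom table: everything from the full table except the markup characters.
$accentsOnly = array_diff_key($all, $markup);

echo strtr('<p>café & crème</p>', $accentsOnly), "\n";
// prints: <p>caf&eacute; & cr&egrave;me</p>
```

Because the markup characters are left out of the table, this only makes sense for strings that are already trusted HTML; it is not a substitute for escaping user input.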
Source: http://coding.smashingmagazine.com/2011/11/02/introduction-to-url-rewriting/
Try this function:
function GenerateUrl($s) {
    // Convert accented characters, and remove parentheses, brackets and apostrophes
    $from = explode(',', "ç,æ,œ,á,é,í,ó,ú,à,è,ì,ò,ù,ä,ë,ï,ö,ü,ÿ,â,ê,î,ô,û,å,e,i,ø,u,(,),[,],'");
    $to = explode(',', 'c,ae,oe,a,e,i,o,u,a,e,i,o,u,a,e,i,o,u,y,a,e,i,o,u,a,e,i,o,u,,,,,,');
    // Do the replacements, and convert every other run of non-alphanumeric characters to a hyphen
    $s = preg_replace('~[^\w\d]+~', '-', str_replace($from, $to, trim($s)));
    // Remove a hyphen at the beginning or end and make lowercase
    return strtolower(trim($s, '-'));
}
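The same idea can be sketched with a single associative array and strtr, which keeps each character next to its replacement so the two lists cannot drift out of step. The function name slugify is hypothetical, the map is only a partial sample of lowercase letters, and the file is assumed to be saved as UTF-8; this is a variant for illustration, not the article's code.

```php
<?php
// Hypothetical helper: a compact variant of the function above.
function slugify(string $text): string
{
    // Partial sample map (lowercase only); extend it for the characters you expect.
    $map = [
        'ç' => 'c', 'æ' => 'ae', 'œ' => 'oe',
        'à' => 'a', 'á' => 'a', 'â' => 'a', 'ä' => 'a', 'å' => 'a',
        'è' => 'e', 'é' => 'e', 'ê' => 'e', 'ë' => 'e',
        'ì' => 'i', 'í' => 'i', 'î' => 'i', 'ï' => 'i',
        'ò' => 'o', 'ó' => 'o', 'ô' => 'o', 'ö' => 'o', 'ø' => 'o',
        'ù' => 'u', 'ú' => 'u', 'û' => 'u', 'ü' => 'u',
        'ÿ' => 'y',
        "'" => '', '(' => '', ')' => '', '[' => '', ']' => '',
    ];
    $text = strtr(trim($text), $map);

    // Collapse every remaining run of non-alphanumeric characters into one hyphen.
    $text = preg_replace('~[^A-Za-z0-9]+~', '-', $text);

    // Drop a leading or trailing hyphen and lowercase the result.
    return strtolower(trim($text, '-'));
}

echo slugify("L'été à Paris (2011)"), "\n";
// prints: lete-a-paris-2011
```

Characters that are missing from the map, such as uppercase accented letters, end up as hyphens, so the map needs to cover whatever the input can contain.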