php - PHP strtr does not work at all

Question

score 3 · Accepted Answer

strtr with the function prototype

string strtr ( string $str , string $from , string $to )

works only for single-byte encodings (e.g. ISO-8859-1).

header("Content-Type: text/plain; charset=ISO-8859-1");
$str = "\x2d\xe4\xe5\xf6\x2d"; // ISO-8859-1: -äåö-
$from = "\xe4\xe5\xf6";        // ISO-8859-1: äåö
$to = "\x78\x78\x78";          // ISO-8859-1: xxx
dump($str, "ISO-8859-1");  // length in octets: 5
dump($from, "ISO-8859-1"); // length in octets: 3
dump($to, "ISO-8859-1");   // length in octets: 3

print strtr($str, $from, $to); // -xxx-

Output:

-: 2d
ä: e4
å: e5
ö: f6
-: 2d
length (encoding: ISO-8859-1): 5
length in octets (8-bit-byte): 5

ä: e4
å: e5
ö: f6
length (encoding: ISO-8859-1): 3
length in octets (8-bit-byte): 3

x: 78
x: 78
x: 78
length (encoding: ISO-8859-1): 3
length in octets (8-bit-byte): 3

-xxx-

If you use multi-byte characters, e.g. from UTF-8, you are likely to get a garbled string:

header("Content-Type: text/plain; charset=UTF-8");
$str = "\x2d\xc3\xa4\xc3\xa5\xc3\xb6\x2d"; // UTF-8: -äåö-
$from = "\xc3\xa4\xc3\xa5\xc3\xb6";        // UTF-8: äåö
$to = "\x78\x78\x78";                      // UTF-8: xxx
dump($str, "UTF-8");  // length in octets: 8
dump($from, "UTF-8"); // length in octets: 6
dump($to, "UTF-8");   // length in octets: 3

// > If from and to have different lengths, the extra characters in the longer
// > of the two are ignored. The length of str will be the same as the return
// > value's.
// http://de.php.net/manual/en/function.strtr.php

// This means that the $from string gets cropped to "\xc3\xa4\xc3" (the 16 bits
// of the first char [ä] and the first 8 bits of the second char [å]):
strtr($str, $from, $to) === strtr($str, "\xc3\xa4\xc3", $to); // true
print strtr($str, $from, $to); // -xxx�x�-

Output:

-: 2d
ä: c3a4
å: c3a5
ö: c3b6
-: 2d
length (encoding: UTF-8): 5
length in octets (8-bit-byte): 8

ä: c3a4
å: c3a5
ö: c3b6
length (encoding: UTF-8): 3
length in octets (8-bit-byte): 6

x: 78
x: 78
x: 78
length (encoding: UTF-8): 3
length in octets (8-bit-byte): 3

-xxx�x�-
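To see exactly which bytes were touched, it can help to look at the result in hex rather than as text. The following is a minimal sketch that reuses the strings from the example above, written as byte escapes so that the encoding of the source file does not matter:

```php
<?php
$str  = "\x2d\xc3\xa4\xc3\xa5\xc3\xb6\x2d"; // UTF-8: -äåö- (8 bytes)
$from = "\xc3\xa4\xc3\xa5\xc3\xb6";         // UTF-8: äåö   (6 bytes)
$to   = "xxx";                              // 3 bytes

$result = strtr($str, $from, $to);

// Only the first 3 bytes of $from are used: c3 => x, a4 => x, c3 => x.
// The bytes a5 and b6 are left untouched; on their own they are not
// valid UTF-8, which is why they are displayed as replacement characters.
echo bin2hex($str), "\n";    // 2dc3a4c3a5c3b62d
echo bin2hex($result), "\n"; // 2d787878a578b62d
```

The result still has 8 bytes, as the manual says, but it is no longer a valid UTF-8 string.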

For multi-byte encodings such as UTF-8 you have to use the second function prototype:

string strtr ( string $str , array $replace_pairs )

header("Content-Type: text/plain");
$str = "-äåö-"; // UTF-8 \x2d\xc3\xa4\xc3\xa5\xc3\xb6\x2d
$replace_pairs = array(
    "ä" /* UTF-8 \xc3\xa4 */ => "x",
    "å" /* UTF-8 \xc3\xa5 */ => "x",
    "ö" /* UTF-8 \xc3\xb6 */ => "x"
);
print strtr($str, $replace_pairs); // -xxx-
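If you would rather keep the from/to calling style, the replace pairs can be built from two UTF-8 strings. The sketch below is one possible way to do that; the helper name utf8_strtr is made up for this example, and it assumes the mbstring extension and PHP 7.4 or later (for mb_str_split):

```php
<?php
// Hypothetical helper (not part of PHP): strtr()-like character mapping
// for UTF-8 strings. Requires mbstring and PHP 7.4+ for mb_str_split().
function utf8_strtr(string $str, string $from, string $to): string
{
    $fromChars = mb_str_split($from, 1, "UTF-8");
    $toChars   = mb_str_split($to, 1, "UTF-8");

    // Mirror strtr(): ignore the surplus characters of the longer list.
    $n = min(count($fromChars), count($toChars));
    if ($n === 0) {
        return $str;
    }

    // Build the array form: one whole character => one whole character.
    $pairs = array_combine(
        array_slice($fromChars, 0, $n),
        array_slice($toChars, 0, $n)
    );

    return strtr($str, $pairs);
}

// UTF-8 -äåö- with ä => x, å => y, ö => z
echo utf8_strtr("-\xc3\xa4\xc3\xa5\xc3\xb6-", "\xc3\xa4\xc3\xa5\xc3\xb6", "xyz"), "\n"; // -xyz-
```

Because the keys are complete UTF-8 characters, the array form of strtr never replaces half of a character.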

If the encodings do not match, you have to convert them with iconv:

header("Content-Type: text/plain");
$str = "\x2d\xe4\xe5\xf6\x2d"; // ISO-8859-1 -äåö-
$str = iconv("ISO-8859-1", "UTF-8", $str);
$replace_pairs = array(
    "ä" /* UTF-8 \xc3\xa4 */ => "x",
    "å" /* UTF-8 \xc3\xa5 */ => "x",
    "ö" /* UTF-8 \xc3\xb6 */ => "x"
);
print strtr($str, $replace_pairs); // -xxx-
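If you do not know in advance which of the two encodings you will receive, you can check first and only convert when needed. This is a rough sketch: the helper name to_utf8 is made up, it needs the mbstring extension, and it assumes that anything that is not valid UTF-8 is ISO-8859-1, which may not hold for your data:

```php
<?php
// Hypothetical helper: returns $str as UTF-8, ASSUMING that any input which
// is not valid UTF-8 is ISO-8859-1 (adjust this to your actual data source).
function to_utf8(string $str): string
{
    if (mb_check_encoding($str, "UTF-8")) {
        return $str; // already valid UTF-8, leave untouched
    }
    return iconv("ISO-8859-1", "UTF-8", $str);
}

$latin1 = "\x2d\xe4\xe5\xf6\x2d";             // ISO-8859-1: -äåö-
$utf8   = "\x2d\xc3\xa4\xc3\xa5\xc3\xb6\x2d"; // UTF-8:      -äåö-

var_dump(to_utf8($latin1) === $utf8); // bool(true)
var_dump(to_utf8($utf8) === $utf8);   // bool(true)
```

Note that a validity check is only a heuristic: some ISO-8859-1 byte sequences also happen to be valid UTF-8.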

The dump function:

// outputs the hexvalue for each char for the given encoding
function dump($data, $encoding) {
    for($i = 0, $len = iconv_strlen($data, $encoding); $i < $len; ++$i) {
        $char = iconv_substr($data, $i, 1, $encoding);
        printf("%s: %s\n", $char, bin2hex($char));
    }
    printf("length (encoding: %s): %d\n", $encoding, $len);
    printf("length in octets (8-bit-byte): %d\n\n", strlen($data));
}

score 1 · Accepted Answer

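It sounds like you may have competing encodings. If your browser is submitting UTF-8 but your file is saved in (for example) ISO-8859-1, your characters will not match and the translation will fail. Also, looking at the documentation page, there are several comments that suggest calling utf8_decode() on your input string first. utf8_decode() by itself may do what you want.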

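UTF-8 is a multi-byte encoding (really a variable-length encoding). Characters such as ÷ or ï, whose Unicode code points are above 127, have to be encoded as two or more bytes, each with a value of 128 or higher, to identify the character. I suspect you will have to learn more about Unicode. There is another explanation at utf8_encode.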
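You can check the byte counts yourself. A small sketch, with the characters written as byte escapes so the encoding of the source file does not matter (mb_strlen needs the mbstring extension; the euro sign is only added here as a three-byte example):

```php
<?php
$chars = [
    "U+0041" => "A",            // ASCII letter A
    "U+00EF" => "\xc3\xaf",     // ï
    "U+00F7" => "\xc3\xb7",     // ÷
    "U+20AC" => "\xe2\x82\xac", // euro sign
];

foreach ($chars as $codePoint => $bytes) {
    // strlen() counts bytes, mb_strlen() counts characters.
    printf(
        "%s: %d byte(s), %d character(s), hex %s\n",
        $codePoint,
        strlen($bytes),
        mb_strlen($bytes, "UTF-8"),
        bin2hex($bytes)
    );
}
```

Only the ASCII letter takes a single byte; every byte of the other characters is 0x80 or higher.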

Edit: It has been a while since I wrestled with encodings. You should look at iconv() for more general re-encoding.
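As a rough sketch of that idea: convert the UTF-8 input to ISO-8859-1 with iconv() and then use the three-argument strtr on the single-byte string. This assumes every character in the input exists in ISO-8859-1; iconv() fails on characters that do not. Also note that utf8_decode() is deprecated as of PHP 8.2, which is another reason to prefer iconv() here.

```php
<?php
$utf8 = "\x2d\xc3\xa4\xc3\xa5\xc3\xb6\x2d"; // UTF-8: -äåö-

// Re-encode to a single-byte encoding (assumes all characters fit into it).
$latin1 = iconv("UTF-8", "ISO-8859-1", $utf8);

// Every character is now exactly one byte, so the three-argument
// form of strtr() maps whole characters.
$result = strtr($latin1, "\xe4\xe5\xf6", "xxx");
echo $result, "\n"; // -xxx-
```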

score 1 · Accepted Answer

Have you tried mb_strstr: http://php.net/manual/en/function.mb-strstr.php

This function supports multi-byte character encodings.
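For reference, here is a short sketch of what mb_strstr does (it needs the mbstring extension). Note that it searches for a substring and returns part of the string; it does not translate characters the way strtr does:

```php
<?php
$str    = "-\xc3\xa4\xc3\xa5\xc3\xb6-"; // UTF-8: -äåö-
$needle = "\xc3\xa5";                   // UTF-8: å

// Returns the part of $str starting at the first match ...
$tail = mb_strstr($str, $needle, false, "UTF-8");
echo bin2hex($tail), "\n"; // c3a5c3b62d (åö-)

// ... or, with $before_needle = true, the part before the match.
$head = mb_strstr($str, $needle, true, "UTF-8");
echo bin2hex($head), "\n"; // 2dc3a4 (-ä)
```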
