php - PHP substr() function error

Question

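When I use substr(), I get a strange character at the end:

$articleText = substr($articleText,0,500);

I get 500 characters of output followed by a � <--

How can I fix this? Is it an encoding problem? My language is Greek.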

score 59 · Accepted Answer

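substr counts bytes, not characters.

Greek probably means you are using a multibyte encoding such as UTF-8, and counting by bytes does not work well with those: a cut can land in the middle of a character.

Using mb_substr here might help: the mb_* functions were created specifically for multibyte encodings.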
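
To make the byte-versus-character difference concrete, here is a minimal sketch. It assumes the mbstring extension is loaded and the source file is saved as UTF-8; the variable names are only illustrative.

```php
<?php
// Assumes mbstring is available and this file is saved as UTF-8.
$word = "αβγ"; // three Greek letters, two bytes each in UTF-8

$byteLength = strlen($word);             // counts bytes
$charLength = mb_strlen($word, 'UTF-8'); // counts characters

// Cutting after 3 bytes keeps "α" plus only the first byte of "β".
$byteCut = substr($word, 0, 3);
// Cutting after 3 characters keeps the whole string.
$charCut = mb_substr($word, 0, 3, 'UTF-8');

var_dump($byteLength, $charLength);
// Is the byte-based cut still valid UTF-8?
var_dump(mb_check_encoding($byteCut, 'UTF-8'));
var_dump($charCut === $word);
```

The dangling half character left by the byte-based cut is what typically gets rendered as �.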

score 20 · Accepted Answer

Use mb_substr instead. It can handle many encodings, unlike substr, which is only safe for single-byte strings:

$articleText = mb_substr($articleText,0,500,'UTF-8');
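
If the goal is an article excerpt, the call above can be wrapped in a small helper. This is only a sketch: truncate_text and its suffix parameter are hypothetical names, not part of PHP, and it assumes UTF-8 input with mbstring loaded.

```php
<?php
// Hypothetical helper (not a PHP built-in): cut $text to at most
// $limit characters and mark the cut with $suffix.
function truncate_text(string $text, int $limit, string $suffix = '...'): string
{
    // Nothing to cut if the text already fits.
    if (mb_strlen($text, 'UTF-8') <= $limit) {
        return $text;
    }
    // Cut on a character boundary, then append the marker.
    return mb_substr($text, 0, $limit, 'UTF-8') . $suffix;
}

echo truncate_text("αβγδε", 3), "\n";
echo truncate_text("αβγ", 3), "\n";
```

Passing the encoding explicitly avoids depending on whatever mb_internal_encoding() happens to be set to.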

score 6 · Accepted Answer

It looks like you are cutting a Unicode character in half. Use mb_substr instead for Unicode-safe string slicing.

score 1 · Accepted Answer

An alternative solution for UTF-8 encoded strings: convert the UTF-8 text to single-byte characters (ISO-8859-1) before cutting the substring.

$articleText = substr(utf8_decode($articleText),0,500);

To get the articleText string back as UTF-8, an extra step is needed:

$articleText = utf8_encode( substr(utf8_decode($articleText),0,500) );
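
One caveat for Greek text: utf8_decode targets ISO-8859-1, which has no Greek letters, so those characters come back as ?. Both utf8_decode and utf8_encode are also deprecated as of PHP 8.2. A sketch of the same convert, cut, convert-back idea using a Greek single-byte charset follows. It assumes mbstring is loaded with ISO-8859-7 support and that the text contains only characters that exist in ISO-8859-7.

```php
<?php
// Assumption: every character in the text exists in ISO-8859-7.
$articleText = "αβγδε";

// UTF-8 -> one byte per character.
$singleByte = mb_convert_encoding($articleText, 'ISO-8859-7', 'UTF-8');

// Byte-based substr is now safe, because 1 byte == 1 character.
$cut = substr($singleByte, 0, 3);

// Back to UTF-8 for output.
$result = mb_convert_encoding($cut, 'UTF-8', 'ISO-8859-7');

echo $result, "\n";
```

For mixed-language content, calling mb_substr directly is likely the simpler route, since no single-byte charset covers every character.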

score 1 · Accepted Answer

Use this function; it worked for me:

function substr_unicode($str, $s, $l = null) {
    return join("", array_slice(
        preg_split("//u", $str, -1, PREG_SPLIT_NO_EMPTY), $s, $l));
}
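
A short usage sketch for this function, assuming UTF-8 input. The definition is repeated here only so the snippet runs on its own; the sample string is illustrative.

```php
<?php
// Repeated from the answer so this snippet is self-contained.
function substr_unicode($str, $s, $l = null) {
    // The /u flag makes preg_split break on UTF-8 characters, not bytes.
    return join("", array_slice(
        preg_split("//u", $str, -1, PREG_SPLIT_NO_EMPTY), $s, $l));
}

$text = "αβγδε";

$middle = substr_unicode($text, 1, 3); // three characters starting at index 1
$tail   = substr_unicode($text, 2);    // from index 2 to the end
$last2  = substr_unicode($text, -2);   // negative offsets count from the end

echo $middle, "\n", $tail, "\n", $last2, "\n";
```

Because it builds an array of every character first, this approach uses more memory than mb_substr on long strings, but it does not need the mbstring extension.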

score 0 · Accepted Answer

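mb_substr() also works very well for removing strange trailing line breaks, which gave me trouble after parsing HTML code. The problem was not handled by:

 trim()

or:

 var_dump(preg_match('/^\n|\n$/', $variable));

or:

str_replace (array('\r\n', '\n', '\r'), ' ', $text)

None of these catch it.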

score 0 · Accepted Answer

You are trying to cut through a Unicode character, so I would suggest using mb_substr() in PHP instead of substr().

substr()

substr ( string $string , int $start [, int $length ] )

mb_substr()

mb_substr ( string $str , int $start [, int $length [, string $encoding ]] )

More information on substr() - Credits => Check Here
