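I'm trying to use the mb_* functions to find a space in a string. This works for Latin characters but not for Chinese. I tried converting the text to UTF-8 with utf8_encode and iconv, because other threads suggested the encoding might be the problem.

The function that misbehaves is mb_strpos: it returns nothing for Chinese text, but returns a valid int for English and other Latin-based text.

I'm not very strong on encodings, but I assume a difference in encoding is causing this. Where should I look for help, given that nothing seems to be wrong with the PHP functions themselves?

This works for both English and Chinese up to the mb_strpos call: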

// TEST THAT DID NOT WORK
//$text=iconv('ISO-8859-1','utf-8',$text);//that's NOT a solution!
//$text=utf8_encode($text);//that's NOT a solution!

// Set vars
$len = 150;


// The next two lines work for both English AND Chinese:
// get a substring of at most $len characters, then get its length, both multibyte-aware
$text_cropped = mb_substr($text,0,$len,'UTF-8'); // works for English AND Chinese
$string_cropped_length = mb_strlen ($text_cropped,'UTF-8');

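// mb_strpos only returns a position for English, not for Chinese
// intent: find the last space within the length (multibyte-aware);
// as written, this searches for the first space at or after the crop length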
$last_space = mb_strpos ( $text , ' ', $string_cropped_length, 'UTF-8');

// Hack: only use $last_space if mb_strpos actually found something.
// I suspect mb_strpos misbehaves depending on the PHP version; it may return an empty value for Chinese characters.

// Workaround until the PHP upgrade: test the value of $last_space with an if-else
if(intval($last_space) > 0) {
    $text_cropped_final = mb_substr($text,0,$last_space,'UTF-8');
} else {
    $text_cropped_final = $text_cropped;
}

return $text_cropped_final . '...';
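
One thing that may be worth checking before blaming the encoding: mb_strpos returns false when the needle is not found, and Chinese text often contains no ASCII space (U+0020) at all, so an "empty" result can simply mean "no space in this string". Below is a minimal sketch of the cropping logic under that assumption. The helper name `crop_at_space` is hypothetical and not part of the code above; it uses mb_strrpos on the cropped text to look for the last space, and, unlike the snippet above, it skips the '...' suffix when nothing was cropped.

```php
<?php
// Hypothetical helper (an illustration, not the original function):
// crop $text to at most $len characters, cutting at the last space
// inside that window when one exists.
function crop_at_space(string $text, int $len = 150): string
{
    // Nothing to crop if the text already fits (assumption: no suffix then).
    if (mb_strlen($text, 'UTF-8') <= $len) {
        return $text;
    }

    $cropped = mb_substr($text, 0, $len, 'UTF-8');

    // mb_strrpos returns false when the needle is absent, which is the
    // usual case for Chinese text; compare with !== false explicitly.
    $lastSpace = mb_strrpos($cropped, ' ', 0, 'UTF-8');

    if ($lastSpace !== false && $lastSpace > 0) {
        $cropped = mb_substr($cropped, 0, $lastSpace, 'UTF-8');
    }

    return $cropped . '...';
}

var_dump(mb_strpos('你好世界', ' ', 0, 'UTF-8'));          // bool(false)
echo crop_at_space('The quick brown fox jumps', 12), "\n"; // The quick...
echo crop_at_space('敏捷的棕色狐狸跳过了懒狗', 5), "\n";       // 敏捷的棕色...
```

If the Chinese text uses the ideographic space (U+3000) as a separator, the needle would have to be that character instead; whether that applies here is a guess, since the actual input is not shown.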