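I'm trying to split a string character by character, but I'm running into trouble with special characters. I'm currently using the following code: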

<?php
$input = "Comment ça va?";
$array_input = str_split($input, 1);
print_r($array_input);
?>

This is the output:

Array (
[0] => C [1] => o [2] => m [3] => m [4] => e
[5] => n [6] => t [7] => [8] => � [9] => �
[10] => a [11] => [12] => v [13] => a [14] => ? )

I have the same problem with line breaks:

Input:
“Hé!
Oui?”

Output:

Array ( [0] => H [1] => � [2] => � [3] => ! [4] => 
[5] => [6] => O [7] => u [8] => i [9] => ? )

Does anyone have a solution to this problem? Many thanks.

2 Answers

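str_split has problems with Unicode strings: it splits on bytes, so a multibyte UTF-8 character such as ç is cut into two pieces.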
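One way to see the mismatch is to compare byte and character counts for the same string. This is only a minimal sketch, assuming the source file is saved as UTF-8 and the mbstring extension is available:

```php
<?php
// Assumption: this file is saved as UTF-8, so "ç" is stored as two bytes.
$input = "Comment ça va?";

// strlen() counts bytes; mb_strlen() counts characters in the given encoding.
$byteCount = strlen($input);
$charCount = mb_strlen($input, 'UTF-8');

// Hex dump of the single character "ç" shows its two UTF-8 bytes.
$cedillaHex = bin2hex("ç");

echo $byteCount, "\n";  // 15
echo $charCount, "\n";  // 14
echo $cedillaHex, "\n"; // c3a7
```

The one-byte difference between the two counts is the extra byte of ç, which is why str_split yields 15 elements with two unprintable halves.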

You can use preg_split with the u modifier instead.

For example:

$input = "Comment ça va?";
$letters1 = str_split($input);
$letters2 = preg_split('//u', $input, -1, PREG_SPLIT_NO_EMPTY);

print_r($letters1);
print_r($letters2);

This will output:

Array ( [0] => C [1] => o [2] => m [3] => m [4] => e 
        [5] => n [6] => t [7] => [8] => � [9] => � 
        [10] => a [11] => [12] => v [13] => a [14] => ? ) 

Array ( [0] => C [1] => o [2] => m [3] => m [4] => e 
        [5] => n [6] => t [7] => [8] => ç [9] => a 
        [10] => [11] => v [12] => a [13] => ? ) 
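
The same call should also cover the line-break case from the question. A short sketch, assuming a UTF-8 source file and an input string that mirrors the two-line example; the false check is an added precaution, not part of the original answer:

```php
<?php
// Assumption: UTF-8 source file; the input mirrors the two-line example from the question.
$input = "Hé!\nOui?";

// The u modifier makes PCRE treat pattern and subject as UTF-8,
// so the empty pattern matches between characters rather than between bytes.
$letters = preg_split('//u', $input, -1, PREG_SPLIT_NO_EMPTY);

// preg_split() returns false on failure (for instance, if the subject is not valid UTF-8).
if ($letters === false) {
    $letters = [];
}

echo count($letters), "\n"; // 8
echo $letters[1], "\n";     // é
```

The newline is kept as its own one-character element (index 3 here), so it can be filtered out afterwards if it is not wanted.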
answered 2012-04-29T14:46:20.503

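This is because PHP's str_split function is not multibyte safe, i.e. it does not handle Unicode correctly. You can use this function instead, which is a multibyte-safe implementation of str_split: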

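# PHP 7.4+ already ships a native mb_str_split(); only define it when it is missing.
if (!function_exists('mb_str_split')) {
    function mb_str_split( $string ) {
        # Split at every position that is not after the start: ^
        # and not before the end: $
        return preg_split('/(?<!^)(?!$)/u', $string );
    }
}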

(Source: a user comment in the PHP documentation)
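
On PHP 7.4 and later, the mbstring extension provides a built-in mb_str_split(), so a hand-written helper may not be needed. The following is a sketch under that assumption; the wrapper name split_chars is hypothetical and only there to show a fallback for installations without the native function:

```php
<?php
// Hypothetical helper name; it is not part of PHP or of the answer above.
function split_chars(string $string): array
{
    // Assumption: PHP 7.4+ with mbstring provides a native mb_str_split().
    if (function_exists('mb_str_split')) {
        return mb_str_split($string, 1, 'UTF-8');
    }
    // Fallback: UTF-8 aware split using preg_split with the u modifier.
    $parts = preg_split('//u', $string, -1, PREG_SPLIT_NO_EMPTY);
    return $parts === false ? [] : $parts;
}

$letters = split_chars("Comment ça va?");
echo count($letters), "\n"; // 14
echo $letters[8], "\n";     // ç
```

Passing the encoding explicitly avoids depending on the configured internal encoding.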

answered 2012-04-29T14:48:09.490