
I have a file containing multiple lines of comma-separated Chinese words, like this:

你,我,他,好,但,中,国,龙
好,把,是,的,啊,人,吖,哦

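I want to load them into an array with the following code, so that I can later use the array to look up the Chinese words contained in an article: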

$ds = file($Dictionary);
$_SP_ = chr(0xFF).chr(0xFE);
$array = array();
foreach ($ds as $d)
{
    $spstr = $_SP_;
    $spstr = iconv('ucs-2be', 'utf-8', $spstr);
    $ws = explode(',', $d); // array of single Chinese words
    $wall = iconv('utf-8', 'ucs-2be', join($spstr, $ws)); // what is $wall used for?
    $ws = explode($_SP_, $wall);
    foreach ($ws as $estr)
    {
        $array[$estr] = strlen($estr);
    }
}

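My questions:

  1. What does $_SP_ = chr(0xFF).chr(0xFE) mean? As far as I can tell, chr(0xFF).chr(0xFE) is a two-byte string made of the byte values 255 and 254, at the top of the single-byte range. What is this combination for?

  2. Why should I convert the separator from UCS-2BE to UTF-8?

  3. Why is $ws joined back into a string, this time separated by the UTF-8 form of chr(0xFF).chr(0xFE)?

  4. Why does the code need the length of each word?

  5. Why is $spstr treated as UCS-2BE? Is it only because it is the combination chr(0xFF).chr(0xFE)?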
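To make the questions concrete, here is a minimal sketch of the bytes involved as I understand them. The variable names ($sep, $word, $other, $pieces) are hypothetical and chosen only for this illustration; it assumes PHP 7 or later with the iconv extension, and it is not taken from the code above.

```php
<?php
// Hypothetical names, for illustration only.

// The separator: two raw bytes, 0xFF followed by 0xFE.
$sep = chr(0xFF) . chr(0xFE);
echo bin2hex($sep), "\n"; // fffe

// One dictionary word, U+4F60, written as an escape so the example
// does not depend on the encoding of this source file.
$word = "\u{4F60}";
echo strlen($word), "\n"; // 3 (UTF-8 uses three bytes for this character)

// The same word converted to UCS-2BE takes two bytes.
$wordUcs2 = iconv('UTF-8', 'UCS-2BE', $word);
echo bin2hex($wordUcs2), "\n"; // 4f60
echo strlen($wordUcs2), "\n";  // 2

// Two UCS-2BE words joined by the raw separator can be split again
// with explode(), because the bytes ff fe occur only at the join.
$other  = iconv('UTF-8', 'UCS-2BE', "\u{4E2D}"); // bytes 4e 2d
$pieces = explode($sep, $wordUcs2 . $sep . $other);
echo count($pieces), "\n"; // 2
```

If this reading is right, strlen() on each piece counts bytes rather than characters, so a one-character word yields 2. I have not confirmed that this is the reason the original code stores the length.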
