0

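I'm looking for help writing a script that takes two lists of phrases/words, compares them entry by entry, and works out which of the two is the correctly typed phrase/word.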

$arr1 = array('fbook', 'yahoo msngr', 'text me later', 'how r u');  
$arr2 = array('facebook', 'yahoo messenger', 'txt me l8r', 'how are you');

So it should walk through both arrays and compare the two values at each index. In the end, it should produce:

facebook
yahoo messenger
text me later
how are you

Any help is appreciated!


5 Answers

1

There's no way to "guess" which spelling is correct; you need a knowledge base (i.e., a dictionary).

This dictionary can be implemented using pspell (aspell), as @Dominic mentioned, or you can use your own array as the dictionary.

If you use an array as the dictionary, you can apply the Levenshtein algorithm, which PHP provides as the built-in levenshtein() function, to calculate the distance between two words (i.e., your word and a reference word). Iterate over the reference array to find the word(s) with the smallest distance from the one you're checking; those are the best candidates to suggest as a correction. If the distance is 0, the word being checked is already correct.
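As a rough sketch of the dictionary-array approach described above, the following loops over a reference array and keeps the entry with the smallest levenshtein() distance. The helper name closest_word and the sample $dictionary are assumptions made for illustration; they do not come from the question.

```php
<?php
// Hypothetical helper (not from the original answer): return the
// dictionary entry with the smallest Levenshtein distance to $word,
// or null if the dictionary is empty.
function closest_word(string $word, array $dictionary): ?string
{
    $best = null;
    $best_distance = PHP_INT_MAX;

    foreach ($dictionary as $candidate) {
        $distance = levenshtein($word, $candidate);

        // A distance of 0 means the word is already spelled correctly.
        if ($distance === 0) {
            return $candidate;
        }

        if ($distance < $best_distance) {
            $best = $candidate;
            $best_distance = $distance;
        }
    }

    return $best;
}

// Assumed reference list, built from the phrases in the question.
$dictionary = array('facebook', 'yahoo messenger', 'text me later', 'how are you');

echo closest_word('fbook', $dictionary), "\n";
```

This only suggests the nearest entry; whether that entry is really what the user meant still depends on how complete the dictionary is.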

answered 2009-11-05T00:49:34.617
1

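If your input is fairly simple, pspell is installed, and both arrays are the same size:

For each index in the two arrays, you can explode each phrase on spaces, run pspell_check on every word, and keep whichever phrase has the higher percentage of words for which pspell_check returns true.

Sample code to get you started: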

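// Open an English dictionary; requires the pspell extension.
$pspell = pspell_new("en");

// Percentage of words in $phrase that pspell accepts as correctly spelled.
function percentage_of_good_words($pspell, $phrase) {
  $words = explode(" ", $phrase);
  $num_good = 0;
  $num_total = count($words);

  if ($num_total == 0) return 0;

  foreach ($words as $word) {
    if (pspell_check($pspell, $word)) {
      $num_good++;
    }
  }

  return ($num_good / $num_total) * 100;
}

$length = count($arr1);
$kept = array();
for ($i = 0; $i < $length; $i++) {
   $percent_from_arr1 = percentage_of_good_words($pspell, $arr1[$i]);
   $percent_from_arr2 = percentage_of_good_words($pspell, $arr2[$i]);
   // Keep the phrase with more correctly spelled words (ties go to $arr2).
   $kept[$i] = $percent_from_arr1 > $percent_from_arr2 ? $arr1[$i] : $arr2[$i];
}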
answered 2009-11-04T22:50:06.113
0

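You need to define some rules for handling these words. Going by your example, you would need a regular expression, and you want the longer of the two keywords, but there are cases where picking the longer one won't work.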
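To make the "prefer the longer one" rule concrete, here is a minimal sketch that keeps the longer phrase at each index. The helper name keep_longer is hypothetical, and the rule is only a heuristic: it happens to fit the sample data from the question, not spelling in general.

```php
<?php
$arr1 = array('fbook', 'yahoo msngr', 'text me later', 'how r u');
$arr2 = array('facebook', 'yahoo messenger', 'txt me l8r', 'how are you');

// Hypothetical helper: at each index keep the longer of the two phrases.
// Assumes both arrays have the same keys.
function keep_longer(array $a, array $b): array
{
    $kept = array();
    foreach ($a as $i => $phrase) {
        $kept[$i] = strlen($phrase) >= strlen($b[$i]) ? $phrase : $b[$i];
    }
    return $kept;
}

echo implode("\n", keep_longer($arr1, $arr2)), "\n";
```

The rule breaks as soon as a misspelling is longer than the correct form: given 'hello' and 'helllo', it would keep 'helllo'.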

answered 2009-11-04T22:49:51.790
0

If you have an array that you know is correct, it's very easy to do the following:

foreach ($correct_array as $num => $word) {
    // Compare each known-good entry with the entry at the same index.
    if ($word == $tested_array[$num]) {
        echo "this is correct: " . $word . "<br />";
    } else {
        echo "this is incorrectly spelled: " . $tested_array[$num] . "<br />";
    }
}
answered 2009-11-04T22:50:45.753
0

If all you need to do is make sure the spelling is correct, you can use in_array, like this:

foreach ($arr2 as $val) {
    if (in_array($val, $arr1)) {
        // spelled properly
    } else {
        // spelled incorrectly
    }
}

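If you actually want to auto-correct them, you would probably need a far more sophisticated algorithm, plus a database that stores all the likely misspellings.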
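In its simplest form, that stored list of misspellings could be a lookup table. The sketch below uses a plain PHP array in place of a database; the $corrections table and the autocorrect helper are hypothetical names, filled in with the phrases from the question.

```php
<?php
// Hypothetical lookup table: known misspelling => correction.
$corrections = array(
    'fbook'       => 'facebook',
    'yahoo msngr' => 'yahoo messenger',
    'txt me l8r'  => 'text me later',
    'how r u'     => 'how are you',
);

// Return the stored correction, or the phrase unchanged if it is not listed.
function autocorrect(string $phrase, array $corrections): string
{
    return $corrections[$phrase] ?? $phrase;
}

echo autocorrect('txt me l8r', $corrections), "\n";
```

A table like this only fixes misspellings it has already seen, which is why a real auto-corrector needs something more sophisticated behind it.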

answered 2009-11-04T22:52:24.580