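Here's my take. It's a bit long because I've included all the comments and explained everything:
Cleaning the string is optional, but highly recommended. It prevents glitches such as "today" being counted separately from "today,".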
<?php
$str = "Hello friend, Hello good good today!";
$count = array();
/*
* Remove all common special characters (so that "today" is equal to "today,").
* Also lowercase the entire string, so that "Hello" is equal to "hello".
*/
$str = preg_replace("/[.,!?:(){}\[\]@#$%\^&\*\-_]/", " ", $str);
$str = strtolower($str);
/*
* Split on whitespace.
* preg_split is used instead of explode because there can be several spaces in a row,
* and we don't want empty array elements.
*/
$words = preg_split("/\s+/", $str, -1, PREG_SPLIT_NO_EMPTY);
/*
* Iterate the words...
*/
foreach ($words as $word) {
    /*
    * If this is the first time we encounter the word...
    */
    if (!isset($count[$word])) {
        /*
        * Set its count to one, and skip the rest of the loop.
        */
        $count[$word] = 1;
        continue;
    }
    /*
    * Increase the count of the word by one. This line isn't reached on the first
    * encounter, so it only runs if we have already met the word.
    */
    $count[$word]++;
}
/*
* Sort in descending order by count, keeping the associative keys.
*/
arsort($count);
/*
* Show me the money!
*/
var_dump($count);
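To see why the cleanup step matters, here is a minimal sketch of what happens without it. The sample sentence and the variable names `$raw` and `$naive` are my own, picked only to show the problem:

```php
<?php
// Hypothetical sample input, chosen so the same word appears with different punctuation.
$raw = "Today is good, but today, today!";

// Naive approach: split on single spaces with no cleanup and no lowercasing.
$naive = array_count_values(explode(" ", $raw));

// "Today", "today," and "today!" end up as three separate keys, each with a count of 1.
var_dump($naive);
```

After the `preg_replace` and `strtolower` steps above, all three would collapse into a single `today` key.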
A shorter version, using PHP's native functions:
$str = "Hello friend, Hello good good today!";
//Import words into array
$words = str_word_count($str, 1);
//Count same values
$count = array_count_values($words);
//Descending sort by count, keys preserved
arsort($count);
var_dump($count);
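Note that the short version does not lowercase the input, so "Hello" and "hello" would be counted separately. A possible variant that folds case first is sketched below. The function name `count_words_ci` is hypothetical, and the type declarations assume PHP 7 or later:

```php
<?php
// Hypothetical helper; not part of the original answer.
function count_words_ci(string $text): array
{
    // Lowercase first so "Hello" and "hello" share one key,
    // then let str_word_count (format 1) return the words as an array.
    $words = str_word_count(strtolower($text), 1);

    // Count how often each word occurs.
    $count = array_count_values($words);

    // Highest counts first, associative keys preserved.
    arsort($count);

    return $count;
}

var_dump(count_words_ci("Hello friend, hello good good today!"));
```

Keep in mind that `str_word_count` is locale dependent and treats `'` and `-` as part of a word, so it may split text differently from the regex-based version.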