3

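I built an array from a file and then parse that file's contents. I already filter out words shorter than 4 characters with if(strlen($value) < 4): unset($content[$key]); endif;

My question: I want to remove common words from the array, but there are a lot of them. Rather than running these checks over and over against every array value, is there a more efficient way to do this?

Here is a sample of the code I'm currently using. The list could get huge, and I think there must be a better (more efficient) way.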

foreach ($content as $key=>$value) {
    if(strlen($value) < 4): unset($content[$key]); endif; 
    if($value == 'that'): unset($content[$key]); endif;
    if($value == 'have'): unset($content[$key]); endif;
    if($value == 'with'): unset($content[$key]); endif;
    if($value == 'this'): unset($content[$key]); endif;
    if($value == 'your'): unset($content[$key]); endif;
    if($value == 'will'): unset($content[$key]); endif;
    if($value == 'they'): unset($content[$key]); endif;
    if($value == 'from'): unset($content[$key]); endif;
    if($value == 'when'): unset($content[$key]); endif;
    if($value == 'then'): unset($content[$key]); endif;
    if($value == 'than'): unset($content[$key]); endif;
    if($value == 'into'): unset($content[$key]); endif;
}

7 Answers

2

Maybe this would be better:

$filter = array("that","have","with",...);

foreach ($content as $key=>$value) {
   if (in_array($value,$filter)){
      unset($content[$key]);
   }
}
answered 2012-08-04T23:01:58.563
2

Here's how I'd do it:

$excluded_words = array( 'that','have','with','this','your','will','they','from','when','then','than','into');
$replace = array_fill_keys($excluded_words,'');
echo str_replace(array_keys($replace),$replace,'some words that have to be with this your will they have from when then that into replaced');

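How it works: create an array filled with empty strings whose keys are the substrings you want to remove/replace. Then just call str_replace, passing the keys as the first argument and the array itself as the second. In this case the result is: some words to be replaced (with the extra spaces left behind where the words were removed). This code has been tested and works.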

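When dealing with an array, just implode it with some wacky delimiter (such as %@%@% or whatever), run str_replace on the lot, explode the lot again, and Bob's your uncle.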
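The implode / str_replace / explode round trip described above could look roughly like this. It is a minimal sketch, not the answerer's tested code: the sample data and variable names are made up for illustration, and it assumes the delimiter never occurs inside a word.

```php
<?php
// Hypothetical sample data; these names are illustrative only.
$excludedWords = array('that', 'have', 'with');
$content = array('some', 'words', 'that', 'have', 'meaning');

// Assumption: this delimiter never appears inside any word.
$delimiter = '%@%@%';

// Join the array into one string and blank out the excluded words in one call.
// Caveat: str_replace() matches substrings, so 'without' would become 'out'.
$joined = implode($delimiter, $content);
$cleaned = str_replace($excludedWords, '', $joined);

// Split it back. Removed words leave empty strings behind, so drop those
// and renumber the keys.
$result = array_values(array_filter(explode($delimiter, $cleaned), 'strlen'));

print_r($result);
```

Because of the substring caveat noted in the comment, this only behaves well when no kept word contains an excluded word.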


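When it comes to removing all words of fewer than 4 characters (which I forgot in my original answer), that's something regex is good at... I'd say something like preg_replace('/(\b|[^a-z])[a-z]{1,3}(\b|[^a-z])/i','$1$2',implode(',',$targetArray)); or similar.
You may want to test this one, since it's just off the top of my head and untested. But it should be enough to get you started.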
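As a starting point for that testing, here is a simplified variant of the idea. It is a sketch under assumptions, not the pattern from the answer: it relies only on \b word boundaries, joins with spaces instead of commas, and uses made-up sample input.

```php
<?php
// Hypothetical sample input.
$targetArray = array('the', 'quick', 'fox', 'ran', 'away');

// Join with spaces so \b can find the edges of each word.
$text = implode(' ', $targetArray);

// Remove every run of 1 to 3 ASCII letters that stands alone as a word.
$stripped = preg_replace('/\b[a-z]{1,3}\b/i', '', $text);

// Split on whitespace, discarding the empty pieces left by removed words.
$result = preg_split('/\s+/', $stripped, -1, PREG_SPLIT_NO_EMPTY);

print_r($result);
```

Words containing apostrophes or non-ASCII letters would need a different character class, since \b treats those characters as word edges.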

answered 2012-08-04T23:13:29.360
1

I'd probably do something like this:

$aCommonWords = array('that','have','with','this','yours','etc.....');

foreach($content as $key => $value){
    if(in_array($value,$aCommonWords)){
        unset($content[$key]);
    }
}
answered 2012-08-04T23:03:21.830
1

Create an array of the words to remove and check whether the value is in that array:

$excluded_words = array( 'that','have','with','this','your','will','they','from','when','then','than','into');

And inside the foreach:

if (in_array($value, $excluded_words)) unset($content[$key]);
answered 2012-08-04T23:04:06.503
0

Another possible solution:

$arr = array_flip(array( 'that', 'have', 'with', 'this', 'your', 'will', 
        'they', 'from', 'when', 'then', 'than', 'into' ));
foreach ($content as $key=>$value) {
    if(strlen($value) < 4 || isset($arr[$value])) {
        unset($content[$key]);
    }
}
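The loop above checks each word with isset() against flipped keys, which is a hash lookup, whereas in_array() scans the whole list on every call. One way to package that is sketched below; the function name remove_common_words, its parameters, and the sample data are hypothetical, chosen only for illustration.

```php
<?php
/**
 * Hypothetical helper: drops words shorter than $minLength
 * and any word that appears in $stopWords. Original keys are kept.
 */
function remove_common_words(array $words, array $stopWords, $minLength = 4)
{
    // Flip once so every stop word becomes an array key.
    $lookup = array_flip($stopWords);

    foreach ($words as $key => $word) {
        if (strlen($word) < $minLength || isset($lookup[$word])) {
            unset($words[$key]);
        }
    }

    return $words;
}

// Made-up sample data.
$content = array('here', 'are', 'some', 'words', 'that', 'will', 'be', 'filtered');
$result = remove_common_words($content, array('that', 'have', 'with', 'this', 'will'));

print_r($result);
```

The flip costs one pass over the stop-word list, so it should pay off mainly when that list is long or the function runs over many words.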
answered 2012-08-04T23:13:05.483
0

Use array_diff():

$content = array('here','are','some','words','that','will','be','filtered');
$filter = array('that','have','here','are','will','they','from','when','then');
$result = array_diff($content, $filter);

Result:

Array
(
    [2] => some
    [3] => words
    [6] => be
    [7] => filtered
)

Or, if you want more flexibility in what gets filtered (for example, you mentioned needing to filter out words shorter than 4 characters), you can use array_filter():

$result = array_filter($content, function($v) use ($filter) {
    return !in_array($v, $filter) && strlen($v) >= 4;
});

Result:

Array
(
    [2] => some
    [3] => words
    [7] => filtered
)
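Both results above keep the original keys (2, 3, 7), because array_diff() and array_filter() preserve them. If a consecutively numbered list is wanted, a sketch along these lines should work; it reuses the sample data from this answer, and $reindexed is a name introduced here for illustration.

```php
<?php
$content = array('here', 'are', 'some', 'words', 'that', 'will', 'be', 'filtered');
$filter = array('that', 'have', 'here', 'are', 'will', 'they', 'from', 'when', 'then');

// Same filter as above: keep words that are not listed and have 4+ characters.
$result = array_filter($content, function ($v) use ($filter) {
    return !in_array($v, $filter) && strlen($v) >= 4;
});

// array_filter() preserves keys; array_values() renumbers them from 0.
$reindexed = array_values($result);

print_r($reindexed);
```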
answered 2018-08-17T01:46:29.830
0
$var = array('abb', 'bffb', 'cbbb', 'dddd', 'dddd', 'f', 'g');
$var= array_unique($var);
foreach($var as $val){
    echo $val. " ";
}

Result:

abb bffb cbbb dddd f g

The simplest way.

answered 2018-08-17T01:58:33.027