I'm trying to get two-word phrases from an array, but I keep getting single-word phrases with a space before or after the word.

$text = preg_replace(array('/\s{2,}/', '/[\t\n]/'), ' ', $text);
$textarray = explode(" ", $text);
array_filter($textarray);
$textCount = array_count_values($textarray);
arsort($textCount);
$twoWords = array();
for ($i = 0; $i<=count($textarray); ++$i) {
    if ($textarray[$i]<>"  ") {
        $twoWords[$i] = $textarray[$i]." ".$textarray[$i + 1];
    } else {
        $twoWords[$i] = "blah";
    }
}
foreach ($twoWordsCount as $tey => $val2) {
    if ($tey == null || $tey == "" || $tey == "\r\n"  || $tey == "\t"  || $tey == "\r"  || $tey == "\n" || $tey == "&nbsp;" || $tey == "  " || $tey == " ") {
        //do nothing
    } else {
        echo $val2."\n";
    }
}

For some reason this only returns values such as a space followed by Hello, or Test followed by a space, but I want it to return hello test.

2 Answers

I'm not sure what the second half of the script is supposed to do, but the first part can be reduced to a single preg_split() call:

<?php
foreach( array('hello world', ' hello world', 'hello world ', '  hello     world     ') as $input ) {
    $w = preg_split('!\s+!', $input, -1, PREG_SPLIT_NO_EMPTY);
    var_dump($w);
}

prints

array(2) {
  [0]=>
  string(5) "hello"
  [1]=>
  string(5) "world"
}
array(2) {
  [0]=>
  string(5) "hello"
  [1]=>
  string(5) "world"
}
array(2) {
  [0]=>
  string(5) "hello"
  [1]=>
  string(5) "world"
}
array(2) {
  [0]=>
  string(5) "hello"
  [1]=>
  string(5) "world"
}

Edit: maybe you're looking for something like this

<?php
$input = getData();
$w = preg_split('![\s[:punct:]]+!', $input, -1, PREG_SPLIT_NO_EMPTY);
$w = array_count_values($w);
arsort($w);
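// CachingIterator stays one element behind its inner iterator, so the inner
// iterator's key()/current() give the element that follows the current one.
$ci = new CachingIterator( new ArrayIterator( $w ) );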
foreach( $ci as $next=>$cnt ) {
    printf("%s(%d) %s(%d)\n",
        $next, $cnt,
        $ci->getInnerIterator()->key(), $ci->getInnerIterator()->current()
    );
}


function getData() {
    return <<< eot
Mary had a little lamb,
whose fleece was white as snow.

And everywhere that Mary went,
the lamb was sure to go.

It followed her to school one day
which was against the rules.

It made the children laugh and play,
to see a lamb at school.

And so the teacher turned it out,
but still it lingered near,

And waited patiently about,
till Mary did appear.

"Why does the lamb love Mary so?"
the eager children cry.

"Why, Mary loves the lamb, you know."
 the teacher did reply.
eot;
}

which prints

the(8) lamb(5)
lamb(5) Mary(5)
Mary(5) was(3)
was(3) And(3)
And(3) to(3)
to(3) It(2)
It(2) school(2)
school(2) so(2)
so(2) Why(2)
Why(2) did(2)
did(2) it(2)
it(2) teacher(2)
teacher(2) children(2)
children(2) a(2)
a(2) waited(1)
waited(1) patiently(1)
patiently(1) about(1)
about(1) till(1)
[...]
white(1) went(1)
went(1) (0)

http://docs.php.net/class.cachingiterator
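
If the goal is to count the two-word phrases themselves (as the question's title suggests) rather than to pair words by frequency, the same preg_split() tokens can be joined pairwise before counting. The following is only a minimal sketch under that assumption; the function name countTwoWordPhrases and the sample input are made up for illustration and are not part of the original script.

```php
<?php
// Hypothetical helper (not from the question): counts adjacent word pairs.
function countTwoWordPhrases(string $text): array
{
    // Split on runs of whitespace. PREG_SPLIT_NO_EMPTY drops empty tokens,
    // so leading, trailing or repeated whitespace never yields a blank "word".
    $words = preg_split('/\s+/', $text, -1, PREG_SPLIT_NO_EMPTY);

    $phrases = [];
    // Stop one short of the end so $words[$i + 1] always exists.
    for ($i = 0, $n = count($words); $i < $n - 1; ++$i) {
        $phrases[] = $words[$i] . ' ' . $words[$i + 1];
    }

    $counts = array_count_values($phrases);
    arsort($counts); // most frequent phrase first
    return $counts;
}

// Made-up sample input with mixed spaces, a tab and a newline.
$counts = countTwoWordPhrases("  hello   test \n hello\ttest  again ");
foreach ($counts as $phrase => $count) {
    echo $phrase, ' (', $count, ")\n";
}
```

Because the loop condition is `$i < $n - 1`, the last word is never paired with a non-existent neighbour, which is one of the things that goes wrong in the question's `$i <= count($textarray)` loop.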

answered 2012-04-20T20:30:09.140

This extracts the first two words, no matter how many spaces, tabs, or newlines separate them:

$text = trim(preg_replace('/[\s\t\r\n]+/', ' ', $text));
$firstTwoWords = substr($text, 0, strpos($text, ' ', strpos($text, ' ') + 1));
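
One caveat: the substr()/strpos() version above relies on a space following the second word. When the text contains only one or two words, the search for that space fails and the result appears to come back empty. A possible variant that avoids this, sketched here with a hypothetical helper name firstTwoWords, splits the text into words and joins at most the first two:

```php
<?php
// Hypothetical helper (not from the original answer): returns up to the
// first two words of $text, joined by a single space.
function firstTwoWords(string $text): string
{
    // Split on any run of whitespace and discard empty pieces.
    $words = preg_split('/\s+/', $text, -1, PREG_SPLIT_NO_EMPTY);

    // array_slice() simply returns fewer items if fewer than two words exist.
    return implode(' ', array_slice($words, 0, 2));
}

echo firstTwoWords("  hello \t\n world  again"), "\n";
```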
answered 2012-04-20T20:39:57.577