php - Split text into paragraphs based on word count

Question

I want to split a block of text into paragraphs, where each paragraph is limited to a certain number of words. For example:

Contrary to popular belief, Lorem Ipsum is not simply random text. It has roots in a piece of classical Latin literature from 45 BC, making it over 2000 years old. Richard McClintock, a Latin professor at Hampden-Sydney College in Virginia, looked up one of the more obscure Latin words, consectetur, from a Lorem Ipsum passage, and going through the cites of the word in classical literature, discovered the undoubtable source. 

Paragraph 1: 
Contrary to popular belief, Lorem Ipsum is not simply random text. It has roots in a piece of classical Latin literature from 45 BC, making it over 2000 years old.

Paragraph 2: 
Richard McClintock, a Latin professor at Hampden-Sydney College in Virginia, looked up one of the more obscure Latin words, consectetur, from a Lorem Ipsum passage, and going through the cites of the word in classical literature, discovered the undoubtable source.

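In the example above, the words under the 20-word limit are in paragraph 1 and the rest are in paragraph 2.

Is there a way to do this with PHP?

I tried $abc = explode(' ', $str, 20); which stores 20 words in an array and then puts the remaining words in the last element, $abc['21']. How can I take the data from the first 20 elements as the first paragraph and the rest as the second paragraph?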
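As a side note on the attempt above: with a positive limit, explode() returns at most that many elements, and the last element holds the unsplit remainder of the string. The following is a minimal sketch of that behavior and of one way to cut after a fixed number of words; the sample string and the names $limit, $words, $first, and $second are illustrative assumptions, not part of the original question.

```php
<?php
// Hypothetical sample input and limit, chosen to keep the example short.
$str = 'one two three four five six seven';
$limit = 3;

// With a positive limit, explode() returns at most $limit elements;
// the last element contains the rest of the string, unsplit.
$abc = explode(' ', $str, $limit);
// $abc is ['one', 'two', 'three four five six seven']

// To cut after exactly $limit words, split fully and slice instead.
$words  = explode(' ', $str);
$first  = implode(' ', array_slice($words, 0, $limit));
$second = implode(' ', array_slice($words, $limit));

echo $first, PHP_EOL;   // one two three
echo $second, PHP_EOL;  // four five six seven
```

This cuts purely by word position, so a paragraph can end in the middle of a sentence.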

score 1 · Accepted Answer

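First split the string into sentences. Then loop over the array of sentences: append each sentence to the current paragraph, count the words in that paragraph, and if the count is greater than 19, increment the paragraph counter so the next sentence starts a new paragraph.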

$string = 'Contrary to popular belief, Lorem Ipsum is not simply random text. It has roots in a piece of classical Latin literature from 45 BC, making it over 2000 years old. Richard McClintock, a Latin professor at Hampden-Sydney College in Virginia, looked up one of the more obscure Latin words, consectetur, from a Lorem Ipsum passage, and going through the cites of the word in classical literature, discovered the undoubtable source.';

// Split into sentences: break on whitespace that follows . ? ! or ;
// and is followed by an uppercase letter.
$sentences = preg_split('/(?<=[.?!;])\s+(?=\p{Lu})/', $string);

$ii = 0; // index of the paragraph currently being filled
$paragraphs = array();
foreach ( $sentences as $value ) {
    // Append the sentence to the current paragraph, separated by a space.
    if ( isset($paragraphs[$ii]) ) { $paragraphs[$ii] .= ' ' . $value; }
    else { $paragraphs[$ii] = $value; }
    // Once the paragraph holds 20 or more words, move on to the next one.
    if ( 19 < str_word_count($paragraphs[$ii]) ) {
        $ii++;
    }
}
print_r($paragraphs);

Output:

Array
(
    [0] => Contrary to popular belief, Lorem Ipsum is not simply random text. It has roots in a piece of classical Latin literature from 45 BC, making it over 2000 years old.
    [1] => Richard McClintock, a Latin professor at Hampden-Sydney College in Virginia, looked up one of the more obscure Latin words, consectetur, from a Lorem Ipsum passage, and going through the cites of the word in classical literature, discovered the undoubtable source.
)
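One detail worth knowing: by default str_word_count() only counts alphabetic words (optionally containing ' and -), so tokens such as 45 and 2000 are not counted. If numbers should count as words, the same loop can be sketched as a function that counts whitespace-separated tokens instead. The function name splitIntoParagraphs and the $minWords parameter are hypothetical, introduced here only for illustration.

```php
<?php
/**
 * Hypothetical helper: group sentences into paragraphs, closing a paragraph
 * once it holds at least $minWords whitespace-separated tokens.
 */
function splitIntoParagraphs(string $text, int $minWords = 20): array
{
    // Same sentence splitter as in the answer above.
    $sentences = preg_split('/(?<=[.?!;])\s+(?=\p{Lu})/', trim($text));

    $paragraphs = [];
    $current = [];   // sentences collected for the paragraph being built
    $count = 0;      // tokens in the paragraph being built

    foreach ($sentences as $sentence) {
        $current[] = $sentence;
        // Count whitespace-separated tokens, so "45" and "2000" count too.
        $count += count(preg_split('/\s+/', $sentence, -1, PREG_SPLIT_NO_EMPTY));

        if ($count >= $minWords) {
            $paragraphs[] = implode(' ', $current);
            $current = [];
            $count = 0;
        }
    }

    // Keep any trailing sentences that never reached the threshold.
    if ($current !== []) {
        $paragraphs[] = implode(' ', $current);
    }

    return $paragraphs;
}

$text = 'One two three. Four five six seven. Eight nine.';
print_r(splitIntoParagraphs($text, 5));
```

With a threshold of 5, the first two sentences are merged into one paragraph and the last sentence becomes a paragraph of its own. Like the answer's loop, this never breaks a sentence, so a paragraph can exceed the threshold by up to one sentence.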

The sentence splitter was found here: Splitting paragraphs into sentences with regexp and PHP
