30

I'm trying to use JavaScript's split to get the sentences out of a string while keeping the delimiters, e.g. . ! ?

So far I have

sentences = text.split(/[\\.!?]/);

which works, but does not include the ending punctuation (.!?) of each sentence.

Does anyone know a way to do this?

4

5 Answers

65

You need to use match rather than split.

Try this.

var str = "I like turtles. Do you? Awesome! hahaha. lol!!! What's going on????";
var result = str.match( /[^\.!\?]+[\.!\?]+/g );

var expect = ["I like turtles.", " Do you?", " Awesome!", " hahaha.", " lol!!!", " What's going on????"];
console.log( result.join(" ") === expect.join(" ") )
console.log( result.length === 6);
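
A minimal sketch of how the same pattern could be wrapped for reuse. The helper name getSentences is hypothetical and not part of the answer. It assumes you want each sentence without the leading space the pattern leaves behind, and it guards against match returning null when the text contains no terminal punctuation at all.

```javascript
// Hypothetical helper (not from the answer): wraps the match-based approach.
function getSentences(text) {
  // match() returns null when nothing matches, so fall back to an empty array.
  const matches = text.match(/[^.!?]+[.!?]+/g) || [];
  // Each match keeps the whitespace that followed the previous sentence; trim it.
  return matches.map((sentence) => sentence.trim());
}

console.log(getSentences("I like turtles. Do you? Awesome!"));
// → ["I like turtles.", "Do you?", "Awesome!"]
console.log(getSentences("no terminal punctuation"));
// → []
```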
answered 2012-08-01T14:43:34.433
11

Here is a small addition to Larry's answer that also matches parenthetical sentences:

text.match(/\(?[^\.\?\!]+[\.!\?]\)?/g);

Applied to:

text = "If he's restin', I'll wake him up! (Shouts at the cage.) " +
       "'Ello, Mister Polly Parrot! (Owner hits the cage.) There, he moved!!!";

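gives (only the first of the three closing exclamation marks is kept, and the other two are dropped, because the pattern ends with a single [\.!\?]):

["If he's restin', I'll wake him up!", " (Shouts at the cage.)", 
" 'Ello, Mister Polly Parrot!", " (Owner hits the cage.)", " There, he moved!"]
answered 2014-01-10T00:30:29.003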
6

Try this instead:

sentences = text.split(/[\\.!\?]/);

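? is a special character in regular expressions, so it is escaped here; inside a character class the escape is optional, so this pattern behaves the same as the one in the question.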

Sorry, I misread your question. If you want to keep the delimiters, you need to use match rather than split; see this question.

answered 2012-08-01T14:38:38.057
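
For completeness, split can also be made to keep delimiters if the delimiter is wrapped in a capturing group, because captured text is then included in the returned array. The sketch below is only an illustration of that behavior. The helper name splitKeepingDelimiters is hypothetical, and it uses [.!?]+ (with no backslash in the class) as an assumption about what should count as a sentence ending.

```javascript
// Hypothetical helper: split on a captured delimiter, then re-attach it.
function splitKeepingDelimiters(text) {
  // The capturing group makes split() return the delimiters as separate items:
  // [sentence, delimiter, sentence, delimiter, ..., remainder]
  const parts = text.split(/([.!?]+)/);
  const sentences = [];
  for (let i = 0; i < parts.length; i += 2) {
    // Glue each piece of text back onto the delimiter that followed it.
    const sentence = parts[i] + (parts[i + 1] || "");
    // Skip the empty remainder produced when the text ends with a delimiter.
    if (sentence.trim() !== "") sentences.push(sentence);
  }
  return sentences;
}

console.log(splitKeepingDelimiters("I like turtles. Do you? Awesome!"));
// → ["I like turtles.", " Do you?", " Awesome!"]
```

The match-based answers are shorter; this variant mainly shows why a plain split drops the punctuation and what a capturing group changes.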
3

A slight improvement on mircealungu's answer:

string.match(/[^.?!]+[.!?]+[\])'"`’”]*/g);
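  • No opening parenthesis is required at the start.
  • Runs of punctuation such as '...', '!!!' and '!?' are kept with the sentence.
  • Any number of closing square brackets and closing parentheses are included. [Edit: added different closing quotes]
answered 2019-04-07T04:38:36.087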
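
A short sketch of what the points above look like in practice. The wrapper name splitWithClosers and the sample text are made up for illustration; the regular expression is the one from this answer.

```javascript
// Hypothetical wrapper around the pattern from this answer.
function splitWithClosers(text) {
  // Sentence body, then one or more terminators, then any closing brackets/quotes.
  return text.match(/[^.?!]+[.!?]+[\])'"`’”]*/g) || [];
}

console.log(splitWithClosers('She yelled "Stop!" Then silence... (Really?) What!?'));
// → ['She yelled "Stop!"', ' Then silence...', ' (Really?)', ' What!?']
```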
3

Improving on Mia's answer, here is a version that also includes a final sentence with no ending punctuation:

string.match(/[^.?!]+[.!?]+[\])'"`’”]*|.+/g)
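
A minimal sketch of the effect of the trailing |.+ alternative. The function name splitWithTail and the sample text are hypothetical. It assumes the text contains no line breaks, since . does not match them unless the s flag is set.

```javascript
// Hypothetical wrapper around the pattern from this answer.
function splitWithTail(text) {
  // First alternative: a punctuated sentence plus closing brackets/quotes.
  // Second alternative (.+): whatever is left over without terminal punctuation.
  return text.match(/[^.?!]+[.!?]+[\])'"`’”]*|.+/g) || [];
}

console.log(splitWithTail("First sentence. Second one? And a trailing fragment"));
// → ["First sentence.", " Second one?", " And a trailing fragment"]
```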
answered 2020-06-22T18:13:08.807