I have a question about the Sentence Splitter module in GATE. My text looks like this:

Social history. He drank a lot in his young age. He did
not attend a school. He was depressed of his condition.

We are certain the sentences should be:

Sentence 1: Social history.
Sentence 2: He drank a lot in his young age.
Sentence 3: He did not attend a school.
Sentence 4: He was depressed of his condition.

The ANNIE Sentence Splitter, however, treats text on different lines as belonging to different sentences, so the result is:

Sentence 1: Social history.
Sentence 2: He drank a lot in his young age.
Sentence 3: He did 
Sentence 4: not attend a school.
Sentence 5: He was depressed of his condition.

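That happens because the sentence is broken across multiple lines. Is there a way to tell the sentence splitter that a sentence may span more than one line? Or is there a better way to identify sentences in this kind of text?

Thanks :)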

1 Answer

Try using the RegEx Sentence Splitter instead of the ANNIE one.

With the ANNIE Sentence Splitter, there is a parameter, TransducerURL, which by default points to something like:

/PATH-TO-GATE/plugins/ANNIE/resources/sentenceSplitter/grammar/main-single-nl.jape

In the same folder there is another JAPE file:

/PATH-TO-GATE/plugins/ANNIE/resources/sentenceSplitter/grammar/main.jape

If you change the parameter to point to that file, it should work.
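If changing the splitter configuration is not an option, another approach is to pre-process the text so that hard line wraps are removed before the document is loaded into GATE. The following is only a minimal sketch in plain Python, not part of GATE; the function name `unwrap_lines` is made up for illustration, and it assumes that a single newline is a line wrap while a blank line marks a real paragraph break.

```python
import re


def unwrap_lines(text: str) -> str:
    """Join hard-wrapped lines (hypothetical helper, not a GATE API).

    Assumption: a single newline is just a line wrap, while an empty
    line (two consecutive newlines) is a paragraph break to keep.
    """
    # Normalise Windows line endings so only "\n" has to be handled.
    text = text.replace("\r\n", "\n")
    # Replace a lone newline, plus any spaces or tabs around it,
    # with a single space. Newlines next to another newline are kept.
    return re.sub(r"(?<!\n)[ \t]*\n[ \t]*(?!\n)", " ", text)


if __name__ == "__main__":
    sample = (
        "Social history. He drank a lot in his young age. He did\n"
        "not attend a school. He was depressed of his condition."
    )
    print(unwrap_lines(sample))
```

After this step the sentence "He did not attend a school." sits on one line, so a splitter that breaks on newlines no longer cuts it in two.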

answered 2016-08-12T08:14:50.323