r - Named entity recognition using training data

Translated from: https://stackoverflow.com/questions/20073713 2013-11-19T14:07:53.657

Viewed 3003 times

My text file t1.txt contains this:

<START:name> Ashish Sanadhya <END> , 61 years old , will join the board as a nonexecutive director Nov. 29 .
Mr . <START:name> mayank sharma <END> is chairman of Elsevier N.V. , the Dutch publishing group .

and t2.txt contains:

person mayank sharma
persons ashish sanadhya
organizations linkedin

I have trained a model on this data, but when I try to get the desired result, for example:

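> library(NLP)
> library(openNLP)
> s <- as.String("I am ashish .")
> # Assumed definitions; the original post does not show how these were created
> sent_token_annotator <- Maxent_Sent_Token_Annotator()
> word_token_annotator <- Maxent_Word_Token_Annotator()
> a2 <- annotate(s, list(sent_token_annotator, word_token_annotator))
> entity_annotator <- Maxent_Entity_Annotator(language = "en", kind = "person", probs = FALSE, model = "C:\\apache-opennlp-1.5.3\\en-ner-person.bin")
> entity_annotator(s, a2)
[1] id    type  start end  
<0 rows> (or 0-length row.names)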

After training the person entity, I expected this result:

 entity_annotator(s, a2)
 id type   start end features   
 1  entity 6    11  kind=person
 s[entity_annotator(s, a2)]
 ashish
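
The start and end values in the expected output read as 1-based, inclusive character offsets into the input string. A minimal base R check of that reading, independent of openNLP (the variable names here are illustrative only, not part of the original post):

```r
# Plain character vector standing in for the annotated sentence.
s <- "I am ashish ."

# Offsets taken from the expected annotation row (start = 6, end = 11).
start <- 6
end <- 11

# substr() uses 1-based, inclusive positions, matching the expected span.
entity <- substr(s, start, end)
print(entity)  # [1] "ashish"
```

So an annotation with start 6 and end 11 would select exactly the token "ashish" in this sentence.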

Why am I not getting the expected result? Any help in this direction would be appreciated. Thanks.

Edit

I had downloaded the file en-ner-person.bin from here. Setting the cutoff parameter worked for me; I used this command:

c:\apache-opennlp-1.5.3>bin\opennlp TokenNameFinderTrainer -cutoff 1 -lang en -encoding UTF-8 -data "c:\t7.txt" -model en-ner-person.bin

Hope this helps. Special thanks to Daniel Naber.
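
A likely reason `-cutoff 1` matters: the cutoff is the minimum number of times a feature must occur in the training data to be kept, and as far as I know the OpenNLP default is 5. With only a couple of tagged names, almost every feature would fall below that threshold. The rough base R sketch below counts tagged name spans as a crude proxy for how much evidence the trainer sees. The helper `count_name_spans` and the variable `default_cutoff` are hypothetical names introduced for illustration, and the two lines are the t1.txt sample from the question.

```r
# Hypothetical helper: count <START:type> ... <END> spans on each training line.
count_name_spans <- function(lines) {
  matches <- gregexpr("<START:[^>]+>", lines)
  # gregexpr() returns -1 when a line has no match, so count positive positions only.
  vapply(matches, function(m) sum(m > 0), integer(1))
}

# The two training sentences shown for t1.txt.
training <- c(
  "<START:name> Ashish Sanadhya <END> , 61 years old , will join the board as a nonexecutive director Nov. 29 .",
  "Mr . <START:name> mayank sharma <END> is chairman of Elsevier N.V. , the Dutch publishing group ."
)

spans_per_line <- count_name_spans(training)
total_spans <- sum(spans_per_line)

# Assumption: 5 is the trainer's default cutoff when -cutoff is not given.
default_cutoff <- 5

cat("Tagged name spans:", total_spans, "\n")
if (total_spans < default_cutoff) {
  cat("Fewer spans than the assumed default cutoff; consider a lower -cutoff or more data.\n")
}
```

Span counts are not the same thing as feature counts, so this is only a sanity check on the size of the training file, not a reproduction of what the trainer does internally.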
