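I am trying to process a parsed sentence with the Stanford Parser, but I get an exception. The input file, the code, and the exception are shown below.
I think the problem is that the Penn tree in the input file does not handle punctuation. How can I generate a Penn tree that also handles punctuation?
Input file: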
(ROOT
(S
(NP (DT A) (NN doctor) (NN investigation) (NN system) (NN (DIS)))
(VP (VBZ is)
(NP
(NP (DT a) (NN part))
(PP (IN of)
(NP (DT a) (NN hospital) (NN information) (NN system) (NN (HIS).)))))))
Code:
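import java.io.FileOutputStream;
import java.io.PrintStream;

// Redirect standard output to k.txt before running the converter,
// so that whatever it prints is captured in the file.
try {
    FileOutputStream fw = new FileOutputStream("k.txt");
    PrintStream out = new PrintStream(fw);
    System.setOut(out);
} catch (Exception e) {
    System.out.print(e);
}
// Read the Penn tree from temp.txt and print the collapsed dependencies.
String str = "-collapsed -treeFile temp.txt";
String[] ar = str.split(" ");
edu.stanford.nlp.trees.EnglishGrammaticalStructure.main(ar);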
Exception thrown:
Head is null: NN-37
Exception in thread "main" java.lang.IllegalArgumentException: governor or dependent cannot be null
at edu.stanford.nlp.trees.UnnamedDependency.<init>(UnnamedDependency.java:105)
at edu.stanford.nlp.trees.TreeGraphNode.dependencies(TreeGraphNode.java:519)
at edu.stanford.nlp.trees.Tree.dependencies(Tree.java:1090)
at edu.stanford.nlp.trees.GrammaticalStructure.<init>(GrammaticalStructure.java:71)
at edu.stanford.nlp.trees.EnglishGrammaticalStructure.<init>(EnglishGrammaticalStructure.java:115)
at edu.stanford.nlp.trees.EnglishGrammaticalStructure.<init>(EnglishGrammaticalStructure.java:89)
at edu.stanford.nlp.trees.EnglishGrammaticalStructure.<init>(EnglishGrammaticalStructure.java:61)
at edu.stanford.nlp.trees.EnglishGrammaticalStructure.<init>(EnglishGrammaticalStructure.java:53)
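
One thing that may be worth checking, though it is not confirmed here: in Penn Treebank bracketing, parentheses delimit tree nodes, so a token written literally as (DIS) is probably read as an empty nested node labeled DIS rather than as a word. That would be consistent with the "Head is null: NN-37" message. The usual convention is to write literal parentheses as -LRB- and -RRB-. Below is a minimal sketch of that escaping, assuming the tree file is produced by your own code; the class and method names are hypothetical and are not part of the Stanford Parser API.

```java
// Hypothetical helper: escapes literal parentheses in a token using the
// Penn Treebank convention, so they are not mistaken for tree brackets.
class PtbBracketEscaper {

    // "(" becomes "-LRB-" and ")" becomes "-RRB-"; other characters are kept.
    static String escapeToken(String token) {
        return token.replace("(", "-LRB-").replace(")", "-RRB-");
    }

    // Builds one preterminal node such as "(NN system)" with the word escaped.
    static String preterminal(String tag, String word) {
        return "(" + tag + " " + escapeToken(word) + ")";
    }

    public static void main(String[] args) {
        System.out.println(preterminal("NN", "(DIS)"));   // prints (NN -LRB-DIS-RRB-)
        System.out.println(preterminal("NN", "(HIS)."));  // prints (NN -LRB-HIS-RRB-.)
    }
}
```

This only escapes the brackets inside a token. A parser's own tokenizer would normally also split the parentheses and the final period into separate tokens, so letting the parser tokenize the raw sentence and print the tree itself is likely the more robust route.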