If you want to use the Stanford parser, use this:
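import os
from nltk.parse import stanford

# Both variables point at the folder that holds the Stanford parser and models jars.
os.environ['STANFORD_PARSER'] = '/folder/with/stanford/jars'
os.environ['STANFORD_MODELS'] = '/folder/with/stanford/jars'

# model_path is the englishPCFG.ser.gz file extracted from the models jar.
parser = stanford.StanfordParser(model_path="/location/of/the/englishPCFG.ser.gz")
# raw_batch_parse is the method name in older NLTK releases; newer releases call it raw_parse_sents.
print(parser.raw_batch_parse(("Hello, My name is Melroy.", "What is your name?")))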
Output:
[Tree('ROOT', [Tree('S', [Tree('INTJ', [Tree('UH', ['Hello'])]), Tree(',', [',']), Tree('NP', [Tree('PRP$', ['My']), Tree('NN', ['name'])]), Tree('VP', [Tree('VBZ', ['is']), Tree('ADJP', [Tree('JJ', ['Melroy'])])]), Tree('.', ['.'])])]), Tree('ROOT', [Tree('SBARQ', [Tree('WHNP', [Tree('WP', ['What'])]), Tree('SQ', [Tree('VBZ', ['is']), Tree('NP', [Tree('PRP$', ['your']), Tree('NN', ['name'])])]), Tree('.', ['?'])])])]
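Note 1:
In this example, the parser jar and the models jar are both in the same folder.
Note 2:
- The file name of the Stanford parser is: stanford-parser.jar
- The file name of the Stanford models is: stanford-parser-xxx-models.jar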
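Before setting the environment variables, it can help to confirm that the folder really contains both jars under the names listed in Note 2. The following is only a minimal sketch: `find_stanford_jars` is a hypothetical helper (not part of NLTK), and the glob pattern assumes that the `xxx` in the models jar name is a version string.

```python
import glob
import os


def find_stanford_jars(folder):
    """Hypothetical helper: return (parser_jar, models_jars) found in folder.

    parser_jar is None when stanford-parser.jar is missing;
    models_jars is a sorted list that may be empty.
    """
    parser_jar = os.path.join(folder, "stanford-parser.jar")
    if not os.path.isfile(parser_jar):
        parser_jar = None
    # Assumption: the models jar is named stanford-parser-<version>-models.jar.
    pattern = os.path.join(folder, "stanford-parser-*-models.jar")
    models_jars = sorted(glob.glob(pattern))
    return parser_jar, models_jars
```

If `parser_jar` comes back as `None` or `models_jars` is empty, the folder passed to `STANFORD_PARSER` / `STANFORD_MODELS` is probably not the right one.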
Note 3:
The englishPCFG.ser.gz file can be found inside the models.jar file (/edu/stanford/nlp/models/lexparser/englishPCFG.ser.gz). Use an archive manager to unzip the models.jar file.
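If you would rather not use an archive manager, a jar is an ordinary zip archive, so Python's standard `zipfile` module should be able to pull the model out. This is a minimal sketch under that assumption: `extract_model` is a hypothetical helper name, and the member path is the one from Note 3 with the leading slash removed.

```python
import zipfile

# Path of the model inside the models jar (from Note 3, without the leading slash).
MODEL_MEMBER = "edu/stanford/nlp/models/lexparser/englishPCFG.ser.gz"


def extract_model(jar_path, dest_dir, member=MODEL_MEMBER):
    """Hypothetical helper: extract one member of the jar into dest_dir.

    Returns the path of the extracted file. Raises KeyError if the
    member is not present in the archive.
    """
    with zipfile.ZipFile(jar_path) as jar:
        return jar.extract(member, dest_dir)
```

The returned path is what you would then pass as `model_path` to `stanford.StanfordParser`.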