I am trying to use nltk-trainer ( https://github.com/japerk/nltk-trainer ). I managed to train a Dutch tagger and chunker with the following commands (run directly in the Anaconda console):

python train_tagger.py conll2002 --fileids ned.train --classifier IIS --filename ~/nltk_data/taggers/conll2002_ned_IIS.pickle
python train_chunker.py conll2002 --fileids ned.train --classifier NaiveBayes --filename ~/nltk_data/chunkers/conll2002_ned_NaiveBayes.pickle

I then run a small script to test the tagger and the chunker:

import nltk
from nltk.corpus import conll2002

# Load the trained pickles
tokenizer = nltk.data.load('tokenizers/punkt/dutch.pickle')
tagger = nltk.data.load('taggers/conll2002_ned_IIS.pickle')
chunker = nltk.data.load('chunkers/conll2002_ned_NaiveBayes.pickle')

# Testing
test_sents = conll2002.tagged_sents(fileids="ned.testb")[0:1000]
print "tagger accuracy on test-set: " + str(tagger.evaluate(test_sents))

test_sents = conll2002.chunked_sents(fileids="ned.testb")[0:1000]
print "chunker score on test-set: " + str(chunker.evaluate(test_sents))

This works fine from inside the nltk-trainer-master folder, but when I move the script anywhere else I get an import error:

ImportError: No module named nltk_trainer.chunking.chunkers

How can I make this work outside the nltk-trainer-master folder without copying the nltk_trainer folder?

(Python 2.7, nltk 3.2.1)
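
For context, a pickle stores only the module path of each object's class, so unpickling the chunker has to import nltk_trainer.chunking.chunkers, which succeeds only when the folder containing the nltk_trainer package is on Python's import path. One possible workaround, shown as a minimal sketch rather than an official nltk-trainer recipe, is to add that folder to sys.path before loading the pickles. The helper name add_nltk_trainer_to_path and the example path in the comment are hypothetical.

```python
import os
import sys


def add_nltk_trainer_to_path(trainer_root):
    """Make the nltk_trainer package importable from any working directory.

    trainer_root is the folder that *contains* the nltk_trainer package,
    for example the unpacked nltk-trainer-master folder (hypothetical
    location; adjust to your own setup).
    """
    trainer_root = os.path.abspath(os.path.expanduser(trainer_root))
    package_dir = os.path.join(trainer_root, "nltk_trainer")
    # Fail early with a clear message instead of a later ImportError.
    if not os.path.isdir(package_dir):
        raise ValueError("no nltk_trainer package found in " + trainer_root)
    # Avoid adding the same entry twice when called repeatedly.
    if trainer_root not in sys.path:
        sys.path.insert(0, trainer_root)
    return trainer_root


# Example call, to be placed before nltk.data.load(...) in the test script:
# add_nltk_trainer_to_path("~/nltk-trainer-master")
```

Setting the PYTHONPATH environment variable to the same folder should have an equivalent effect without changing the script.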
