I want to train my own custom model. Where can I start?
I am using this sample data to train the model:
<START:meaningless>Took connection and<END> selected the Text in the Letter Template and cleared the Formatting of Text to Normal.
Basically, I want to identify certain meaningless text in a given input.
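For reference, the examples in the OpenNLP manual show name finder training data as one whitespace-tokenized sentence per line, with `<START:type>` and `<END>` written as separate tokens. The sample line above has both tags attached to neighbouring words. Whether that is related to the error is not established here; the following is only a minimal sketch of a check for it, using plain Java and no OpenNLP calls. The class name `TrainingLineCheck` and the method `findGluedTags` are hypothetical names introduced for illustration.

```java
import java.util.ArrayList;
import java.util.List;

public class TrainingLineCheck {

    /**
     * Returns the whitespace-separated tokens of a training line that contain
     * a START or END tag fused with other characters (hypothetical helper).
     */
    public static List<String> findGluedTags(String line) {
        List<String> glued = new ArrayList<>();
        for (String token : line.trim().split("\\s+")) {
            boolean hasTag = token.contains("<START:") || token.contains("<END>");
            // A well-formed tag is a token consisting of the tag and nothing else.
            boolean isBareTag = token.matches("<START:[^\\s<>]+>") || token.equals("<END>");
            if (hasTag && !isBareTag) {
                glued.add(token);
            }
        }
        return glued;
    }

    public static void main(String[] args) {
        String asPosted = "<START:meaningless>Took connection and<END> selected the Text "
                + "in the Letter Template and cleared the Formatting of Text to Normal.";
        String spaced = "<START:meaningless> Took connection and <END> selected the Text "
                + "in the Letter Template and cleared the Formatting of Text to Normal.";

        System.out.println(findGluedTags(asPosted)); // prints [<START:meaningless>Took, and<END>]
        System.out.println(findGluedTags(spaced));   // prints []
    }
}
```

Running such a check over every line of mynewmodel.train before training would show whether any tags need to be separated from the surrounding words.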
I tried the sample code given in the OpenNLP developer documentation, but I get the error: Model is incompatible with name finder!
// Read the training data line by line as UTF-8.
Charset charset = Charset.forName("UTF-8");
ObjectStream<String> lineStream =
    new PlainTextByLineStream(new FileInputStream("mynewmodel.train"), charset);
ObjectStream<NameSample> sampleStream = new NameSampleDataStream(lineStream);

TokenNameFinderModel model;
try {
    // Train a name finder for the "meaningless" type: 100 iterations, cutoff 5.
    model = NameFinderME.train("en", "meaningless", sampleStream,
        Collections.<String, Object>emptyMap(), 100, 5);
} finally {
    sampleStream.close();
}

// Write the trained model to modelFile (a File declared elsewhere).
OutputStream modelOut = null;
try {
    modelOut = new BufferedOutputStream(new FileOutputStream(modelFile));
    model.serialize(modelOut);
} finally {
    if (modelOut != null) {
        modelOut.close();
    }
}