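I have a CSV file in the following format:
product name, product review
Using MALLET, I have to train a classifier so that, given a test dataset of product reviews as input, it tells me which product each review belongs to.
Help with the MALLET Java API would be much appreciated.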
Here is a small example that fits your case:
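import cc.mallet.classify.*;
import cc.mallet.pipe.*;
import cc.mallet.pipe.iterator.CsvIterator;
import cc.mallet.types.InstanceList;
import java.io.FileReader;
import java.io.IOException;
import java.util.ArrayList;

public class ProductReviewClassifier {
    public static void main(String[] args) throws IOException {
        // Prepare the instance transformation pipeline
        ArrayList<Pipe> pipes = new ArrayList<Pipe>();
        pipes.add(new Target2Label());                   // product name -> label
        pipes.add(new CharSequence2TokenSequence());     // review text -> tokens
        pipes.add(new TokenSequence2FeatureSequence());  // tokens -> feature indices
        pipes.add(new FeatureSequence2FeatureVector());  // feature indices -> count vector
        SerialPipes pipe = new SerialPipes(pipes);
        // Each line is "product name,review": group 1 is the target, group 2 the data.
        // [^,]* stops at the first comma, so commas inside the review stay in the review.
        String lineRegex = "^([^,]*),(.*)$";
        // Prepare training instances
        InstanceList trainingInstanceList = new InstanceList(pipe);
        trainingInstanceList.addThruPipe(new CsvIterator(new FileReader("datasets/training.txt"), lineRegex, 2, 1, -1));
        // Prepare test instances (same pipe, so both lists share the same alphabets)
        InstanceList testingInstanceList = new InstanceList(pipe);
        testingInstanceList.addThruPipe(new CsvIterator(new FileReader("datasets/testing.txt"), lineRegex, 2, 1, -1));
        // Train a Naive Bayes classifier and evaluate it on the test set
        ClassifierTrainer<NaiveBayes> trainer = new NaiveBayesTrainer();
        Classifier classifier = trainer.train(trainingInstanceList);
        System.out.println("Accuracy: " + classifier.getAccuracy(testingInstanceList));
    }
}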
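Since reviews often contain commas, it may be worth checking how a line regex splits a sample row before handing the file to CsvIterator. The sketch below uses only java.util.regex, not MALLET; the class name LineRegexCheck, the helper split, and the sample line are made up for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LineRegexCheck {
    // Hypothetical helper: returns {label, text} for one line,
    // or null if the regex does not match anywhere in the line.
    static String[] split(String regex, String line) {
        Matcher m = Pattern.compile(regex).matcher(line);
        if (!m.find()) {
            return null;
        }
        return new String[] { m.group(1), m.group(2) };
    }

    public static void main(String[] args) {
        // Invented sample row: product name, then a review that itself contains commas.
        String line = "Kindle,Light, sharp screen, long battery life";

        // Greedy first group: runs up to the LAST comma in the line.
        String[] greedy = split("(.*),(.*)", line);
        // First group excludes commas: stops at the FIRST comma.
        String[] firstComma = split("([^,]*),(.*)", line);

        System.out.println("greedy label:      " + greedy[0]);
        System.out.println("first-comma label: " + firstComma[0]);
    }
}
```

With the greedy pattern the label comes out as "Kindle,Light, sharp screen", while the comma-excluding pattern yields "Kindle" and keeps the whole review as the text. If product names can themselves contain commas, a different delimiter or a proper CSV parser would probably be a safer choice.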