3

我有一个具有以下格式产品名称的 csv 文件,
产品评论

现在使用槌我必须训练分类器,以便如果将测试数据集作为包含产品评论的输入,它应该告诉我特定评论属于哪个产品

mallet java api帮助将不胜感激

4

1 回答 1

8

这是一个适合您的情况的小示例:

    public static void main(String[] args) throws IOException {
        //prepare instance transformation pipeline
        ArrayList<Pipe> pipes = new ArrayList<Pipe>();
        pipes.add(new Target2Label());
        pipes.add(new CharSequence2TokenSequence());
        pipes.add(new TokenSequence2FeatureSequence());
        pipes.add(new FeatureSequence2FeatureVector());
        SerialPipes pipe = new SerialPipes(pipes);

        //prepare training instances
        InstanceList trainingInstanceList = new InstanceList(pipe);
        trainingInstanceList.addThruPipe(new CsvIterator(new FileReader("datasets/training.txt"), "(.*),(.*)", 2, 1, -1));

        //prepare test instances
        InstanceList testingInstanceList = new InstanceList(pipe);        
        testingInstanceList.addThruPipe(new CsvIterator(new FileReader("datasets/testing.txt"), "(.*),(.*)", 2, 1, -1));

        ClassifierTrainer trainer = new NaiveBayesTrainer();
        Classifier classifier = trainer.train(trainingInstanceList);
        System.out.println("Accuracy: " + classifier.getAccuracy(testingInstanceList));
   }
于 2012-09-26T14:41:23.500 回答