I am trying to use the Weka library in my Java program for text classification, but I have run into a problem.

This is my training data, with five instances and two classes:

@relation hamspam

@attribute text string
@attribute class {ham,spam}

@data
'good',ham
'very good',ham
'bad',spam
'very bad',spam
'very bad, very bad',spam

This is my test data, with three instances:

@relation hamspam

@attribute text string
@attribute class {ham,spam}

@data
'good bad very bad',?
'good good good',?
'good very good',?

This is my code:

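import java.io.BufferedReader;
import java.io.FileReader;

import weka.classifiers.Classifier;
import weka.classifiers.trees.J48;
import weka.core.Instances;
import weka.filters.Filter;
import weka.filters.unsupervised.attribute.StringToWordVector;

public static void loader() throws Exception {
    // Filter that converts the string attribute into word attributes
    StringToWordVector filter = new StringToWordVector();

    Classifier j48tree = new J48();

    // Training data: the class is the last attribute in the raw file
    Instances train = new Instances(new BufferedReader(new FileReader("D:/trainingdata.arff")));
    int lastIndex = train.numAttributes() - 1;
    train.setClassIndex(lastIndex);
    filter.setInputFormat(train);
    train = Filter.useFilter(train, filter);

    // Test data: same raw layout, filtered with the same filter object
    Instances test = new Instances(new BufferedReader(new FileReader("D:/testingdata.arff")));
    test.setClassIndex(lastIndex);
    filter.setInputFormat(test);
    test = Filter.useFilter(test, filter);

    j48tree.buildClassifier(train);

    // Predict each test instance and print the predicted class name
    for (int i = 0; i < test.numInstances(); i++) {
        double index = j48tree.classifyInstance(test.instance(i));
        String className = train.attribute(lastIndex).value((int) index);
        System.out.println(className);
    }
}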

I try to predict the class name and print it, but the class name does not appear in the output. What is wrong with my code?
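
To make the filter step concrete, here is a small library-free sketch of what I understand a string-to-word-vector conversion to do. It is a toy model, not Weka code: the class `WordVectorSketch` and its methods `buildHeader` and `encode` are hypothetical names made up for illustration, and placing the class attribute first in the filtered header is an assumption about the output layout that I have not verified against the Weka source.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.TreeSet;

// Library-free toy model of a string-to-word-vector step. This is NOT Weka code.
class WordVectorSketch {

    // Builds the filtered header: the class attribute first (an assumption about
    // the output layout), then one attribute per distinct word in the training texts.
    static List<String> buildHeader(List<String> trainingTexts) {
        TreeSet<String> words = new TreeSet<>();
        for (String text : trainingTexts) {
            for (String token : text.toLowerCase().split("[^a-z]+")) {
                if (!token.isEmpty()) {
                    words.add(token);
                }
            }
        }
        List<String> header = new ArrayList<>();
        header.add("class");
        header.addAll(words);
        return header;
    }

    // Encodes a text against an existing header: 1 if the word occurs, else 0.
    // Slot 0 is reserved for the class and is left at 0 here.
    static int[] encode(String text, List<String> header) {
        int[] row = new int[header.size()];
        List<String> wordAttributes = header.subList(1, header.size());
        for (String token : text.toLowerCase().split("[^a-z]+")) {
            int position = wordAttributes.indexOf(token);
            if (position >= 0) {
                row[position + 1] = 1;
            }
        }
        return row;
    }

    public static void main(String[] args) {
        List<String> train = Arrays.asList(
                "good", "very good", "bad", "very bad", "very bad, very bad");
        List<String> header = buildHeader(train);

        int staleIndex = 1; // where "class" sat in the raw two-attribute file

        System.out.println(header);                   // [class, bad, good, very]
        System.out.println(header.get(staleIndex));   // bad
        System.out.println(header.indexOf("class"));  // 0
        // The test text is encoded with the header built from the training data.
        System.out.println(Arrays.toString(encode("good bad very bad", header))); // [0, 1, 1, 1]
    }
}
```

If the real filter rearranges attributes in a similar way, then `train.attribute(lastIndex)` after filtering would no longer be the class attribute, and looking the label up through `train.classAttribute()` might behave differently. The sketch also builds the word list once from the training texts and reuses it for the test text, whereas my code calls `filter.setInputFormat` a second time on the test set, which may be a separate problem.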
