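I am trying to use the Weka library for text classification in my Java program, but I have run into a problem.
Here is my training data, with 5 instances and two classes: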
@relation hamspam
@attribute text string
@attribute class {ham,spam}
@data
'good',ham
'very good',ham
'bad',spam
'very bad',spam
'very bad, very bad',spam
Here is my test data, with three instances:
@relation hamspam
@attribute text string
@attribute class {ham,spam}
@data
'good bad very bad',?
'good good good',?
'good very good',?
Here is my code:
public static void loader() throws FileNotFoundException, IOException, Exception {
    // filter
    StringToWordVector filter = new StringToWordVector();
    Classifier j48tree = new J48();

    // training data
    Instances train = new Instances(new BufferedReader(new FileReader("D:/trainingdata.arff")));
    int lastIndex = train.numAttributes() - 1;
    train.setClassIndex(lastIndex);
    filter.setInputFormat(train);
    train = Filter.useFilter(train, filter);

    // testing data
    Instances test = new Instances(new BufferedReader(new FileReader("D:/testingdata.arff")));
    test.setClassIndex(lastIndex);
    filter.setInputFormat(test);
    test = Filter.useFilter(test, filter);

    j48tree.buildClassifier(train);
    for (int i = 0; i < test.numInstances(); i++) {
        double index = j48tree.classifyInstance(test.instance(i));
        String className = train.attribute(lastIndex).value((int) index);
        System.out.println(className);
    }
}
I am trying to predict the class name and print it, but the class name does not appear. What is wrong with my code?
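
A possible explanation, offered as a sketch rather than a confirmed diagnosis: as far as I understand, StringToWordVector places the class attribute at the front of the filtered data, so after filtering `train.attribute(lastIndex)` may point to a numeric word attribute whose `value(...)` is an empty string. Calling `filter.setInputFormat(test)` a second time also appears to re-initialize the filter, so the test set would get its own dictionary instead of the training one. The version below initializes the filter once on the training data, reuses it for the test data, and looks up the label through `classAttribute()`. The class name `HamSpamSketch`, the method `predict`, and the constants `TRAIN` and `TEST` are hypothetical names introduced for illustration, and the ARFF content is read from in-memory strings instead of the `D:/` files so the example is self-contained.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

import weka.classifiers.Classifier;
import weka.classifiers.trees.J48;
import weka.core.Instances;
import weka.filters.Filter;
import weka.filters.unsupervised.attribute.StringToWordVector;

public class HamSpamSketch {
    // Hypothetical in-memory copies of the ARFF files from the question.
    static final String HEADER =
        "@relation hamspam\n@attribute text string\n@attribute class {ham,spam}\n@data\n";
    static final String TRAIN = HEADER
        + "'good',ham\n'very good',ham\n'bad',spam\n'very bad',spam\n'very bad, very bad',spam\n";
    static final String TEST = HEADER
        + "'good bad very bad',?\n'good good good',?\n'good very good',?\n";

    static List<String> predict(String trainArff, String testArff) throws Exception {
        Instances train = new Instances(new StringReader(trainArff));
        train.setClassIndex(train.numAttributes() - 1);
        Instances test = new Instances(new StringReader(testArff));
        test.setClassIndex(test.numAttributes() - 1);

        // Initialize the filter once, on the training data only.
        StringToWordVector filter = new StringToWordVector();
        filter.setInputFormat(train);
        Instances trainVec = Filter.useFilter(train, filter);
        // Reuse the same filter so the test data gets the training dictionary.
        Instances testVec = Filter.useFilter(test, filter);

        Classifier tree = new J48();
        tree.buildClassifier(trainVec);

        List<String> labels = new ArrayList<>();
        for (int i = 0; i < testVec.numInstances(); i++) {
            double index = tree.classifyInstance(testVec.instance(i));
            // Ask the dataset for its class attribute instead of assuming its position.
            labels.add(trainVec.classAttribute().value((int) index));
        }
        return labels;
    }

    public static void main(String[] args) throws Exception {
        for (String label : predict(TRAIN, TEST)) {
            System.out.println(label);
        }
    }
}
```

With only five training instances the predictions themselves are not meaningful; the point of the sketch is that each printed line should now be one of the class names rather than blank.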