I am building a decision tree with Weka from Java code, as follows:
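import java.io.BufferedReader;
import java.io.FileReader;
import weka.classifiers.trees.J48;
import weka.core.Instances;

J48 j48DecisionTree = new J48();
// Load the ARFF file and treat the last attribute as the class
Instances data = new Instances(new BufferedReader(new FileReader(dt.getArffFile())));
data.setClassIndex(data.numAttributes() - 1);
j48DecisionTree.buildClassifier(data);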
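How can I save the contents of the Weka result buffer to a text file from within the program, so that the following is written to a text file at run time?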
=== Stratified cross-validation ===
=== Summary ===
Correctly Classified Instances      229      40.1754 %
Incorrectly Classified Instances    341      59.8246 %
Kappa statistic                     0.2022
Mean absolute error                 0.1916
Root mean squared error             0.3138
Relative absolute error             80.8346 %
Root relative squared error         91.1615 %
Coverage of cases (0.95 level)      96.3158 %
Mean rel. region size (0.95 level)  70.9774 %
Total Number of Instances           570
=== Detailed Accuracy By Class ===
              TP Rate   FP Rate   Precision Recall    F-Measure ROC Area  Class
              0.44      0.012     0.786     0.44      0.564     0.76      Business and finance and economics
              0         0         0         0         0         0.616     Fashion and celebrity lifestyle
              0.125     0.01      0.667     0.125     0.211     0.663     Film
              0         0.002     0         0         0         0.617     Music
              0.931     0.78      0.318     0.931     0.474     0.633     News and current affairs
              0.11      0.006     0.786     0.11      0.193     0.653     Science and nature and technology
              0.74      0.012     0.86      0.74      0.796     0.85      Sport
Weighted Avg. 0.402     0.224     0.465     0.402     0.316     0.667
=== Confusion Matrix ===
   a   b   c   d   e   f   g   <-- classified as
  22   0   0   0  25   2   1 | a = Business and finance and economics
   0   0   1   0  59   0   0 | b = Fashion and celebrity lifestyle
   0   0  10   1  69   0   0 | c = Film
   0   0   1   0  69   0   0 | d = Music
   5   0   2   0 149   0   4 | e = News and current affairs
   1   0   0   0  87  11   1 | f = Science and nature and technology
   0   0   1   0  11   1  37 | g = Sport
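dt is an instance of one of my own classes that holds the details of the decision tree.
Being able to do this would help when I am running a large number of classifiers.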
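
For illustration, here is a minimal sketch of one way this might be approached. It assumes the report text is first obtained as strings, for example from Weka's Evaluation class as shown in the comment. The helper class EvaluationReportWriter, its save method, and the file name are hypothetical names introduced for this example. The helper itself uses only the standard library, so the Weka calls in the comment are not compiled here.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

// Hypothetical helper (not part of Weka): writes report sections to a text file.
final class EvaluationReportWriter {

    private EvaluationReportWriter() {
    }

    // Writes each section in order, each followed by the platform line separator.
    // An existing file at the given path is overwritten.
    static void save(Path file, String... sections) throws IOException {
        Files.write(file, Arrays.asList(sections), StandardCharsets.UTF_8);
    }
}

// Assumed usage with Weka (requires weka.jar on the classpath, plus
// weka.classifiers.Evaluation, java.util.Random and java.nio.file.Paths):
//
//   Evaluation eval = new Evaluation(data);
//   eval.crossValidateModel(j48DecisionTree, data, 10, new Random(1));
//   EvaluationReportWriter.save(Paths.get("j48-results.txt"),
//           eval.toSummaryString(),
//           eval.toClassDetailsString(),
//           eval.toMatrixString());
```

When running many classifiers, a different file name could be passed for each run so that the reports do not overwrite one another.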