
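I have implemented logistic regression for a classification problem, and I get the same value for precision, recall, and F1 score. Is it possible for all three to be identical? I ran into the same thing with decision tree and random forest: precision, recall, and F1 score all came out the same there too.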

// Run training algorithm to build the model.
final LogisticRegressionModel model = new LogisticRegressionWithLBFGS()
        .setNumClasses(13)
        .run(data.rdd());

// Compute raw scores on the test set.
JavaRDD<Tuple2<Object, Object>> predictionAndLabels = testData.map(
        new Function<LabeledPoint, Tuple2<Object, Object>>() {
            public Tuple2<Object, Object> call(LabeledPoint p) {
                Double prediction = model.predict(p.features());
                return new Tuple2<Object, Object>(prediction, p.label());
            }
        }
);

// Get evaluation metrics.
MulticlassMetrics metrics = new MulticlassMetrics(predictionAndLabels.rdd());
double precision = metrics.precision();
System.out.println("Precision = " + precision);

double recall = metrics.recall();
System.out.println("Recall = " + recall);

double FScore = metrics.fMeasure();
System.out.println("F Measure = " + FScore);

2 Answers


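I faced the same problem. I tried decision tree, random forest, and GBT. Every time, I got the same precision, recall, and F1 score. The accuracy was also the same (calculated from the confusion matrix).

So I used my own formulas and wrote my own code to get the accuracy, precision, recall, and F1 score metrics.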

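from pyspark.ml.classification import RandomForestClassifier
from pyspark.mllib.evaluation import MulticlassMetrics

# Fit the model on the training split and score the test split.
rf = RandomForestClassifier(labelCol='label', featuresCol='features')
fit = rf.fit(trainingData)
transformed = fit.transform(testData)

# MulticlassMetrics expects an RDD of (prediction, label) pairs of floats.
results = transformed.select(['prediction', 'label'])
predictionAndLabels = results.rdd.map(lambda row: (float(row['prediction']), float(row['label'])))
metrics = MulticlassMetrics(predictionAndLabels)

# Binary case: rows of the confusion matrix are actual labels, columns are
# predicted labels, so precision and recall below are for the class at index 0.
cm = metrics.confusionMatrix().toArray()
accuracy = (cm[0][0] + cm[1][1]) / cm.sum()
precision = cm[0][0] / (cm[0][0] + cm[1][0])
recall = cm[0][0] / (cm[0][0] + cm[0][1])
f1 = 2 * precision * recall / (precision + recall)
print("RandomForestClassifier: accuracy,precision,recall,f1", accuracy, precision, recall, f1)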
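The same formulas extend to more than two classes. Below is a minimal sketch in plain Python (no Spark needed), assuming the confusion matrix uses the layout of `confusionMatrix()`, with rows as actual labels and columns as predicted labels. The helper name `per_class_metrics` and the sample matrix are made up for illustration.

```python
def per_class_metrics(cm):
    """Return one (precision, recall, f1) tuple per class.

    cm[i][j] counts samples whose actual class is i and predicted class is j.
    """
    n = len(cm)
    metrics = []
    for k in range(n):
        tp = cm[k][k]
        predicted_k = sum(cm[i][k] for i in range(n))  # column sum
        actual_k = sum(cm[k])                          # row sum
        precision = tp / predicted_k if predicted_k else 0.0
        recall = tp / actual_k if actual_k else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        metrics.append((precision, recall, f1))
    return metrics


# Hypothetical 2-class confusion matrix (not taken from this thread).
cm = [[8, 2],
      [1, 9]]
for label, (p, r, f1) in enumerate(per_class_metrics(cm)):
    print(label, round(p, 3), round(r, 3), round(f1, 3))
```

Computed per class like this, precision and recall generally differ, because the column sum (everything predicted as the class) and the row sum (everything that actually is the class) are not the same count.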
answered 2017-11-21T13:30:33.277

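For binary classification, you can pass the label 1 as the argument to the precision and recall methods. It worked for me. For multiclass classification, pass the label index of the class whose precision and recall you want to compute.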

// Pass the label as a double; 1.0 is the positive class in the binary case.
double precision = metrics.precision(1.0);
System.out.println("Precision = " + precision);
double recall = metrics.recall(1.0);
System.out.println("Recall = " + recall);
double FScore = metrics.fMeasure(1.0);
System.out.println("F Measure = " + FScore);
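A likely explanation for the identical numbers in the question: without a label argument, the `precision()`, `recall()`, and `fMeasure()` methods of `MulticlassMetrics` report overall figures across all classes (later Spark releases deprecated them in favor of `accuracy`). When every sample has exactly one label, each misclassified sample counts once as a false positive for the predicted class and once as a false negative for the actual class, so the micro-averaged precision, recall, and F1 all collapse to accuracy. A small plain-Python sketch of that effect, using made-up (prediction, label) pairs and a hypothetical helper `micro_metrics`:

```python
def micro_metrics(pairs):
    """Micro-averaged precision, recall, and F1 for (prediction, label) pairs."""
    classes = {c for pair in pairs for c in pair}
    tp = fp = fn = 0
    for c in classes:
        tp += sum(1 for pred, lab in pairs if pred == c and lab == c)
        fp += sum(1 for pred, lab in pairs if pred == c and lab != c)
        fn += sum(1 for pred, lab in pairs if pred != c and lab == c)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical 3-class predictions (not taken from this thread).
pairs = [(0, 0), (1, 1), (2, 1), (2, 2), (0, 2), (1, 1)]
accuracy = sum(1 for pred, lab in pairs if pred == lab) / len(pairs)
print(micro_metrics(pairs), accuracy)
```

All four printed numbers agree up to floating-point rounding, which is why the per-label calls are the ones that give distinct precision and recall values.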
answered 2019-02-26T07:58:17.940