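I'm having trouble computing precision and recall for a classifier in MATLAB. I'm using the Fisher iris data (150 data points: 50 setosa, 50 versicolor, 50 virginica). I classified it with the kNN algorithm. Here is my confusion matrix: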

50     0     0
 0    48     2
 0     4    46

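The correct classification rate is 96% (144/150), but how do I compute precision and recall in MATLAB? Is there a function for this? I know the formulas precision = tp/(tp+fp) and recall = tp/(tp+fn), but I get lost when identifying the components. For example, can I say that the number of true positives in the matrix is 144? What about the false positives and false negatives? Any help would be much appreciated. Thanks!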

4 Answers

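To add to pederpansen's answer, here are some anonymous MATLAB functions for computing the per-class precision, recall and F1 score, plus the mean F1 score over all classes. Like that answer, they assume rows hold the predicted classes and columns hold the true classes: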

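% Per-class precision: diagonal divided by row sums (rows = predicted)
precision = @(confusionMat) diag(confusionMat)./sum(confusionMat,2);

% Per-class recall: diagonal divided by column sums (columns = true)
recall = @(confusionMat) diag(confusionMat)./sum(confusionMat,1)';

% Per-class F1: harmonic mean of precision and recall
f1Scores = @(confusionMat) 2*(precision(confusionMat).*recall(confusionMat))./(precision(confusionMat)+recall(confusionMat));

meanF1 = @(confusionMat) mean(f1Scores(confusionMat));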
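
As a quick check, here is a minimal sketch that applies these handles to the confusion matrix from the question. The names `M`, `f1` and `avgF1` are only illustrative, and the matrix is assumed to have predicted classes in rows and true classes in columns.

```matlab
% Confusion matrix from the question (assumed: rows = predicted, columns = true)
M = [50 0 0; 0 48 2; 0 4 46];

% Same handles as above, repeated so the snippet runs on its own
precision = @(C) diag(C) ./ sum(C, 2);
recall    = @(C) diag(C) ./ sum(C, 1)';
f1Scores  = @(C) 2 * (precision(C) .* recall(C)) ./ (precision(C) + recall(C));
meanF1    = @(C) mean(f1Scores(C));

f1    = f1Scores(M);   % 3x1 column vector, one F1 score per class
avgF1 = meanF1(M);     % unweighted mean of the per-class F1 scores
```

For this matrix the per-class F1 scores work out to 1, 96/102 and 92/98, so the mean is roughly 0.96.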
Answered 2016-03-07T11:48:40.450

As Dan pointed out in his comment, precision and recall are usually defined for binary classification problems only.

But you can calculate precision and recall separately for each class. Let's annotate your confusion matrix a little bit:

          |                  true           |
          |      |  seto  |  vers  |  virg  |
          -----------------------------------
          | seto |   50        0        0
predicted | vers |    0       48        2
          | virg |    0        4       46

Here I assumed the usual convention holds, i.e. columns are used for true values and rows for values predicted by your learning algorithm. (If your matrix was built the other way round, simply take the transpose of the confusion matrix.)

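The true positives tp(i) for each class i (= row/column index) are given by the diagonal element in that row/column. The true negatives tn(i) are given by the sum of all entries that lie in neither row i nor column i (not just the remaining diagonal elements, since a sample of another class that is mistaken for a third class still counts as "not i" on both sides). Note that we simply define the negatives for each class i as "not class i".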

If we define false positives (fp) and false negatives (fn) analogously as the sum of off-diagonal entries in a given row or column, respectively, we can calculate precision and recall for each class:

precision(seto) = tp(seto) / (tp(seto) + fp(seto)) = 50 / (50 + (0 + 0)) = 1.0
precision(vers) = 48 / (48 + (0 + 2)) = 0.96
precision(virg) = 46 / (46 + (0 + 4)) = 0.92

recall(seto) = tp(seto) / (tp(seto) + fn(seto)) = 50 / (50 + (0 + 0)) = 1.0
recall(vers) = 48 / (48 + (0 + 4)) = 0.9231
recall(virg) = 46 / (46 + (0 + 2)) = 0.9583

Here I used the class names instead of the row indices for illustration.
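
The counts used above can also be read off the matrix programmatically. This is a minimal sketch, assuming the same orientation (rows = predicted, columns = true); the variable names are illustrative only.

```matlab
% Confusion matrix from the question (assumed: rows = predicted, columns = true)
M = [50 0 0; 0 48 2; 0 4 46];

tp = diag(M);                    % correctly classified samples of each class
fp = sum(M, 2) - tp;             % off-diagonal entries in each row
fn = sum(M, 1)' - tp;            % off-diagonal entries in each column
tn = sum(M(:)) - tp - fp - fn;   % everything outside row i and column i

precisionPerClass = tp ./ (tp + fp);
recallPerClass    = tp ./ (tp + fn);
```

Note that 144 is the sum of the true positives over all three classes, not the true-positive count of any single class.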

Please have a look at the answers to this question for further information on performance measures in the case of multi-class classification problems - particularly if you want to end up with a single number instead of one number per class. Of course, the easiest way to do this is to average the per-class values.
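
To illustrate what "a single number" can mean, here is a small sketch contrasting two common ways of averaging. The terms macro and micro averaging are not used in the original answer and are introduced here only for illustration; the matrix orientation (rows = predicted, columns = true) is assumed as before.

```matlab
% Confusion matrix from the question (assumed: rows = predicted, columns = true)
M  = [50 0 0; 0 48 2; 0 4 46];
tp = diag(M);

% Macro average: compute the metric per class, then take the plain mean
macroPrecision = mean(tp ./ sum(M, 2));
macroRecall    = mean(tp ./ sum(M, 1)');

% Micro average: pool the counts over all classes first.
% For single-label multi-class data, pooled fp equals pooled fn, so micro
% precision and micro recall both equal the overall accuracy.
microPrecision = sum(tp) / sum(M(:));
```

Here the micro average is 144/150, which is the 96% classification rate from the question.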

Update

I realized that you were actually looking for a Matlab function to do this. I don't think there is any built-in function, and on the Matlab File Exchange I only found a function for binary classification problems. However, the task is so easy you can easily define your own functions like so:

function y = precision(M)
  y = diag(M) ./ sum(M,2);
end

function y = recall(M)
  y = diag(M) ./ sum(M,1)';
end

These return column vectors containing the per-class precision and recall values, respectively. Now you can simply call

>> mean(precision(M))

ans =

    0.9600

>> mean(recall(M))

ans =

    0.9605

to obtain the average precision and recall values of your model.

Answered 2015-03-12T14:42:43.297

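Use the following MATLAB code:

   % actual and predicted hold the true and predicted class labels
   actual = ...
   predicted = ...
   % confusionmat puts the true classes in rows; transpose so rows = predicted
   cm = confusionmat(actual, predicted);
   cm = cm';
   precision = diag(cm) ./ sum(cm, 2);
   overall_precision = mean(precision)
   recall = diag(cm) ./ sum(cm, 1)';
   overall_recall = mean(recall)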
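
For an end-to-end picture, the following sketch fills in the label vectors with the iris data from the question. It assumes the Statistics and Machine Learning Toolbox is available; the choice of k = 5 and 10-fold cross-validation is purely illustrative (the question does not say how the classifier was set up), so the resulting matrix will not necessarily match the one in the question.

```matlab
% Assumes the Statistics and Machine Learning Toolbox is installed.
load fisheriris                    % provides meas (150x4) and species (150x1 cell)

% k = 5 and 10-fold cross-validation are illustrative choices,
% not the asker's actual settings.
rng(1);                            % make the cross-validation partition repeatable
mdl       = fitcknn(meas, species, 'NumNeighbors', 5);
cvmdl     = crossval(mdl, 'KFold', 10);
predicted = kfoldPredict(cvmdl);   % out-of-fold predicted labels

% confusionmat returns rows = true, columns = predicted; transpose to get
% rows = predicted, columns = true as in the formulas above.
cm = confusionmat(species, predicted)';

precisionPerClass = diag(cm) ./ sum(cm, 2);
recallPerClass    = diag(cm) ./ sum(cm, 1)';
```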
Answered 2019-04-16T10:33:57.193

Another approach:

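% Rows = predicted classes, columns = true classes
confMat = [50,0,0; 0,48,2; 0,4,46];
nClasses = size(confMat,1);

% Per-class precision: diagonal element over its row sum
precision = zeros(1,nClasses);
for i = 1:nClasses
    precision(i) = confMat(i,i)/sum(confMat(i,:));
end
% Drop undefined (NaN) values, which occur when a class is never predicted.
% Dividing by the full class count below then counts such classes as 0.
precision(isnan(precision)) = [];
Precision = sum(precision)/nClasses;

% Per-class recall: diagonal element over its column sum
recall = zeros(1,nClasses);
for i = 1:nClasses
    recall(i) = confMat(i,i)/sum(confMat(:,i));
end

Recall = sum(recall)/nClasses;

% F-score of the averaged precision and recall
% (not the same as the mean of the per-class F1 scores)
F_score = 2*Recall*Precision/(Precision+Recall);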
Answered 2018-11-26T12:51:36.843