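I wrote the following code to plot classification accuracy against the threshold:
(The ground truth of the dataset contains two classes, labeled "good" or "bad".)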
LDAClassifierObject = ClassificationDiscriminant.fit(featureSelcted, groundTruthGroup, 'DiscrimType', 'linear');
[LDALabel, LDAScore] = resubPredict(LDAClassifierObject);
[~, AccuracyLDA, Thr] = perfcurve(groundTruthNumericalLable(:,1), LDAScore(:,1), 1,'yCrit','accu');
figure,
plot(Thr,AccuracyLDA,'r-');
hold on;
plot(Thr,AccuracyLDA,'bo');
xlabel('Threshold for ''good'' Returns');
ylabel('Classification Accuracy');
grid on;
[maxVal, maxInd] = max(AccuracyLDA)
maxVal =
0.8696
maxInd =
15
Thr(15)
ans =
0.7711
I also ran an ROC analysis on the same dataset (again, the ground truth contains two classes labeled "good" or "bad"):
[FPR, TPR, Thr, AUC, OPTROCPT] = perfcurve(groundTruthGroup(:,1), LDAScore(:,1), 'Good');
OPTROCPT =
0.1250 0.8667
Why is Thr(15) = 0.7711 different from OPTROCPT(2) = 0.8667?
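One thing worth checking here: according to the perfcurve documentation, OPTROCPT is a 1-by-2 array holding the [FPR TPR] coordinates of the optimal operating point, so 0.8667 would be a true positive rate rather than a threshold. A minimal sketch of how the corresponding threshold could be looked up is given below. The toy labels and scores are hypothetical and are not the data from this question; the names optIdx and optThr are also only illustrative.

```matlab
% Hypothetical toy data (NOT the dataset from the question)
labels = [1 1 1 0 1 0 0 0]';                    % 1 = 'good', 0 = 'bad'
scores = [0.9 0.8 0.7 0.6 0.55 0.4 0.3 0.2]';   % score for class 1

% Standard ROC call: X is FPR, Y is TPR, Thr holds the score thresholds
[FPR, TPR, Thr, AUC, OPTROCPT] = perfcurve(labels, scores, 1);

% OPTROCPT is [FPR TPR]; locate the ROC point with those coordinates
optIdx = find(FPR == OPTROCPT(1) & TPR == OPTROCPT(2), 1);

% Threshold at the optimal operating point; this is the value that is
% comparable to the threshold found with 'yCrit','accu'
optThr = Thr(optIdx)
```

With the variables from the question, the same lookup would be Thr((FPR == OPTROCPT(1)) & (TPR == OPTROCPT(2))), evaluated right after the ROC call, since Thr is overwritten by the second perfcurve call.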
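Is the optimal cut-off point obtained from the ROC curve (i.e. the optimal threshold, OPTROCPT) the one at which the LDA accuracy is highest?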
Or perhaps I am mistaken, in which case what exactly does perfcurve(groundTruthNumericalLable(:,1), LDAScore(:,1), 1,'yCrit','accu') tell us?
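For reference, a minimal sketch of what the accuracy criterion appears to compute, under the assumption that an observation is called positive when its score is greater than or equal to the threshold and that accuracy is the fraction of correctly classified observations, (TP + TN) / (P + N). The toy data and the names thrs, acc, bestAcc and bestThr are hypothetical and not taken from the question.

```matlab
% Hypothetical toy data (NOT the dataset from the question)
labels = [1 1 1 0 1 0 0 0]';                    % 1 = 'good', 0 = 'bad'
scores = [0.9 0.8 0.7 0.6 0.55 0.4 0.3 0.2]';   % score for class 1

% Candidate thresholds: every distinct score, from highest to lowest
thrs = sort(unique(scores), 'descend');

% Accuracy at each threshold: predict positive when score >= threshold,
% then take the fraction of predictions that match the true labels
acc = arrayfun(@(t) mean((scores >= t) == labels), thrs);

% Threshold with the highest accuracy (first one in case of ties)
[bestAcc, bestIdx] = max(acc);
bestThr = thrs(bestIdx);
```

Under these assumptions, plotting acc against thrs should trace the same kind of curve as plot(Thr, AccuracyLDA) above, and bestThr plays the role of Thr(maxInd).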