
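This is the result I got from the confusionMatrix() function in R, based on a ZeroR model. I may have set the function up incorrectly, because the results I calculated by hand do not match its output: my answers vary with the random seed, while confusionMatrix() always reports a sensitivity of exactly 1.0000: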

> sensitivity1 = 213/(213+128)
> sensitivity2 = 211/(211+130)
> sensitivity3 = 215/(215+126)
> #specificity = 0/(0+0) there were no other predictions
> specificity = 0
> specificity
[1] 0
> sensitivity1
[1] 0.6246334
> sensitivity2
[1] 0.6187683
> sensitivity3
[1] 0.6304985

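There is a warning message, but the function still appears to run and refactors the data to match, because the levels are in a different order. This may be down to how the training and test sets were sorted and randomized. I went back and checked that the train and test sets were not in reversed order because of a negative sign, and that they did not have different numbers of rows. Here is the result from caret's confusionMatrix() function: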

> confusionMatrix(as.factor(testDiagnosisPred), as.factor(testDiagnosis), positive="B") 
Confusion Matrix and Statistics

          Reference
Prediction   B   M
         B 211 130
         M   0   0
                                          
               Accuracy : 0.6188          
                 95% CI : (0.5649, 0.6706)
    No Information Rate : 0.6188          
    P-Value [Acc > NIR] : 0.524           
                                          
                  Kappa : 0               
                                          
 Mcnemar's Test P-Value : <2e-16          
                                          
            Sensitivity : 1.0000          
            Specificity : 0.0000          
         Pos Pred Value : 0.6188          
         Neg Pred Value :    NaN          
             Prevalence : 0.6188          
         Detection Rate : 0.6188          
   Detection Prevalence : 1.0000          
      Balanced Accuracy : 0.5000          
                                          
       'Positive' Class : B               
                                          
Warning message:
In confusionMatrix.default(as.factor(testDiagnosisPred), as.factor(testDiagnosis),  :
  Levels are not in the same order for reference and data. Refactoring data to match.
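
The warning is probably caused by the two factors having different level sets: as.factor() on a prediction vector that contains only "B" yields the single level "B", while the reference has both "B" and "M". A minimal sketch, assuming that is the cause, which gives both factors the same levels explicitly. The short vectors are made-up stand-ins for testDiagnosisPred and testDiagnosis, not the real data:

```r
# Hypothetical stand-ins for testDiagnosisPred and testDiagnosis
pred_chr <- c("B", "B", "B", "B", "B")
ref_chr  <- c("B", "B", "M", "M", "B")

# Declare the full set of classes once and reuse it for both factors
lv   <- c("B", "M")
pred <- factor(pred_chr, levels = lv)
ref  <- factor(ref_chr,  levels = lv)

levels(as.factor(pred_chr))            # only "B": the level "M" is missing
levels(pred)                           # "B" "M"
identical(levels(pred), levels(ref))   # both factors now share the same levels
table(Prediction = pred, Reference = ref)
```

Passing pred and ref built this way to confusionMatrix(pred, ref, positive = "B") should avoid the refactoring warning. The statistics themselves should not change, since the function was already realigning the levels.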

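testDiagnosisPred just shows that the model guesses benign (B) as the diagnosis for every cancer test in the dataset. The counts vary by seed, because the actual benign (B) and malignant (M) outcomes in the split are randomized each time.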
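
For reference, a ZeroR prediction like this can be sketched in base R as follows. This is only an illustration of the idea, not necessarily how testDiagnosisPred was built. The names train_labels, n_test, majority and zero_r_pred are hypothetical, and the labels are made up:

```r
# Hypothetical training labels; the real ones would come from cancerdata.train$Diagnosis
train_labels <- c("B", "B", "M", "M", "M", "B", "B", "B")
n_test <- 5  # hypothetical number of test rows

# ZeroR: ignore every predictor and pick the most frequent training class
majority <- names(which.max(table(train_labels)))

# Predict that single class for every test row
zero_r_pred <- rep(majority, n_test)
table(zero_r_pred)
```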

testDiagnosisPred
  B 
341 
> ## testDiagnosisPred
> ##   B 
> ## 228
> 
> majorityClass # confusion matrix

  B   M 
211 130 
> ## 
> ##   B   M 
> ## 213 128
> 
> # another seed's confusion matrix
> ## B   M 
> ## 211 130 

Here is some of the data, shown with the head() and str() functions:

> head(testDiagnosisPred)
[1] "B" "B" "B" "B" "B" "B"
> head(cancerdata.train$Diagnosis)
[1] "B" "B" "M" "M" "M" "B"
> head(testDiagnosis)
[1] "B" "B" "M" "M" "M" "B"
> 
> str(testDiagnosisPred)
 chr [1:341] "B" "B" "B" "B" "B" "B" "B" "B" "B" "B" "B" "B" "B" "B" "B" "B" "B" "B" "B" "B" "B" "B" "B" "B" ...
> str(cancerdata.train$Diagnosis)
 chr [1:341] "B" "B" "M" "M" "M" "B" "B" "B" "M" "M" "M" "B" "M" "M" "B" "M" "B" "B" "B" "M" "B" "B" "B" "B" ...
> str(testDiagnosis)
 chr [1:341] "B" "B" "M" "M" "M" "B" "B" "B" "M" "M" "M" "B" "M" "M" "B" "M" "B" "B" "B" "M" "B" "B" "B" "B" ...
> 

1 Answer


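My confusion about the confusion matrix, and about the specificity and sensitivity calculations, came from reading the matrix horizontally instead of vertically. The correct answer is the one from caret's confusionMatrix() function. Another way to see this is that the model is a ZeroR model. After further investigation, I found that such a model always has a sensitivity of 1.00 and a specificity of 0.00 when the majority class is the positive class, as it is here. That is because a ZeroR model uses zero rules and zero attributes and simply predicts the majority class.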

> confusionMatrix(as.factor(testDiagnosisPred), as.factor(testDiagnosis), positive="B") 
Confusion Matrix and Statistics

          Reference
Prediction   B   M
         B 211 130
         M   0   0
                                          
               Accuracy : 0.6188                  
                                          
            Sensitivity : 1.0000          
            Specificity : 0.0000 
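
Reading the matrix vertically can be checked by hand. The sketch below rebuilds the matrix from the output above and computes both statistics over the Reference columns. The variable names are only illustrative:

```r
# Confusion matrix from the output above: rows = Prediction, columns = Reference
cm <- matrix(c(211, 0, 130, 0), nrow = 2,
             dimnames = list(Prediction = c("B", "M"),
                             Reference  = c("B", "M")))

positive <- "B"
negative <- "M"

# Sensitivity: of all cases that are truly B (the B column), how many were predicted B
sensitivity <- cm[positive, positive] / sum(cm[, positive])  # 211 / (211 + 0)

# Specificity: of all cases that are truly M (the M column), how many were predicted M
specificity <- cm[negative, negative] / sum(cm[, negative])  # 0 / (130 + 0)

sensitivity
specificity
```

Both denominators are column sums, which is why the 130 malignant cases never enter the sensitivity calculation.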

When I did these manual specificity and sensitivity calculations, I misread the confusion matrix horizontally rather than vertically:

> sensitivity1 = 213/(213+128)
> sensitivity2 = 211/(211+130)
> sensitivity3 = 215/(215+126)
> #specificity = 0/(0+0) there were no other predictions
> specificity = 0
> specificity
[1] 0
> sensitivity1
[1] 0.6246334
> sensitivity2
[1] 0.6187683
> sensitivity3
[1] 0.6304985
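
The numbers above are not meaningless. Dividing along the Prediction row gives the positive predictive value, which matches the "Pos Pred Value : 0.6188" line in the confusionMatrix() output. A short sketch comparing the two readings, using the per-seed counts from the calculations above. The names tp, fp and fn are mine:

```r
# Counts for the three seeds, taken from the manual calculations above
tp <- c(seed1 = 213, seed2 = 211, seed3 = 215)  # predicted B, actually B
fp <- c(seed1 = 128, seed2 = 130, seed3 = 126)  # predicted B, actually M
fn <- 0  # ZeroR never predicts M, so no true B is ever missed

# Row-wise reading (what I computed): positive predictive value
row_wise <- tp / (tp + fp)

# Column-wise reading (correct for sensitivity)
col_wise <- tp / (tp + fn)

round(row_wise, 4)
col_wise
```

The row-wise values change with the seed because the class balance of the test split changes, while the column-wise sensitivity is 1 for every seed.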
Answered 2021-09-21T02:33:00.403