
I am using a random forest model from the caret R package for supervised classification. The variable importance scores are:

> imp
rf variable importance

  variables are sorted by average importance across the classes
  only 20 most important variables shown (out of 1072)

            S1     S2    S3
4431803 65.255 100.00 81.10
875118  98.548  83.17 76.34
DPH5298 76.253  64.65 65.52
L09963  73.734  55.34 62.27
L06919  68.265  36.08 67.35
L01951  29.271  44.96 65.14
SG01650 64.247  62.11 60.36
191797  62.054  51.16 56.09
L01455  21.829  49.09 59.42
DPH1252 47.619  59.38 36.41
SG00716 55.383  52.48 27.83
979261  37.371  54.99 29.40
543491  45.184  53.74 53.49
L00086  53.671  26.54 49.57
SG00379 35.353  23.06 53.66
4430843 52.587  53.65 47.06
L00680   4.569  46.35 53.49
L02770  26.357  42.34 52.95
995149  32.154  48.58 51.63
L00313  32.313   7.67 50.93

However, when I manually sort the features by their mean importance score across the classes, the order is different:

> xx=rowMeans(imp$importance)
> head(sort(xx, decreasing=T), n=10)
  875118  4431803  DPH5298   L09963  SG01650   L06919   191797  4430843   543491  DPH1252
86.01983 82.11727 68.80837 63.78019 62.23624 57.23003 56.43629 51.09914 50.80446 47.80281

Is this a bug, or am I missing something?
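As a quick check, here is a minimal sketch that compares two candidate orderings against the printed one. It uses only the first six rows copied from the table above; `top`, `by_max`, and `by_mean` are hypothetical names introduced for this illustration. If the printed order matches the row-maximum order rather than the row-mean order, that would suggest the print method ranks by the largest per-class value despite what the header message says. This is an inference from the output shown here, not a statement about caret's documented behavior.

```r
# First six rows copied from the printed importance table above.
# `top` is a hypothetical name used only for this illustration.
top <- data.frame(
  S1 = c(65.255, 98.548, 76.253, 73.734, 68.265, 29.271),
  S2 = c(100.00, 83.17, 64.65, 55.34, 36.08, 44.96),
  S3 = c(81.10, 76.34, 65.52, 62.27, 67.35, 65.14),
  row.names = c("4431803", "875118", "DPH5298", "L09963", "L06919", "L01951")
)

# Feature names ordered by the per-row maximum across classes.
by_max <- names(sort(apply(top, 1, max), decreasing = TRUE))

# Feature names ordered by the per-row mean across classes.
by_mean <- names(sort(rowMeans(top), decreasing = TRUE))

# Does the printed order match the row-maximum order?
identical(by_max, rownames(top))

# Does the printed order match the row-mean order?
identical(by_mean, rownames(top))
```

For these six rows the first comparison is TRUE and the second is FALSE, because 875118 has the higher mean while 4431803 has the higher maximum. The same comparison can be run on the full `imp$importance` data frame.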
