This is a question about the output of Random Forest in R.

I understand what the Gini impurity and mean accuracy plots represent. I have a large number of different response variables and have been fitting MANY different random forests (separately on each course).

The resulting top predictors are usually fairly similar between the two output plots (mean accuracy and node purity). What confuses me is that one forest's output has a single variable with very high node purity (followed by a large gap before the next variable), yet the same variable is VERY low on the mean accuracy plot, almost at the bottom.

Assuming I've correctly understood what I've read, including the answers already given on this forum:

  • How can the same variable have high importance by node purity but very low importance by mean accuracy? This doesn't make sense to me and makes me suspicious of my results.

Any insight would be greatly appreciated!
