r - R - MLR - randomForestSRC - model is huge and prediction is very slow - how can both be reduced?

Question

I trained a classification randomForestSRC model ( https://www.rdocumentation.org/packages/randomForestSRC/versions/2.6.0 ) using MLR. The model is many GB in size, and prediction is very slow for each instance.

What can we strip from the model to reduce its size, and ideally the prediction time as well?

Note that some tests show that predicting 100 items takes essentially the same time as predicting 1.

**Prediction: 1 observations**
predict.type: prob
threshold: 0=0.50,1=0.50
**time: 70.25**

**Prediction: 100 observations**
predict.type: prob
threshold: 0=0.50,1=0.50
**time: 69.82**

https://kogalur.github.io/randomForestSRC/theory.html

score 3 · Accepted Answer

If you are not tied to this particular implementation of classification forests, you may want to try ranger ("classif.ranger").

You can find a comparison of implementations here: https://www.jstatsoft.org/article/view/v077i01
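
As a rough illustration of switching learners, here is a minimal sketch that trains "classif.ranger" through mlr with probability output, as in the question. The task (`sonar.task`, an example task shipped with mlr) and the `num.trees` value are assumptions chosen only for demonstration, not settings from the original problem, and the ranger package must be installed.

```r
library(mlr)

# Assumption: sonar.task is an example binary classification task bundled
# with mlr; substitute your own task here.
task <- sonar.task

# ranger-backed learner with probability predictions (predict.type: prob).
# num.trees = 200 is an arbitrary illustrative value.
lrn <- makeLearner("classif.ranger", predict.type = "prob", num.trees = 200)

model <- train(lrn, task)

# Predict a handful of rows and inspect model size and timing.
new_rows <- getTaskData(task)[1:5, ]
elapsed <- system.time(pred <- predict(model, newdata = new_rows))

print(object.size(model), units = "MB")
print(elapsed)
```

Comparing the printed size and elapsed time against the same measurements for the randomForestSRC model on your own data is the only reliable way to tell whether the switch helps.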

score 3 · Accepted Answer

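There are a few parameters you can tune to reduce the size of the model. In particular:

Reduce ntree, the number of trees
Increase nodesize, so that each leaf holds more data points
Reduce nodedepth to get shallower trees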
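
The effect of these three parameters can be checked with a small experiment. The sketch below calls `rfsrc()` directly rather than going through MLR, and uses the built-in `iris` data purely as a stand-in. The specific values (100 trees, nodesize 10, nodedepth 5) are illustrative assumptions, not recommendations.

```r
library(randomForestSRC)

set.seed(1)

# Baseline forest with many trees and default node settings.
big <- rfsrc(Species ~ ., data = iris, ntree = 500)

# Smaller forest: fewer trees, larger leaves, capped tree depth.
small <- rfsrc(Species ~ ., data = iris,
               ntree = 100,     # fewer trees
               nodesize = 10,   # more data points per leaf
               nodedepth = 5)   # shallower trees

# Compare in-memory sizes of the two fitted objects.
print(object.size(big), units = "MB")
print(object.size(small), units = "MB")
```

A smaller forest usually costs some accuracy, so it is worth checking the error rate of each configuration before settling on one.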
