python - Random forest implementation in Python

Question

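Hi all!

Can anyone advise me on a random forest implementation in Python? Ideally, I need something that outputs as much information about the classifier as possible, in particular:

(1) which vectors from the training set are used to train each decision tree
(2) which features are randomly selected at each node of each tree, which samples from the training set end up in that node, which feature is chosen for the split, and which threshold is used for the split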

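I have found quite a few implementations. The best-known one is probably from scikit, but it is not clear how to do (1) and (2) there (see this question). Other implementations seem to have the same problem, except the one from OpenCV, but that one is written in C++ (and the Python interface does not cover all of the random forest methods).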

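Does anyone know of something that satisfies (1) and (2)? Alternatively, does anyone know how to improve the scikit implementation to get features (1) and (2)?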
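For requirement (1), one possible workaround is to do the bagging by hand around plain decision trees, so that the bootstrap indices are recorded explicitly. The following is only a sketch under assumptions: the iris dataset, the number of trees, the names `bootstrap_indices` and `predict_majority`, and the use of a simple majority vote are all choices made for this example, and the result is not identical to scikit's own forest.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
rng = np.random.default_rng(0)

n_trees = 5  # arbitrary size chosen for this example
trees = []
bootstrap_indices = []  # bootstrap_indices[t] = training rows used by tree t

for t in range(n_trees):
    # Draw a bootstrap sample: n rows sampled with replacement.
    idx = rng.integers(0, X.shape[0], size=X.shape[0])
    # max_features="sqrt" makes each split consider a random feature subset.
    tree = DecisionTreeClassifier(max_features="sqrt", random_state=t)
    tree.fit(X[idx], y[idx])
    trees.append(tree)
    bootstrap_indices.append(idx)


def predict_majority(trees, X):
    """Combine the trees with a plain majority vote (assumes integer labels)."""
    votes = np.stack([t.predict(X) for t in trees]).astype(int)
    return np.array([np.bincount(col).argmax() for col in votes.T])


pred = predict_majority(trees, X)
print("rows used by tree 0:", np.unique(bootstrap_indices[0])[:10], "...")
print("training accuracy:", (pred == y).mean())
```

Because the indices are stored next to each tree, the rows that trained a given tree can be looked up directly with `bootstrap_indices[t]`.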

Solved: I looked at the source code of sklearn.tree._tree.Tree. It has good comments (they describe the tree completely):

children_left : int*
    children_left[i] holds the node id of the left child of node i.
    For leaves, children_left[i] == TREE_LEAF. Otherwise,
    children_left[i] > i. This child handles the case where
    X[:, feature[i]] <= threshold[i].

children_right : int*
    children_right[i] holds the node id of the right child of node i.
    For leaves, children_right[i] == TREE_LEAF. Otherwise,
    children_right[i] > i. This child handles the case where
    X[:, feature[i]] > threshold[i].

feature : int*
    feature[i] holds the feature to split on, for the internal node i.

threshold : double*
    threshold[i] holds the threshold for the internal node i.
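As a rough illustration of the arrays described in these comments, the sketch below fits a single DecisionTreeClassifier and lists the split feature and threshold of every node. The iris dataset, the depth limit, and the helper name `describe_nodes` are assumptions made for the example. Leaves are detected by checking that both child ids are equal, which holds because both are set to TREE_LEAF.

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
clf = DecisionTreeClassifier(max_depth=2, random_state=0).fit(X, y)


def describe_nodes(tree):
    """Return (node_id, kind, feature, threshold) for every node of a fitted tree_."""
    rows = []
    for i in range(tree.node_count):
        left = tree.children_left[i]
        right = tree.children_right[i]
        if left == right:
            # Both children equal TREE_LEAF, so node i is a leaf.
            rows.append((i, "leaf", None, None))
        else:
            # Internal node: samples with X[:, feature] <= threshold go left.
            rows.append((i, "split", int(tree.feature[i]), float(tree.threshold[i])))
    return rows


nodes = describe_nodes(clf.tree_)
for row in nodes:
    print(row)
```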

score 2 · Accepted Answer

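You can get nearly all of that information in scikit-learn. What exactly was the problem? You can even visualize the trees using dot. I don't think you can find out which split candidates were sampled at random, but you can find out which ones were finally selected.

Edit: look at the tree_ attribute of the decision tree. I agree that it is not well documented. There really should be an example that visualizes the leaf distributions and so on. You can look at the visualization function to see how to access the attributes.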
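A minimal sketch of what this can look like for a forest, assuming the iris dataset and a small forest chosen only for illustration. It exports one tree as dot source with export_graphviz, reads the chosen splits from tree_, and uses decision_path to see which rows reach each node. Note that decision_path routes every row of the array passed in, so the result is not limited to the rows that particular tree was trained on.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.tree import export_graphviz

X, y = load_iris(return_X_y=True)
forest = RandomForestClassifier(n_estimators=3, max_depth=2, random_state=0).fit(X, y)

# Each fitted tree of the forest is available in estimators_.
first_tree = forest.estimators_[0]

# With out_file=None the dot source is returned as a string,
# which can then be rendered with Graphviz.
dot_source = export_graphviz(first_tree, out_file=None)

# Chosen split feature and threshold for every internal node of every tree.
for t, est in enumerate(forest.estimators_):
    tr = est.tree_
    for i in range(tr.node_count):
        if tr.children_left[i] != tr.children_right[i]:  # internal node
            print(f"tree {t}, node {i}: feature {tr.feature[i]} <= {tr.threshold[i]:.3f}")

# decision_path returns a sparse (n_samples, n_nodes) indicator matrix.
indicator = first_tree.decision_path(X).toarray()
samples_at_node = {
    node: np.flatnonzero(indicator[:, node])
    for node in range(first_tree.tree_.node_count)
}
print("rows reaching node 1:", samples_at_node[1][:10], "...")
```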
