python - How to amend the splitting criteria (gini/entropy) in a decision tree algorithm in Scikit-Learn?

Question

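I am using a decision tree algorithm on a binary classification problem, and the goal is to minimise the false positives of the classification, i.e. maximise the positive predictive value (the diagnostic tool is very costly).

Is there a way to introduce a weight into the gini/entropy splitting criteria to penalise false-positive misclassifications?

In other words, is there any way to implement this in Scikit-learn?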

EDIT

Playing with class_weight produced the following results:

from sklearn import datasets as dts
from sklearn import tree
iris_data = dts.load_iris()

# load_iris() exposes the arrays as .data and .target
X, y = iris_data.data, iris_data.target
# take only classes 1 and 2 due to less separability
X = X[y>0]
y = y[y>0]
y = y - 1 # make binary labels

# define the decision tree classifier with only two levels at most and no class balance
dt = tree.DecisionTreeClassifier(max_depth=2, class_weight=None)

# fit the model, no train/test for simplicity
dt.fit(X[:55,:2], y[:55])

Plotting the decision boundary and the tree, with blue being positive (1):

While overweighting the minority (or more valuable) class:

dt_100 = tree.DecisionTreeClassifier(max_depth=2, class_weight={1:100})
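
One way to check what the weighting does to false positives is to compare confusion matrices and precision across weight settings. The following is a minimal sketch, assuming the same 55-row, two-feature subset of iris used above; the helper name `fit_and_report` is hypothetical, and scoring on the training rows is for illustration only. Note that penalising false positives generally means upweighting the negative class (0), since that makes misclassified negatives more costly, whereas upweighting class 1 tends to push the tree toward predicting positive.

```python
from sklearn import datasets, tree
from sklearn.metrics import confusion_matrix, precision_score

iris = datasets.load_iris()
X, y = iris.data, iris.target
X, y = X[y > 0], y[y > 0] - 1      # keep classes 1 and 2, relabel as 0/1
X, y = X[:55, :2], y[:55]          # same subset as above: 50 negatives, 5 positives

def fit_and_report(class_weight):
    """Fit a depth-2 tree and return its false positives, false negatives and precision."""
    clf = tree.DecisionTreeClassifier(
        max_depth=2, class_weight=class_weight, random_state=0
    )
    clf.fit(X, y)
    pred = clf.predict(X)
    tn, fp, fn, tp = confusion_matrix(y, pred, labels=[0, 1]).ravel()
    return {
        "fp": int(fp),
        "fn": int(fn),
        # zero_division=0 avoids a warning when no positives are predicted
        "precision": float(precision_score(y, pred, zero_division=0)),
    }

settings = {
    "unweighted": None,
    "positive_x100": {0: 1, 1: 100},
    "negative_x100": {0: 100, 1: 1},
}
results = {name: fit_and_report(cw) for name, cw in settings.items()}
for name, report in results.items():
    print(name, report)
```

On real data the comparison should be made on a held-out set, since training-set precision of a tree is optimistic.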

score 1 · Accepted Answer

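The Decision Tree Classifier supports the class_weight argument.

In two-class problems this can exactly solve your issue. Typically it is used for unbalanced problems. For more than two classes, it is not possible to provide the individual labels (as far as I know).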
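
To illustrate how class weights can enter the gini criterion: as far as I understand, scikit-learn turns class_weight into per-sample weights, and node impurity is then computed from weighted class totals rather than raw counts. The sketch below reproduces that idea by hand; `weighted_gini` is a hypothetical helper written for this example, not a scikit-learn function, and it assumes integer class labels.

```python
import numpy as np

def weighted_gini(y, class_weight):
    """Gini impurity of a node where every sample counts with its class weight.

    Classes missing from class_weight get weight 1.0.
    """
    y = np.asarray(y)
    classes = np.unique(y)
    # Weighted number of samples per class in this node
    totals = np.array(
        [class_weight.get(int(c), 1.0) * np.sum(y == c) for c in classes],
        dtype=float,
    )
    p = totals / totals.sum()          # weighted class proportions
    return 1.0 - np.sum(p ** 2)

node = [0] * 9 + [1]                   # 9 negatives, 1 positive

# Unweighted: p = (0.9, 0.1), so gini = 1 - (0.81 + 0.01), about 0.18
print(weighted_gini(node, {}))

# Positive class weighted x9: p = (0.5, 0.5), so gini = 0.5
print(weighted_gini(node, {1: 9}))
```

With the weight applied, the same node looks maximally impure, so the tree has more incentive to split it; this is the mechanism by which class_weight shifts the chosen splits and the leaf predictions.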

