python - RFECV does not indicate the top 5 features as expected

Translated from: https://stackoverflow.com/questions/69557685 2021-10-13T14:49:30.790


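I am following these scikit-learn docs. My code is at the bottom for reference.

The documentation example states "The following example shows how to retrieve the a-priori not known 5 informative features", but when I run similar code I only get 1 or 2 features, depending on which estimator I use.

In addition, I have explored these variables manually, and I know I can easily improve the adjusted R-squared by including other variables. So my question has two parts: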

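What criterion does RFECV use to pick winners and losers?
In the documentation example, are only 5 features selected because that happens to be the number of rank 1 results?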

My code:

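import statsmodels.api as sm
from sklearn.feature_selection import RFECV
from sklearn.linear_model import LinearRegression

# `analysis` is my own helper object (not shown here)
deskewed = analysis.getDeskewedData()

# Full interaction model: all main effects and interactions, plus an intercept
kitchen_sink_formula = '''hirability ~
    + gender*favor_programming_career*favor_seeking_risk*industry
    + 1'''

# Use statsmodels only to build the design matrices from the formula
kitchen_sink_model = sm.OLS.from_formula(kitchen_sink_formula, data=deskewed)

y = kitchen_sink_model.endog
X = kitchen_sink_model.exog

estimator = LinearRegression()

# Recursive feature elimination with 5-fold cross-validation
rfe_model = RFECV(estimator, step=1, cv=5)
results = rfe_model.fit(X, y)

print(results.support_)  # only 1 feature is True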
