I get the following error:
ValueError: feature_names mismatch: ['f0', 'f1', 'f2', 'f3', 'f4', 'f5', 'f6', 'f7', 'f8', 'f9', 'f10'] ['Name', 'year', 'month', 'License', 'Facility Type', 'City_Chicago', 'Latitude', 'Longitude', 'Risk', 'Inspection Type', 'Violations']
expected f10, f6, f5, f0, f3, f2, f4, f7, f9, f1, f8 in input data
training data did not have the following fields: Name, month, Longitude, City_Chicago, Latitude, License, Risk, Facility Type, year, Inspection Type, Violations
The code is as follows.
print(clf)
Here is the output:
RandomizedSearchCV(cv=None, error_score=nan,
                   estimator=XGBClassifier(base_score=0.5, booster='gbtree',
                                           colsample_bylevel=1,
                                           colsample_bynode=1,
                                           colsample_bytree=1, gamma=0,
                                           learning_rate=0.2, max_delta_step=0,
                                           max_depth=7, min_child_weight=1,
                                           missing=None, n_estimators=1000,
                                           n_jobs=-1, nthread=None,
                                           objective='binary:logistic',
                                           random_state=2, reg_alpha=0,
                                           reg_lambda...
                                           verbosity=0),
                   iid='deprecated', n_iter=3, n_jobs=-1,
                   param_distributions={'xgbclassifier__learning_rate': [0.01,
                                                                         0.05,
                                                                         0.1,
                                                                         0.15,
                                                                         0.2],
                                        'xgbclassifier__max_delta_step': [1, 2,
                                                                          5],
                                        'xgbclassifier__max_depth': [3, 5, 7,
                                                                     9],
                                        'xgbclassifier__n_estimators': [100,
                                                                        200,
                                                                        500,
                                                                        1000]},
                   pre_dispatch='2*n_jobs', random_state=None, refit=True,
                   return_train_score=False, scoring='f1', verbose=1)
PDP plot:
import matplotlib.pyplot as plt
import pandas as pd
from pdpbox.pdp import pdp_isolate, pdp_plot

plt.rcParams['figure.dpi'] = 144
X_test_df = pd.DataFrame(X_test_processed, columns=X_test.columns)
feature = 'Violations'  # top-ranked feature by permutation importance
pdp_isolated = pdp_isolate(
model=clf,
dataset=X_test_df,
model_features=X_test_df.columns,
feature=feature
)
pdp_plot(pdp_isolated, feature_name=feature);
What is the problem?