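I am using H2O in Python to build a classification model. I was able to build a GBM model and make predictions on both the training and test datasets, but when I build an XGBoost model and try to make predictions, the prediction step fails.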
Below is the GBM code (works fine):
from h2o.estimators.gbm import H2OGradientBoostingEstimator
model_rf_v3 = H2OGradientBoostingEstimator(model_id='mojo_test_v4', ntrees=259, max_depth=6, categorical_encoding = 'OneHotExplicit', learn_rate = 0.1)
model_rf_v3.train(y = myResponse_rf,x = myCat + myNum, training_frame=hf_train_h2o,
validation_frame = hf_test_h2o)
pred = model_rf_v3.predict(hf_test_h2o)[:,2]
XGBoost code (fails):
from h2o.estimators.xgboost import H2OXGBoostEstimator
model_rf_vn = H2OXGBoostEstimator(ntrees=259, learn_rate = 0.05, stopping_metric = "misclassification", categorical_encoding = 'OneHotExplicit', tree_method="hist", grow_policy="lossguide", max_depth = 9)
model_rf_vn.train(y = myResponse_rf,x = myCat + myNum, training_frame=hf_train_h2o,
validation_frame = hf_test_h2o)
pred = model_rf_vn.predict(hf_test_h2o)[:,2] ## Error at this point
Error:
job with key $0300ffffffff$_9cacec5e7e4540e343f43ac2ce3e459e failed with an exception: DistributedException from ha880datanode-14.fab4.prod.booking.com/10.220.205.163:54321: '63', caused by java.lang.ArrayIndexOutOfBoundsException: 63"
If the problem were in the dataset, I would expect GBM to raise an error as well. Does prediction work differently for XGBoost, or am I missing something?
Thanks in advance,
Vishal