Catboost 现在支持early_stopping_rounds
:fit 方法参数
将过拟合检测器类型设置为 Iter 并在指定迭代次数后停止训练,因为迭代具有最佳度量值。
这与 xgboost 中的工作方式非常相似early_stopping_rounds
。
这是一个例子:
from catboost import CatBoostRegressor, Pool
from sklearn.model_selection import train_test_split
import numpy as np
y = np.random.normal(0, 1, 1000)
X = np.random.normal(0, 1, (1000, 1))
X[:, 0] += y * 2
X_train, X_eval, y_train, y_eval = train_test_split(X, y, test_size=0.1)
train_pool = Pool(X, y)
eval_pool = Pool(X_eval, y_eval)
model = CatBoostRegressor(iterations=1000, learning_rate=0.1)
model.fit(X, y, eval_set=eval_pool, early_stopping_rounds=10)
结果应该是这样的:
522: learn: 0.3994718 test: 0.4294720 best: 0.4292901 (514) total: 957ms remaining: 873ms
523: learn: 0.3994580 test: 0.4294614 best: 0.4292901 (514) total: 958ms remaining: 870ms
524: learn: 0.3994495 test: 0.4294806 best: 0.4292901 (514) total: 959ms remaining: 867ms
Stopped by overfitting detector (10 iterations wait)
bestTest = 0.4292900745
bestIteration = 514
Shrink model to first 515 iterations.