So I'm relatively new to the ML/AI game in Python, and I'm currently working on a problem surrounding the implementation of a custom objective function for XGBoost.

My differential equations knowledge is pretty rusty, so to make sure I'm doing all of this correctly, I created a custom objective function, with a gradient and a hessian, that models the mean squared error function that runs as the default objective in XGBRegressor. The problem is that the models' results (the error outputs) are close but, for the most part, not identical (and at some points they diverge). I don't know what I'm doing wrong, or how that's even possible if I'm computing things correctly. If you all could take a look at this and maybe give some insight into where I'm going wrong, that would be awesome!

The original code without the custom function is:

    import xgboost as xgb

    reg = xgb.XGBRegressor(n_estimators=150,
                           max_depth=2,
                           objective="reg:squarederror",
                           n_jobs=-1)

    reg.fit(X_train, y_train)

    y_pred_test = reg.predict(X_test)
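
The "error output" I'm comparing between the two runs is the test MSE (a sketch of that check; y_test here is the held-out target from the same split):

    from sklearn.metrics import mean_squared_error

    # Test-set mean squared error for the model above.
    print(mean_squared_error(y_test, y_pred_test))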

My custom MSE objective function is as follows:

    def gradient_se(y_true, y_pred):
        # Compute the gradient of the squared error.
        return (-2 * y_true) + (2 * y_pred)

    def hessian_se(y_true, y_pred):
        # Compute the hessian (a constant 2) for the squared error.
        return 0*(y_true + y_pred) + 2

    def custom_se(y_true, y_pred):
        # Squared error objective. A simplified version of MSE used as
        # the objective function.

        grad = gradient_se(y_true, y_pred)
        hess = hessian_se(y_true, y_pred)
        return grad, hess
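
For completeness, this is how I'm plugging the custom objective in; only the objective argument changes, and everything else matches the original run:

    reg_custom = xgb.XGBRegressor(n_estimators=150,
                                  max_depth=2,
                                  objective=custom_se,
                                  n_jobs=-1)

    reg_custom.fit(X_train, y_train)
    y_pred_test_custom = reg_custom.predict(X_test)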

The documentation reference is here.

Thanks!

1 Answer

According to the documentation, the library passes the predicted values (y_pred in your case) and the ground truth values (y_true in your case) in this order.

You pass the y_true and y_pred values in reversed order in your custom_se(y_true, y_pred) function to both the gradient_se and hessian_se functions. For the hessian it doesn't make a difference since the hessian should return 2 for all x values and you've done that correctly.

For the gradient_se function, the argument swap flips the signs on y_true and y_pred: for the squared error l = (y_pred - y_true)^2, the gradient with respect to the prediction is 2*(y_pred - y_true), so with the arguments reversed your function returns that gradient with the wrong sign.

The correct implementation is as follows:

    def gradient_se(y_pred, y_true):
        # Compute the gradient of the squared error.
        return 2*(y_pred - y_true)

    def hessian_se(y_pred, y_true):
        # Compute the hessian (a constant 2) for the squared error.
        return 0*y_true + 2

    def custom_se(y_pred, y_true):
        # Squared error objective. A simplified version of MSE used as
        # the objective function.

        grad = gradient_se(y_pred, y_true)
        hess = hessian_se(y_pred, y_true)
        return grad, hess
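
One way to sanity-check the fix is to train with the custom objective and compare against the builtin reg:squarederror model from the question (a sketch reusing reg, X_train, y_train and X_test from there; see the argument-ordering caveat in the update below):

    import numpy as np

    reg_custom = xgb.XGBRegressor(n_estimators=150,
                                  max_depth=2,
                                  objective=custom_se,
                                  n_jobs=-1)
    reg_custom.fit(X_train, y_train)

    # Largest absolute difference between the two models' predictions.
    # If the argument ordering matches the API you are using, this
    # should be at or near zero.
    print(np.abs(reg.predict(X_test) - reg_custom.predict(X_test)).max())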

Update: Please keep in mind that the native XGBoost implementation and the implementation of the sklearn wrapper for XGBoost use a different ordering of the arguments. The native implementation takes predictions first and true labels (dtrain) second, while the sklearn implementation takes the true labels (dtrain) first and the predictions second.
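
As a sketch of the two calling conventions (the function and parameter names here are illustrative, not part of either API):

    import numpy as np
    import xgboost as xgb

    # sklearn wrapper: true labels first, predictions second.
    def squared_error_sklearn(y_true, y_pred):
        grad = 2 * (y_pred - y_true)
        hess = np.full_like(y_pred, 2.0)
        return grad, hess

    # Native API: predictions first, then the DMatrix holding the labels.
    def squared_error_native(preds, dtrain):
        labels = dtrain.get_label()
        grad = 2 * (preds - labels)
        hess = np.full_like(preds, 2.0)
        return grad, hess

    reg_sklearn = xgb.XGBRegressor(objective=squared_error_sklearn)
    # booster = xgb.train(params, dtrain, obj=squared_error_native)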

answered 2020-01-10T21:33:52.243