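I'd like to start by saying that I am still quite new to xgboost, pandas, and numpy.
I am currently implementing a custom objective function for XGBoost based on the Kelly criterion. The approach is taken from another post on datascience.stackexchange: https://datascience.stackexchange.com/questions/16186/kelly-criterion-in-xgboost-loss-function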
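From reading the XGBoost documentation (https://xgboost.readthedocs.io/en/latest/tutorials/custom_metric_obj.html), I need to return the gradient and the Hessian. The gradient of the function is:
$$\frac{-(b+1)\,p + b\,x + 1}{(x-1)(b\,x+1)}$$
The Hessian of the function is:
$$-\frac{b^{2}\,p}{(b\,x+1)^{2}} - \frac{1-p}{(1-x)^{2}}$$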
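where:
b = the odds received on the bet
p = the probability of winning
x = the algorithm's prediction
For this purpose I treat p as a binary variable, 1 or 0, indicating whether the bet won.
Therefore, p = the true outcome, 1 or 0.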
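As a sanity check on the two expressions, here is a minimal sketch that compares them against central finite differences. It assumes the underlying function is the Kelly log-growth f(x) = p*log(1 + b*x) + (1 - p)*log(1 - x); that form is my assumption, chosen because its derivatives match the gradient and Hessian coded later in this post. The helper names and the test values are hypothetical.

```python
import numpy as np

def kelly_log_growth(x, p, b):
    # Assumed objective: p*log(1 + b*x) + (1 - p)*log(1 - x)
    return p * np.log(1.0 + b * x) + (1.0 - p) * np.log(1.0 - x)

def kelly_gradient(x, p, b):
    # First derivative with respect to the prediction x
    return (-(b + 1.0) * p + b * x + 1.0) / ((x - 1.0) * (b * x + 1.0))

def kelly_hessian(x, p, b):
    # Second derivative with respect to the prediction x
    return -(b ** 2 * p) / (b * x + 1.0) ** 2 - (1.0 - p) / (1.0 - x) ** 2

# Hypothetical test points: predictions in (0, 1), binary outcomes, positive odds
x = np.array([0.1, 0.3, 0.5])
p = np.array([1.0, 0.0, 1.0])
b = np.array([0.3, 0.2, 0.4])
eps = 1e-5

# Central differences of the function and of the analytic gradient
num_grad = (kelly_log_growth(x + eps, p, b) - kelly_log_growth(x - eps, p, b)) / (2 * eps)
num_hess = (kelly_gradient(x + eps, p, b) - kelly_gradient(x - eps, p, b)) / (2 * eps)

grad_ok = np.allclose(num_grad, kelly_gradient(x, p, b), rtol=1e-6)
hess_ok = np.allclose(num_hess, kelly_hessian(x, p, b), rtol=1e-6)
print(grad_ok, hess_ok)
```

If both flags come out true, the algebra is at least consistent with that assumed function for these inputs.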
Following the documentation, I wrote the code below. I have also included a small sample dataset:
import numpy as np
import xgboost as xgb

# Small sample dataset: one feature column, the odds, and the binary outcomes
kell_train_data = np.array([0.08396877, 0.07131547, 0.17921676, 0.22317006, 0.06278754, 0.29874458, 0.08079682, 0.13074108, 0.06416036, 0.12209199, 0.10400956, 0.28764891, 0.2913481, 0.09450234, 0.07858831, 0.09246751, 0.17008012, 0.29026032, 0.2741014, 0.05574227])
odds_train = np.array([0.149254, 0.108696, 0.312500, 0.217391, 0.061350, 0.208333, 0.178571, 0.065359, 0.037453, 0.107527, 0.256410, 0.400000, 0.370370, 0.085470, 0.058140, 0.204082, 0.476190, 0.294118, 0.121951, 0.033003])
y_train = np.array([0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0])
# Reshape the feature vector to a 2-D array of shape (n_samples, 1)
kell_train_data = kell_train_data.reshape(kell_train_data.shape[0], -1)

def gradient(y_pred, y_true, odds = odds_train):
    "Compute gradient of betting function"
    return (((-(odds+1)*y_true +odds*y_pred+1)/((y_pred-1)(odds*y_pred+1))))

def hessian(y_pred, y_true, odds = odds_train):
    "compute hessian of betting function"
    return (-(((odds**2)*y_true )/(odds*y_pred+1)**2)-((1-y_true)/((1-y_pred)**2)))

def kellyobjfunc(y_pred, y_true, odds = odds_train):
    "kelly objective function for xgboost"
    grad = gradient(y_pred, y_true, odds)
    hess = hessian(y_pred, y_true, odds)
    return grad, hess

kell_mod = xgb.XGBClassifier(objective = kellyobjfunc, maximize = True)
kell_mod.fit(kell_train_data, y_train)
However, when I run the code above, I get the following error:
Traceback (most recent call last):
  File "<ipython-input-623-18279e95b288>", line 1, in <module>
    kell_mod.fit( kell_target, y_train)
  File "C:\Users\USER\Anaconda3\lib\site-packages\xgboost\core.py", line 422, in inner_f
    return f(**kwargs)
  File "C:\Users\USER\Anaconda3\lib\site-packages\xgboost\sklearn.py", line 919, in fit
    callbacks=callbacks)
  File "C:\Users\USER\Anaconda3\lib\site-packages\xgboost\training.py", line 214, in train
    early_stopping_rounds=early_stopping_rounds)
  File "C:\Users\USER\Anaconda3\lib\site-packages\xgboost\training.py", line 101, in _train_internal
    bst.update(dtrain, i, obj)
  File "C:\Users\USERR\Anaconda3\lib\site-packages\xgboost\core.py", line 1285, in update
    grad, hess = fobj(pred, dtrain)
  File "C:\Users\USER\Anaconda3\lib\site-packages\xgboost\sklearn.py", line 49, in inner
    return func(labels, preds)
  File "<ipython-input-621-35f90873cb76>", line 14, in kellyobjfunc
    grad = gradient(y_pred, y_true, odds)
  File "<ipython-input-621-35f90873cb76>", line 5, in gradient
    return (((-(odds+1)*y_true +odds*y_pred+1)/((y_pred-1)(odds*y_pred+1))))
TypeError: 'numpy.ndarray' object is not callable
I am not sure what is causing this. Any insight or help would be greatly appreciated.
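In case it helps narrow things down, the same message can be produced in isolation: when an array expression is followed directly by a parenthesized expression with no operator in between, Python parses it as a function call on the array. A minimal sketch, using hypothetical arrays `a` and `b` that are not part of my data:

```python
import numpy as np

# Hypothetical stand-in arrays, unrelated to the dataset in this post
a = np.array([0.1, 0.2])
b = np.array([1.0, 2.0])

try:
    # No operator between the two groups, so this is parsed as a call
    (a - 1)(b + 1)
    message = None
except TypeError as exc:
    message = str(exc)

print(message)

# With an explicit multiplication the expression is evaluated elementwise
product = (a - 1) * (b + 1)
print(product)
```

Whether this is the only problem in my objective function is something I have not verified.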