I am working on a churn analysis. I used:
randomcv = RandomizedSearchCV(estimator=clf, param_distributions=params_grid,
                              cv=kfoldcv, n_iter=100, n_jobs=-1, scoring='roc_auc')
Everything worked fine, but then I tried a custom scoring function, like this:
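import numpy as np
from sklearn.metrics import make_scorer

def gain_fn(y_true, y_prob):
    # Gain 40000 per true positive and lose 1000 per false positive at a 0.025 threshold.
    tp = np.where((y_prob >= 0.025) & (y_true == 1), 40000, 0)
    fp = np.where((y_prob >= 0.025) & (y_true == 0), -1000, 0)
    return np.sum([tp, fp])

scorer_fn = make_scorer(gain_fn, greater_is_better=True, needs_proba=True)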
randomcv = RandomizedSearchCV(estimator=clf, param_distributions=params_grid,
                              cv=kfoldcv, n_iter=100, n_jobs=-1, scoring=scorer_fn)
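However, I want the calculation inside gain_fn to use the values for one specific class (the target has 3 possible values). How do I select the correct y_pred argument? Any suggestions? Thanks!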
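For reference, here is a minimal sketch of one possible approach, not a definitive answer. With three classes, predict_proba returns an array of shape (n_samples, n_classes) whose columns follow the order of estimator.classes_, so the column for the class of interest has to be picked explicitly. The sketch assumes a scorer callable with the signature scorer(estimator, X, y), which scikit-learn accepts for the scoring parameter. The names make_gain_scorer, target_class, threshold, tp_gain and fp_cost are hypothetical and only used for illustration. The gain and cost values are taken from gain_fn above, and treating every non-target class as a false positive is an assumption.

```python
import numpy as np


def make_gain_scorer(target_class, threshold=0.025, tp_gain=40000, fp_cost=-1000):
    """Build a scorer(estimator, X, y) callable for one class of interest.

    All parameter names here are hypothetical; the defaults mirror gain_fn.
    """

    def scorer(estimator, X, y):
        y = np.asarray(y)
        # Columns of predict_proba follow the order of estimator.classes_.
        classes = np.asarray(estimator.classes_)
        matches = np.flatnonzero(classes == target_class)
        if matches.size == 0:
            raise ValueError(f"{target_class!r} not found in estimator.classes_")
        # Keep only the probability column of the class of interest.
        proba = np.asarray(estimator.predict_proba(X))[:, matches[0]]
        flagged = proba >= threshold
        is_target = y == target_class
        # Assumption: any flagged sample of another class counts as a false positive.
        tp = np.where(flagged & is_target, tp_gain, 0)
        fp = np.where(flagged & ~is_target, fp_cost, 0)
        return float(tp.sum() + fp.sum())

    return scorer
```

The result could then be passed directly, for example scoring=make_gain_scorer(target_class=1), in place of scorer_fn. Looking the column up through classes_ avoids hard-coding an index, and the error makes it visible when a training fold does not contain the target class at all.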