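Is there a way to use GridSearchCV, or any other built-in sklearn function, to find the best hyperparameters for a OneClassSVM classifier?
What I currently do is run the search myself, using a train/test split, like this:
The gamma and nu values are defined as: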
gammas = np.logspace(-9, 3, 13)
nus = np.linspace(0.01, 0.99, 99)
The function that explores all possible hyperparameter combinations and finds the best one:
from sklearn.svm import OneClassSVM
from sklearn.metrics import (accuracy_score, precision_score, recall_score,
                             f1_score, roc_auc_score)

clf = OneClassSVM()
results = []
# vectorizer, train_contents, test_contents and y_true are defined elsewhere
train_x = vectorizer.fit_transform(train_contents)
test_x = vectorizer.transform(test_contents)
for gamma in gammas:
    for nu in nus:
        clf.set_params(gamma=gamma, nu=nu)
        clf.fit(train_x)  # fit on the training samples only, no labels
        y_pred = clf.predict(test_x)  # +1 for inliers, -1 for outliers
        if 1. in y_pred:  # Check if at least 1 review is predicted to be in the class
            results.append(((gamma, nu), (accuracy_score(y_true, y_pred),
                                          precision_score(y_true, y_pred),
                                          recall_score(y_true, y_pred),
                                          f1_score(y_true, y_pred),
                                          roc_auc_score(y_true, y_pred),
                                          ))
                           )
# Determine and print the best parameter settings and their performance
print_best_parameters(results, best_parameters(results))
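For reference, the nested loops above could be flattened with sklearn's ParameterGrid, which only enumerates the combinations and leaves the fitting and scoring manual. The following is a minimal sketch, not a replacement for a real search: the synthetic 2-D data, the smaller grid, and the choice of f1 as the single ranking metric are all assumptions made so the example is self-contained.

```python
import numpy as np
from sklearn.metrics import f1_score
from sklearn.model_selection import ParameterGrid
from sklearn.svm import OneClassSVM

# Synthetic stand-in data (assumption): inliers near the origin, outliers spread out.
rng = np.random.RandomState(0)
train_x = rng.normal(size=(100, 2))                      # inliers only
test_x = np.vstack([rng.normal(size=(20, 2)),            # inliers
                    rng.uniform(-6, 6, size=(20, 2))])   # mostly outliers
y_true = np.array([1] * 20 + [-1] * 20)

# Deliberately small grid (assumption) so the sketch runs quickly.
param_grid = ParameterGrid({
    "gamma": np.logspace(-3, 1, 5),
    "nu": np.linspace(0.05, 0.5, 4),
})

clf = OneClassSVM()
results = []
for params in param_grid:      # each item is a dict with keys "gamma" and "nu"
    clf.set_params(**params)
    clf.fit(train_x)
    y_pred = clf.predict(test_x)
    if 1 in y_pred:            # skip settings that reject every test sample
        results.append((params, f1_score(y_true, y_pred)))

# Pick the setting with the highest f1 on the held-out samples.
best_params, best_f1 = max(results, key=lambda item: item[1])
print(best_params, best_f1)
```

As in the original loop, the parameters are chosen on the same held-out samples that are used to report the scores, so the reported numbers are likely to be optimistic.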
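The results are stored in a list of tuples of the form:
((gamma, nu), (accuracy_score, precision_score, recall_score, f1_score, roc_auc_score))
To find the best accuracy, f1, and roc_auc scores and their parameters, I wrote my own function:
best_parameters(results)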
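The body of best_parameters is not shown here. A helper of this kind might look roughly like the sketch below, which assumes the result layout described above and returns, for each metric, the parameter pair with the highest score. The METRIC_INDEX mapping, the metrics argument, and the example values are hypothetical and only illustrate the shape of the data.

```python
# Position of each metric inside the scores tuple (assumed from the layout above).
METRIC_INDEX = {"accuracy": 0, "precision": 1, "recall": 2, "f1": 3, "roc_auc": 4}


def best_parameters(results, metrics=("accuracy", "f1", "roc_auc")):
    """Return {metric: ((gamma, nu), score)} for the top-scoring entry per metric."""
    best = {}
    for metric in metrics:
        idx = METRIC_INDEX[metric]
        # Each entry is ((gamma, nu), scores); rank entries by the chosen score.
        params, scores = max(results, key=lambda entry: entry[1][idx])
        best[metric] = (params, scores[idx])
    return best


# Made-up values, only to show the shape of the input and output.
example_results = [
    ((0.001, 0.10), (0.70, 0.65, 0.90, 0.75, 0.68)),
    ((0.010, 0.20), (0.80, 0.85, 0.70, 0.77, 0.79)),
    ((0.100, 0.05), (0.75, 0.70, 0.95, 0.81, 0.74)),
]
best = best_parameters(example_results)
print(best)
```

Ties are resolved in favor of the first entry encountered, since max keeps the earliest maximal element.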