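I am trying to find the optimal combination of x variables, their respective exponents, and alpha that gives the lowest mean squared error.
I am using scikit-learn's Lasso regression, but so far I have only been able to determine the minimum MSE and the combination of variables that produces it. I am not sure how to extract the alpha that produced it, or how to see whether the variables in that combination have any exponents associated with them.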
What I have achieved so far:
outcomes from the Best Lasso Regression Model:
minimum Avg Test MSE: 9172.38
The Combination of Variables: ['Date', 'Cargo_size', 'Parcel_size', 'Rest', 'Sub']
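from itertools import combinations

import numpy as np
from sklearn.linear_model import Lasso
from sklearn.model_selection import cross_validate
from sklearn.preprocessing import PolynomialFeatures

# Build every combination of 1 to 8 predictor columns.
x_combos = []
for n in range(1, 9):
    combos = combinations(['Date', 'Cargo_size', 'Parcel_size', 'Rest', 'Age',
                           'Sub', 'X_coord', 'Y_coord'], n)
    x_combos.extend(combos)

lasso_models = {}
alphas = 10**np.linspace(10, -2, 100) * .5

for n in range(0, len(x_combos)):
    combo_list = list(x_combos[n])
    x = data[combo_list]
    # Expand the selected columns into all polynomial terms up to degree 3.
    poly = PolynomialFeatures(3)
    poly_x = poly.fit_transform(x)
    # Note: normalize= was removed from Lasso in scikit-learn 1.2.
    model = Lasso(max_iter=100000, normalize=True)
    for a in alphas:
        model.set_params(alpha=a)
        model.fit(poly_x, y)
        lasso_cv_scores = cross_validate(model, poly_x, y, cv=10, scoring=('neg_mean_squared_error', 'r2'), return_train_score=True, return_estimator=True)
        # Keyed by combination only, so each alpha overwrites the previous score.
        lasso_models[str(combo_list)] = np.mean(lasso_cv_scores['test_neg_mean_squared_error'])

print("outcomes from the Best Lasso Regression Model:")
# Scores are negative MSE, so the largest value is the smallest MSE.
min_mse = abs(max(lasso_models.values()))
print("minimum Avg Test MSE:", min_mse.round(2))
for possibles, i in lasso_models.items():
    if i == -min_mse:
        print("The Combination of Variables:", possibles)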
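For the alpha part, one direction I am considering is to let `LassoCV` search the alpha grid for each combination and to store its `alpha_` next to the score. This is only a sketch under assumptions: it swaps my manual alpha loop for `LassoCV`, uses `StandardScaler` in a pipeline in place of `normalize=True`, and runs on synthetic data with three hypothetical columns (`u`, `v`, `w`), degree 2, 5 folds and a smaller alpha grid so that it finishes quickly. The scaler here is fitted on all rows before the internal cross-validation, so the scores may be slightly optimistic.

```python
from itertools import combinations

import numpy as np
from sklearn.linear_model import LassoCV
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures, StandardScaler

# Synthetic stand-in for my data (assumption): y depends on u and v**2 only.
rng = np.random.default_rng(0)
cols = ["u", "v", "w"]  # hypothetical column names
X = rng.normal(size=(120, 3))
y = 2.0 * X[:, 0] - X[:, 1] ** 2 + rng.normal(scale=0.1, size=120)

alphas = 10 ** np.linspace(1, -3, 20)  # smaller grid than in my code, for speed
results = []
for n in range(1, len(cols) + 1):
    for combo in combinations(cols, n):
        idx = [cols.index(c) for c in combo]  # with a DataFrame: data[list(combo)]
        pipe = make_pipeline(
            PolynomialFeatures(2, include_bias=False),
            StandardScaler(),
            LassoCV(alphas=alphas, cv=5, max_iter=100000),
        )
        pipe.fit(X[:, idx], y)
        lasso = pipe[-1]
        # mse_path_ has shape (n_alphas, n_folds); average over the folds.
        mean_mse = lasso.mse_path_.mean(axis=1)
        # Store the score together with the alpha and the combination.
        results.append((mean_mse.min(), lasso.alpha_, combo))

best_mse, best_alpha, best_combo = min(results, key=lambda r: r[0])
print("minimum Avg Test MSE:", round(best_mse, 2))
print("alpha:", best_alpha)
print("The Combination of Variables:", list(best_combo))
```

Keeping `(score, alpha, combination)` tuples instead of a dict keyed by the combination string means the alpha that produced the best score is available at the end.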