I fit a ridge regression model on a dataset (dataset link: https://www.kaggle.com/c/house-prices-advanced-regression-techniques/data) as follows:
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split
y = train['SalePrice']
X = train.drop("SalePrice", axis = 1)
X_train, X_test, y_train, y_test = train_test_split(X,y,test_size=0.30)
ridge = Ridge(alpha=0.1, normalize=True)
ridge.fit(X_train,y_train)
pred = ridge.predict(X_test)
I computed the MSE using the metrics module from sklearn:
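import numpy as np  # needed for np.sqrt
from sklearn.metrics import mean_squared_error
mse = mean_squared_error(y_test, pred)  # mean squared error on the held-out split
rmse = np.sqrt(mse)  # square root of the MSE, in the same units as SalePrice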
I got a very large MSE = 554084039.54321 and RMSE = 21821.8, and I am trying to understand whether my implementation is correct.
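To check the metric code separately from the model, here is a minimal sketch on a tiny example that can be verified by hand. The four values are made up for illustration and are not taken from the Kaggle data; the names `y_true`, `y_hat`, `manual_mse` and `relative_rmse` are my own. It only shows that MSE is in squared target units while RMSE is in the units of the target, so an RMSE in the tens of thousands may not be unreasonable when the target itself is in the hundreds of thousands.

```python
import numpy as np
from sklearn.metrics import mean_squared_error

# Made-up values for illustration only; not taken from the Kaggle data.
y_true = np.array([200000.0, 150000.0, 300000.0, 250000.0])
y_hat = np.array([210000.0, 140000.0, 280000.0, 260000.0])

# MSE as computed by sklearn (units: target squared).
mse = mean_squared_error(y_true, y_hat)

# The same quantity computed by hand, as a cross-check.
manual_mse = np.mean((y_true - y_hat) ** 2)

# RMSE is back in the units of the target.
rmse = np.sqrt(mse)

# RMSE as a fraction of the average target value, to judge its scale.
relative_rmse = rmse / y_true.mean()

print(mse, manual_mse, rmse, relative_rmse)
```

If the hand-computed value and the sklearn value agree here, the metric calls themselves are probably fine, and the size of the MSE mostly reflects the scale of the target.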