neural-network - How to deal with overfitting of simple (X,Y) data in MLPRegressor

Question

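Working with a small amount of data, and using folds to handle overfitting [GridSearchCV]

I'm completely lost as to how to get better estimates from my model. When I run my code, I seem to get a negative accuracy. How can I improve the cross_val_score, or test score, or whatever you want to call it, so that I can predict values more reliably?

I tried adding more data (from 50 to 200+).

I tried random parameters (and realized this is a naive approach).

I also tried cleaning up my data by applying StandardScaler to the features.

Does anyone have any suggestions?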

from sklearn.neural_network import MLPRegressor
from sklearn import preprocessing
import requests
import json
from calendar import monthrange
import numpy as np
from sklearn.model_selection import cross_val_score, GridSearchCV
from sklearn.preprocessing import scale


r =requests.get('https://www.alphavantage.co/query?function=TIME_SERIES_WEEKLY_ADJUSTED&symbol=W&apikey=QYQ2D6URDOKNUGF4')

#print(r.text)

y = json.loads(r.text)
#print(y["Monthly Adjusted Time Series"].keys())

keysInResultSet = y["Weekly Adjusted Time Series"].keys()
#print(keysInResultSet)

featuresListTemp = []
labelsListTemp = []

# Use the running week index (1, 2, 3, ...) as the only feature
# and the adjusted close as the regression target.
for count, i in enumerate(keysInResultSet, start=1):
    featuresListTemp.append([count])
    strValue = y["Weekly Adjusted Time Series"][i]["5. adjusted close"]
    labelsListTemp.append(float(strValue))


print("TOTAL SET")
print(featuresListTemp)
print(labelsListTemp)
print("---")

arrTestInput = []
arrTestOutput = []


print("SCALING SET")
X_train = np.array(featuresListTemp)
scaler = preprocessing.StandardScaler().fit(X_train)

X_train_scaled = scaler.transform(X_train)
print(X_train_scaled)


product_model = MLPRegressor()
#10.0 ** -np.arange(1, 10)

#todo : once found general settings, iterate through some more seeds to find one that can be used on the training

parameters = {'learning_rate': ['constant','adaptive'],'solver': ['lbfgs','adam'], 'tol' : 10.0 ** -np.arange(1, 4), 'verbose' : [True], 'early_stopping': [True], 'activation' : ['tanh','logistic'], 'learning_rate_init': 10.0 ** -np.arange(1, 4), 'max_iter': [4000], 'alpha': 10.0 ** -np.arange(1, 4), 'hidden_layer_sizes':np.arange(1,11), 'random_state':np.arange(1, 3)}
clf = GridSearchCV(product_model, parameters, n_jobs=-1)
clf.fit(X_train_scaled, labelsListTemp)
print(clf.score(X_train_scaled, labelsListTemp))
print(clf.best_params_)

best_params = clf.best_params_


newPM = MLPRegressor(hidden_layer_sizes=((best_params['hidden_layer_sizes'])), #try reducing the layer size / increasing it and playing around with resultFit variable
                                     batch_size='auto',
                                     power_t=0.5,
                                     activation=best_params['activation'],
                                     solver=best_params['solver'], #non scaled input
                                     learning_rate=best_params['learning_rate'],
                                     max_iter=best_params['max_iter'],
                                     learning_rate_init=best_params['learning_rate_init'],
                                     alpha=best_params['alpha'],
                                     random_state=best_params['random_state'],
                                     early_stopping=best_params['early_stopping'],
                                     tol=best_params['tol'])

scores = cross_val_score(newPM, X_train_scaled, labelsListTemp, cv=10, scoring='neg_mean_absolute_error')
print("Accuracy: %0.2f (+/- %0.2f)" % (scores.mean(), scores.std() * 2))

print(scores)

Output from line 63 onwards

0.9142644531564619
{'activation': 'logistic', 'alpha': 0.001, 'early_stopping': True, 'hidden_layer_sizes': 7, 'learning_rate': 'constant', 'learning_rate_init': 0.1, 'max_iter': 4000, 'random_state': 2, 'solver': 'lbfgs', 'tol': 0.01, 'verbose': True}

Accuracy: -21.91 (+/- 58.89)
[ -32.87854574 -105.0632913 -22.89836453 -7.33154414 -22.38773819 -3.3786339 -1.7658796 -3.78002866 -4.78727388 -14.8]

score 0 · Accepted Answer

{'activation': 'logistic', 'alpha': 0.01, 'early_stopping': True, 'hidden_layer_sizes': 30, 'learning_rate': 'constant', 'learning_rate_init': 0.1, 'max_iter': 4000, 'random_state': 2, 'solver': 'lbfgs', 'tol': 0.1, 'verbose': True}

{'activation': 'tanh', 'alpha': 0.01, 'early_stopping': True, 'hidden_layer_sizes': 99, 'learning_rate': 'constant', 'learning_rate_init': 0.1, 'max_iter': 4000, 'random_state': 1, 'solver': 'lbfgs', 'tol': 0.01, 'verbose': True}

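Both of the configurations above work for the sample set. Thanks everyone, and let me know if there are any questions. This can be solved by narrowing all of the other parameters to a more limited set, i.e. instead of 10.0 ** -np.arange(1, 3) use 10.0 ** -np.arange(1, 2).

Start removing the parameters you already know to be correct. This is hard to do, but one candidate is learning_rate='constant', because I noticed that all of my best fits ended up with a constant learning rate regardless of any other parameter.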
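To get a feel for how much narrowing helps, the number of candidate settings can be counted before running anything. The sketch below uses the grid from the question as the starting point; the narrowed grid is only one possible choice (an assumption for illustration), fixing learning_rate, tol and learning_rate_init while leaving alpha free.

```python
import numpy as np
from sklearn.model_selection import ParameterGrid

# The grid from the question.
full_grid = {
    'learning_rate': ['constant', 'adaptive'],
    'solver': ['lbfgs', 'adam'],
    'tol': 10.0 ** -np.arange(1, 4),
    'verbose': [True],
    'early_stopping': [True],
    'activation': ['tanh', 'logistic'],
    'learning_rate_init': 10.0 ** -np.arange(1, 4),
    'max_iter': [4000],
    'alpha': 10.0 ** -np.arange(1, 4),
    'hidden_layer_sizes': np.arange(1, 11),
    'random_state': np.arange(1, 3),
}

# A narrowed variant (illustrative choice, not the only sensible one):
# pin the parameters that kept coming out the same, keep alpha searchable.
narrow_grid = dict(
    full_grid,
    learning_rate=['constant'],
    tol=10.0 ** -np.arange(1, 2),
    learning_rate_init=10.0 ** -np.arange(1, 2),
)

n_full = len(ParameterGrid(full_grid))
n_narrow = len(ParameterGrid(narrow_grid))
print(n_full, n_narrow)  # 4320 240
```

Each candidate is fitted once per cross-validation fold, so the reduction in candidates carries over directly to the total number of fits.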

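This is mainly a time optimization, but it also helps against overfitting as the number of nodes in the network grows. The idea is that once you have run the first grid search, you want to improve the fit by some degree N without losing too much of the generalization to the true function.

You should start the grid search with the number of hidden nodes somewhere between the number of input nodes and the number of output nodes.

Once you have found a suitable fit, you can improve it by increasing the number of nodes. You have to be careful not to add too many nodes, or you lose the generalization to the true function. Before you think about scaling up, you have to start reducing the complexity of the parameters, so that your second grid search runs with more general parameters over a larger number of nodes.

The above describes generalizing the parameters: the second grid search takes the more general parameters from the initial search while increasing the number of network nodes.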
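The two-pass procedure described above could be sketched roughly as follows. This is only an illustration under assumptions: the data is a synthetic stand-in for the (index, price) series, and the grids, the cv=3 setting and the choice to freeze only the activation are hypothetical, not the exact settings used for the results above.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.neural_network import MLPRegressor
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the weekly series (assumption for illustration).
rng = np.random.RandomState(0)
X = np.arange(1, 121, dtype=float).reshape(-1, 1)
y = 50 + 10 * np.sin(X.ravel() / 15.0) + rng.normal(scale=0.5, size=X.shape[0])
X_scaled = StandardScaler().fit_transform(X)

# Pass 1: very small networks, broad ranges for the other parameters.
stage1_grid = {
    'hidden_layer_sizes': [(1,), (2,), (3,)],
    'activation': ['tanh', 'logistic'],
    'alpha': 10.0 ** -np.arange(1, 4),
}
stage1 = GridSearchCV(
    MLPRegressor(solver='lbfgs', max_iter=2000, random_state=1),
    stage1_grid, cv=3, scoring='neg_mean_absolute_error',
)
stage1.fit(X_scaled, y)

# Pass 2: freeze what pass 1 settled on, search more nodes, keep alpha free.
stage2_grid = {
    'hidden_layer_sizes': [(5,), (10,), (30,)],
    'alpha': 10.0 ** -np.arange(1, 4),
}
stage2 = GridSearchCV(
    MLPRegressor(solver='lbfgs', max_iter=2000, random_state=1,
                 activation=stage1.best_params_['activation']),
    stage2_grid, cv=3, scoring='neg_mean_absolute_error',
)
stage2.fit(X_scaled, y)

print(stage1.best_params_, stage1.best_score_)
print(stage2.best_params_, stage2.best_score_)
```

Comparing best_score_ between the two passes gives a rough indication of whether the extra nodes actually help on held-out folds or only on the training data.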

I know this is confusing, but it is what helped me get a decent fit.

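For anyone struggling, I would try:
0) generalizing after you have run a search and obtained a decent model
1) using that generalization in a second search with more nodes
2) using the alpha parameter when scaling up (the rest of the parameters you can generalize)
3) adding some different seeds, or removing them, depending on the situation
4) while changing tol does change the fit, it also depends heavily on the number of iterations. For that reason a reasonable value might be .01 or .001 depending on the case (what counts as reasonable depends on how many iterations you are willing to wait for a given result or chance to converge). If tol is set too low, you will run out of iterations, because training never gets the chance to stop early.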
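The interaction between tol and the iteration budget in point 4 can be observed directly through the fitted model's n_iter_ attribute. The sketch below is a minimal illustration on synthetic data, using the adam solver with an assumed budget of 500 epochs; the helper name epochs_used and all values are hypothetical.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

# Synthetic one-feature regression problem (assumption for illustration).
rng = np.random.RandomState(0)
X = np.linspace(-2, 2, 200).reshape(-1, 1)
y = np.sin(2 * X.ravel()) + rng.normal(scale=0.1, size=200)

def epochs_used(tol, max_iter=500):
    """Fit a small network and return how many epochs ran before stopping."""
    model = MLPRegressor(hidden_layer_sizes=(10,), solver='adam',
                         tol=tol, max_iter=max_iter, random_state=1)
    model.fit(X, y)
    return model.n_iter_

# A looser tol lets training stop sooner; a very tight tol tends to
# use up the whole max_iter budget (with a ConvergenceWarning).
for tol in (1e-1, 1e-2, 1e-3, 1e-6):
    print(tol, epochs_used(tol))
```

With the seed fixed, the runs follow the same loss curve, so a looser tol can never stop later than a tighter one; the printed counts show how quickly the budget is consumed as tol shrinks.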
