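I am training an LSTM to forecast a time series. I tried an encoder-decoder architecture without any dropout. I split my data into 70% training and 30% validation; the training and validation sets contain roughly 107 and 47 points, respectively. However, the training loss is always greater than the validation loss. The code is below.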
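# Imports (assumed: TensorFlow 2.x Keras, NumPy, Matplotlib, tqdm)
import numpy
import tensorflow
import matplotlib.pyplot as plt
from numpy.random import seed
from tensorflow.keras.callbacks import EarlyStopping
from tensorflow.keras.layers import LSTM, Bidirectional, Dense, RepeatVector, TimeDistributed
from tensorflow.keras.models import Sequential
from tensorflow.keras.optimizers import SGD, Adam
from tqdm.keras import TqdmCallback

# Fix the random seeds for reproducibility.
seed(12346)
tensorflow.random.set_seed(12346)

# Optimizer settings. `decay` is a legacy argument that newer Keras releases may reject.
Lrn_Rate = 0.0005
Momentum = 0.8
sgd = SGD(learning_rate=Lrn_Rate, decay=1e-6, momentum=Momentum, nesterov=True)
adam = Adam(learning_rate=Lrn_Rate, beta_1=0.9, beta_2=0.999, amsgrad=False)
optimizernme = sgd
optimizernmestr = 'sgd'
# Early stopping watches the training loss here, not val_loss.
callbacks = EarlyStopping(monitor='loss', patience=50, restore_best_weights=True)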
train_X1 = numpy.reshape(train_X1, (train_X1.shape[0], train_X1.shape[1], 1))
test_X1 = numpy.reshape(test_X1, (test_X1.shape[0], test_X1.shape[1], 1))
train_Y1 = train_Y1.reshape((train_Y1.shape[0], train_Y1.shape[1], 1))
test_Y1= test_Y1.reshape((test_Y1.shape[0], test_Y1.shape[1], 1))
model = Sequential()
Hiddenunits=240
DenseUnits=100
n_features=1
n_timesteps= look_back
# input_shape is passed to the Bidirectional wrapper rather than the inner LSTM.
model.add(Bidirectional(LSTM(Hiddenunits, activation='relu', return_sequences=True),
                        input_shape=(n_timesteps, n_features)))  # 90,120 worked for us uk
model.add(Bidirectional(LSTM( Hiddenunits, activation='relu',return_sequences=False)))
model.add(RepeatVector(1))
model.add(Bidirectional(LSTM( Hiddenunits, activation='relu',return_sequences=True)))
model.add(Bidirectional(LSTM(Hiddenunits, activation='relu', return_sequences=True)))
model.add(TimeDistributed(Dense(DenseUnits, activation='relu')))
model.add(TimeDistributed(Dense(1)))
model.compile(loss='mean_squared_error', optimizer=optimizernme)
history = model.fit(train_X1, train_Y1, validation_data=(test_X1, test_Y1), batch_size=batchsize, epochs=250,
                    callbacks=[callbacks, TqdmCallback(verbose=0)], shuffle=True, verbose=0)
plt.plot(history.history['loss'])
plt.plot(history.history['val_loss'])
plt.title('model loss'+ modelcaption)
plt.ylabel('loss')
plt.xlabel('epoch')
plt.legend(['train', 'test'], loc='upper left')
plt.show()
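For context on the 70/30 split mentioned at the top, here is a minimal sketch of a chronological split, assuming the samples are already in time order. The helper name `chronological_split` and the total of 154 samples are hypothetical, chosen only so the sizes come out near 107 and 47; the real split code is not shown in this post.

```python
import numpy as np

def chronological_split(samples, train_fraction=0.7):
    """Split samples in time order: the first part trains, the rest validates."""
    split_index = int(len(samples) * train_fraction)
    return samples[:split_index], samples[split_index:]

# Hypothetical stand-in for the real windowed samples (154 is an assumed total).
samples = np.arange(154, dtype=float)
train_part, val_part = chronological_split(samples)
print(len(train_part), len(val_part))  # prints: 107 47
```

With a split like this, every validation sample comes after every training sample, so the two sets can cover different stretches of the series.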
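The training loss comes out greater than the validation loss: training loss = 0.02 and the validation loss is approximately 0.004. Please see the attached picture. I have tried many things, including dropout and adding more hidden units, but none of them solved the problem. Any comments or suggestions are appreciated.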
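One factor that may contribute to a gap in this direction: the training loss Keras logs for an epoch is averaged over the batches while the weights are still being updated, whereas the validation loss is computed with the weights as they stand at the end of the epoch. The sketch below illustrates that effect on a toy problem. It is not the model above: the data, the single weight `w`, and all variable names are hypothetical, and it may not explain the whole gap seen here.

```python
import numpy as np

# Toy stand-in for the real model: fit y = w * x with mini-batch SGD.
x = np.linspace(0.1, 1.0, 100)
y = 2.0 * x
w, learning_rate, batch_size = 0.0, 0.1, 10

batch_losses = []
for start in range(0, len(x), batch_size):
    xb, yb = x[start:start + batch_size], y[start:start + batch_size]
    error = w * xb - yb
    batch_losses.append(np.mean(error ** 2))        # loss before this update
    w -= learning_rate * np.mean(2.0 * error * xb)  # gradient step on the MSE

# Average of the per-batch losses, as an epoch log would report it.
running_train_loss = float(np.mean(batch_losses))
# Loss on the same data, recomputed with the final weights.
end_of_epoch_loss = float(np.mean((w * x - y) ** 2))
print(running_train_loss > end_of_epoch_loss)
```

For the model in this post, a comparable check would be to call `model.evaluate(train_X1, train_Y1)` and `model.evaluate(test_X1, test_Y1)` after training, so that both losses are computed with the same final weights.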