I am working on a text classification problem and am trying to use Keras Tuner to find the best configuration for my LSTM network. Here is the code:

Keras Tuner

import tensorflow as tf
import kerastuner

# `tokenizer` and `embedding_matrix` are defined earlier in my notebook (not shown here).

def build_model(hp):
  # Fallback values used when no hyperparameter object is passed in
  num_hidden_layers = 1
  num_units = 8
  dropout_rate = 0.1
  learning_rate = 0.01

  if hp:
    # Search space explored by the tuner
    num_hidden_layers = hp.Int('num_hidden_layers', min_value=2, max_value=100, step=5)
    num_units = hp.Int('num_units', min_value=50, max_value=2000, step=50)
    dropout_rate = hp.Float('dropout_rate', min_value=0.1, max_value=0.5)
    learning_rate = hp.Float('learning_rate', min_value=0.0001, max_value=0.01)
    momentum_rate = hp.Float('momentum_rate', min_value=0.5, max_value=0.9)
    vocab_size = len(tokenizer.word_index) + 1
    max_sequence_length = 500
    embedding_size = 300

  model = tf.keras.models.Sequential()
  # Frozen embedding layer initialised from pre-trained vectors
  model.add(tf.keras.layers.Embedding(input_dim=vocab_size, output_dim=embedding_size, input_length=max_sequence_length, weights=[embedding_matrix], trainable=False))

  # Stack of LSTM layers, each followed by dropout
  for _ in range(num_hidden_layers):
    model.add(tf.keras.layers.LSTM(num_units))
    model.add(tf.keras.layers.Dropout(dropout_rate))

  # Single sigmoid unit for binary classification
  model.add(tf.keras.layers.Dense(1, activation='sigmoid'))

  model.compile(
      loss='mse',
      optimizer=tf.keras.optimizers.SGD(learning_rate=learning_rate, momentum=momentum_rate),
      metrics=[tf.keras.metrics.BinaryCrossentropy(name='binary_crossentropy')]
  )
  return model

class CustomTuner(kerastuner.tuners.BayesianOptimization):
  def run_trial(self, trial, *args, **kwargs):
    kwargs['batch_size'] = trial.hyperparameters.Int('batch_size', 128,1024, step=32)
    super(CustomTuner, self).run_trial(trial,*args,**kwargs)

tuner = CustomTuner(
    build_model,
    objective=kerastuner.Objective('val_loss', 'min'),
    max_trials=2,
    executions_per_trial=1,
    directory='/dbfs/FileStore/GDPR_Dev/Data/',
    project_name='nn_logs_lstm_30062021',
    overwrite=True
)

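The code fails with the following error:

ValueError: Input 0 of layer lstm_1 is incompatible with the layer: expected ndim=3, found ndim=2. Full shape received: (None, 50)

Can anyone help me solve this?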

1 Answer

Currently your data is two-dimensional (N x M), but an LSTM layer expects three-dimensional input. To fix this, reshape your input into an N x M x 1 array, as follows:

x = np.reshape(x, (x.shape[0], x.shape[1], 1))
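
To make the shapes concrete, here is a small sketch of that reshape. The array `x` and its sizes (32 samples, 50 time steps) are made up for illustration and are not taken from the question's data:

```python
import numpy as np

# Hypothetical 2D input: N=32 samples, M=50 time steps (illustrative sizes only)
x = np.zeros((32, 50))

# Add a trailing feature axis so the array becomes N x M x 1
x_3d = np.reshape(x, (x.shape[0], x.shape[1], 1))
print(x_3d.shape)  # (32, 50, 1)

# np.expand_dims produces the same shape without spelling out each dimension
x_alt = np.expand_dims(x, axis=-1)
print(x_alt.shape)  # (32, 50, 1)
```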

If your input is multivariate, the required shape is N x M x K, where K is the number of features per time step.
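
One more thing that may be worth checking in the question's model: the error is raised by lstm_1, which is the second LSTM in the stack, and the reported shape (None, 50) is consistent with the smallest `num_units` value in the search space. By default, `tf.keras.layers.LSTM` uses `return_sequences=False` and returns a 2D tensor of shape (batch, units), which a following LSTM layer cannot accept. The sketch below shows one way to stack LSTM layers so that every layer except the last passes on the full sequence. The function name `build_stacked_lstm` and all sizes are illustrative assumptions, not the question's values:

```python
import tensorflow as tf

def build_stacked_lstm(num_hidden_layers=2, num_units=8, vocab_size=100,
                       embedding_size=16, dropout_rate=0.1):
    """Illustrative stacked-LSTM classifier (hypothetical helper, small sizes)."""
    model = tf.keras.models.Sequential()
    # Embedding maps (batch, timesteps) integer ids to (batch, timesteps, embedding_size)
    model.add(tf.keras.layers.Embedding(input_dim=vocab_size, output_dim=embedding_size))
    for i in range(num_hidden_layers):
        is_last = i == num_hidden_layers - 1
        # Intermediate LSTMs return the whole sequence (3D) so the next LSTM
        # can consume it; only the last one returns its final state (2D).
        model.add(tf.keras.layers.LSTM(num_units, return_sequences=not is_last))
        model.add(tf.keras.layers.Dropout(dropout_rate))
    model.add(tf.keras.layers.Dense(1, activation='sigmoid'))
    return model

model = build_stacked_lstm()
# Dummy batch: 4 sequences of 20 token ids each
output = model(tf.zeros((4, 20), dtype=tf.int32))
print(output.shape)  # (4, 1)
```

In the question's `build_model`, the same idea would mean setting `return_sequences=True` on every LSTM inside the loop except the final one.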

answered 2021-07-26T19:40:48.833