
I am following this tutorial: https://blog.keras.io/a-ten-minute-introduction-to-sequence-to-sequence-learning-in-keras.html

Specifically, the part about using integer sequence data instead of one-hot encoded data.

  • From the error message below, I can see this is a dimension conflict. I followed the original tutorial, and when the encoder inputs, decoder inputs, and decoder outputs all had the same shape (because the data was preprocessed into one-hot vectors), no such problem occurred. This leads me to believe the dimensions of the inputs to my .Model are the culprit. However, I want to apply masking via an Embedding layer, which requires integer sequence data.

Encoder input shape: TensorShape([None, None])

Decoder input shape: TensorShape([None, None])

Decoder output shape: TensorShape([None, None, 99])

Every tutorial I have found seems to suggest this is not a problem (?).
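To make the mismatch concrete, here is a minimal sketch (the 160 and 99 dimensions are copied from the traceback below; the tensors themselves are made up) showing that categorical_crossentropy expects 3-D one-hot targets whose last dimension matches the softmax output, while 2-D integer targets produce exactly this kind of shape error:

import tensorflow as tf

# Assumed sizes, taken from the error message: 160 decoder timesteps, 99 tokens.
batch, timesteps, num_tokens = 2, 160, 99

y_pred = tf.random.uniform((batch, timesteps, num_tokens))   # softmax output, 3-D
y_true_int = tf.zeros((batch, timesteps))                    # integer targets, 2-D
y_true_onehot = tf.zeros((batch, timesteps, num_tokens))     # one-hot targets, 3-D

# 2-D targets cannot be combined element-wise with the 3-D output inside
# categorical_crossentropy, matching the [?,160] vs [?,160,99] error below.
try:
    tf.keras.losses.categorical_crossentropy(y_true_int, y_pred)
except Exception as e:
    print(type(e).__name__)

# With one-hot (3-D) targets the loss evaluates without error.
print(tf.keras.losses.categorical_crossentropy(y_true_onehot, y_pred).shape)  # (2, 160)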

Model summary:

Traceback (most recent call last):
  File "C:\Users\bakes\Anaconda3\envs\math_env\lib\site-packages\tensorflow_core\python\framework\ops.py", line 1619, in _create_c_op
    c_op = c_api.TF_FinishOperation(op_desc)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Dimensions must be equal, but are 160 and 99 for 'loss/dense_loss/mul' (op: 'Mul') with input shapes: [?,160], [?,160,99].

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:/Users/bakes/OneDrive - The Pennsylvania State University/devops/repos/thesis_math_language_processing/main.py", line 18, in <module>
    network.train()
  File "C:\Users\bakes\OneDrive - The Pennsylvania State University\devops\repos\thesis_math_language_processing\architectures\test.py", line 90, in train
    model = keras.models.Model([encoder_inputs, decoder_inputs], decoder_outputs)
  File "C:\Users\bakes\Anaconda3\envs\math_env\lib\site-packages\tensorflow_core\python\keras\engine\training.py", line 819, in fit
    use_multiprocessing=use_multiprocessing)
  File "C:\Users\bakes\Anaconda3\envs\math_env\lib\site-packages\tensorflow_core\python\keras\engine\training_v2.py", line 342, in fit
    total_epochs=epochs)
  File "C:\Users\bakes\Anaconda3\envs\math_env\lib\site-packages\tensorflow_core\python\keras\engine\training_v2.py", line 128, in run_one_epoch
    batch_outs = execution_function(iterator)
  File "C:\Users\bakes\Anaconda3\envs\math_env\lib\site-packages\tensorflow_core\python\keras\engine\training_v2_utils.py", line 98, in execution_function
    distributed_function(input_fn))
  File "C:\Users\bakes\Anaconda3\envs\math_env\lib\site-packages\tensorflow_core\python\eager\def_function.py", line 568, in __call__
    result = self._call(*args, **kwds)
  File "C:\Users\bakes\Anaconda3\envs\math_env\lib\site-packages\tensorflow_core\python\eager\def_function.py", line 615, in _call
    self._initialize(args, kwds, add_initializers_to=initializers)
  File "C:\Users\bakes\Anaconda3\envs\math_env\lib\site-packages\tensorflow_core\python\eager\def_function.py", line 497, in _initialize
    *args, **kwds))
  File "C:\Users\bakes\Anaconda3\envs\math_env\lib\site-packages\tensorflow_core\python\eager\function.py", line 2389, in _get_concrete_function_internal_garbage_collected
    graph_function, _, _ = self._maybe_define_function(args, kwargs)
  File "C:\Users\bakes\Anaconda3\envs\math_env\lib\site-packages\tensorflow_core\python\eager\function.py", line 2703, in _maybe_define_function
    graph_function = self._create_graph_function(args, kwargs)
  File "C:\Users\bakes\Anaconda3\envs\math_env\lib\site-packages\tensorflow_core\python\eager\function.py", line 2593, in _create_graph_function
    capture_by_value=self._capture_by_value),
  File "C:\Users\bakes\Anaconda3\envs\math_env\lib\site-packages\tensorflow_core\python\framework\func_graph.py", line 978, in func_graph_from_py_func
    func_outputs = python_func(*func_args, **func_kwargs)
  File "C:\Users\bakes\Anaconda3\envs\math_env\lib\site-packages\tensorflow_core\python\eager\def_function.py", line 439, in wrapped_fn
    return weak_wrapped_fn().__wrapped__(*args, **kwds)
  File "C:\Users\bakes\Anaconda3\envs\math_env\lib\site-packages\tensorflow_core\python\keras\engine\training_v2_utils.py", line 85, in distributed_function
    per_replica_function, args=args)
  File "C:\Users\bakes\Anaconda3\envs\math_env\lib\site-packages\tensorflow_core\python\distribute\distribute_lib.py", line 763, in experimental_run_v2
    return self._extended.call_for_each_replica(fn, args=args, kwargs=kwargs)
  File "C:\Users\bakes\Anaconda3\envs\math_env\lib\site-packages\tensorflow_core\python\distribute\distribute_lib.py", line 1819, in call_for_each_replica
    return self._call_for_each_replica(fn, args, kwargs)
  File "C:\Users\bakes\Anaconda3\envs\math_env\lib\site-packages\tensorflow_core\python\distribute\distribute_lib.py", line 2164, in _call_for_each_replica
    return fn(*args, **kwargs)
  File "C:\Users\bakes\Anaconda3\envs\math_env\lib\site-packages\tensorflow_core\python\autograph\impl\api.py", line 292, in wrapper
    return func(*args, **kwargs)
  File "C:\Users\bakes\Anaconda3\envs\math_env\lib\site-packages\tensorflow_core\python\keras\engine\training_v2_utils.py", line 433, in train_on_batch
    output_loss_metrics=model._output_loss_metrics)
  File "C:\Users\bakes\Anaconda3\envs\math_env\lib\site-packages\tensorflow_core\python\keras\engine\training_eager.py", line 312, in train_on_batch
    output_loss_metrics=output_loss_metrics))
  File "C:\Users\bakes\Anaconda3\envs\math_env\lib\site-packages\tensorflow_core\python\keras\engine\training_eager.py", line 253, in _process_single_batch
    training=training))
  File "C:\Users\bakes\Anaconda3\envs\math_env\lib\site-packages\tensorflow_core\python\keras\engine\training_eager.py", line 167, in _model_loss
    per_sample_losses = loss_fn.call(targets[i], outs[i])
  File "C:\Users\bakes\Anaconda3\envs\math_env\lib\site-packages\tensorflow_core\python\keras\losses.py", line 221, in call
    return self.fn(y_true, y_pred, **self._fn_kwargs)
  File "C:\Users\bakes\Anaconda3\envs\math_env\lib\site-packages\tensorflow_core\python\keras\losses.py", line 971, in categorical_crossentropy
    return K.categorical_crossentropy(y_true, y_pred, from_logits=from_logits)
  File "C:\Users\bakes\Anaconda3\envs\math_env\lib\site-packages\tensorflow_core\python\keras\backend.py", line 4495, in categorical_crossentropy
    return -math_ops.reduce_sum(target * math_ops.log(output), axis)
  File "C:\Users\bakes\Anaconda3\envs\math_env\lib\site-packages\tensorflow_core\python\ops\math_ops.py", line 902, in binary_op_wrapper
    return func(x, y, name=name)
  File "C:\Users\bakes\Anaconda3\envs\math_env\lib\site-packages\tensorflow_core\python\ops\math_ops.py", line 1201, in _mul_dispatch
    return gen_math_ops.mul(x, y, name=name)
  File "C:\Users\bakes\Anaconda3\envs\math_env\lib\site-packages\tensorflow_core\python\ops\gen_math_ops.py", line 6125, in mul
    "Mul", x=x, y=y, name=name)
  File "C:\Users\bakes\Anaconda3\envs\math_env\lib\site-packages\tensorflow_core\python\framework\op_def_library.py", line 742, in _apply_op_helper
    attrs=attr_protos, op_def=op_def)
  File "C:\Users\bakes\Anaconda3\envs\math_env\lib\site-packages\tensorflow_core\python\framework\func_graph.py", line 595, in _create_op_internal
    compute_device)
  File "C:\Users\bakes\Anaconda3\envs\math_env\lib\site-packages\tensorflow_core\python\framework\ops.py", line 3322, in _create_op_internal
    op_def=op_def)
  File "C:\Users\bakes\Anaconda3\envs\math_env\lib\site-packages\tensorflow_core\python\framework\ops.py", line 1786, in __init__
    control_input_ops)
  File "C:\Users\bakes\Anaconda3\envs\math_env\lib\site-packages\tensorflow_core\python\framework\ops.py", line 1622, in _create_c_op
    raise ValueError(str(e))
ValueError: Dimensions must be equal, but are 160 and 99 for 'loss/dense_loss/mul' (op: 'Mul') with input shapes: [?,160], [?,160,99].
    def train(self):
        tensorboard_callback = keras.callbacks.TensorBoard(log_dir=definitions.LOGDIR)

        processor = preprocessing.processor()
        train_x, train_y, test_x, test_y = processor.get_data(n_data=self.n_train)
        encoder_input_data, decoder_input_data, decoder_target_data = processor.preprocess_sequence([train_x, train_y])

        latent_dim = p.hidden_size
        num_decoder_tokens = p.vocab_size + 1
        num_encoder_tokens = p.vocab_size + 1

        # Embedding
        encoder_inputs = keras.layers.Input(shape=(None, ))
        x = keras.layers.Embedding(num_encoder_tokens, latent_dim, mask_zero=True)(encoder_inputs)
        x, state_h, state_c = keras.layers.LSTM(latent_dim, return_state=True)(x)
        # We discard `encoder_outputs` and only keep the states.
        encoder_states = [state_h, state_c]

        # Set up the decoder, using `encoder_states` as initial state.
        decoder_inputs = keras.layers.Input(shape=(None,))
        x = keras.layers.Embedding(num_encoder_tokens, latent_dim, mask_zero=True)(decoder_inputs)
        x, _, _ = keras.layers.LSTM(latent_dim, return_sequences=True, return_state=True)(x, initial_state=encoder_states)
        pdb.set_trace()
        decoder_outputs = keras.layers.Dense(num_decoder_tokens, activation='softmax')(x)

        # Define the model that will turn
        # `encoder_input_data` & `decoder_input_data` into `decoder_target_data`
        pdb.set_trace()
        model = keras.models.Model([encoder_inputs, decoder_inputs], decoder_outputs)

        # Run training
        model.compile(optimizer='adam', loss='categorical_crossentropy',
                      metrics=['accuracy'])
        model.summary()

        history = model.fit([encoder_input_data, decoder_input_data],
                            decoder_target_data,
                            batch_size=64,
                            epochs=self.n_epochs,
                            callbacks=[tensorboard_callback],
                            validation_split=0.2)


1 Answer


Since you already mentioned the answer in the comments, I am posting the code that one-hot encodes the decoder target data here for the benefit of the community:

import numpy as np

decoder_target_data = np.zeros(
    (len(input_texts), max_decoder_seq_length, num_decoder_tokens),
    dtype='float32')

for i, (input_text, target_text) in enumerate(zip(input_texts, target_texts)):
    for t, char in enumerate(input_text):
        encoder_input_data[i, t, input_token_index[char]] = 1.
    encoder_input_data[i, t + 1:, input_token_index[' ']] = 1.
    for t, char in enumerate(target_text):
        # decoder_target_data is ahead of decoder_input_data by one timestep
        decoder_input_data[i, t, target_token_index[char]] = 1.
        if t > 0:
            # decoder_target_data will be ahead by one timestep
            # and will not include the start character.
            decoder_target_data[i, t - 1, target_token_index[char]] = 1.
    decoder_input_data[i, t + 1:, target_token_index[' ']] = 1.
    decoder_target_data[i, t:, target_token_index[' ']] = 1.
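Not part of the original answer, but as a more compact sketch of the same idea: if the preprocessing already produces 2-D integer target sequences (the hypothetical decoder_target_seq below), keras.utils.to_categorical builds the 3-D one-hot target array directly:

import numpy as np
from tensorflow import keras

# Hypothetical names for illustration: decoder_target_seq is the 2-D integer
# array of shape (num_samples, max_decoder_seq_length); num_decoder_tokens is
# vocab_size + 1 as in the question.
num_decoder_tokens = 99
decoder_target_seq = np.random.randint(0, num_decoder_tokens, size=(1000, 160))

# Result shape: (num_samples, max_decoder_seq_length, num_decoder_tokens),
# matching the [?, 160, 99] softmax output of the decoder.
decoder_target_data = keras.utils.to_categorical(
    decoder_target_seq, num_classes=num_decoder_tokens)
print(decoder_target_data.shape)  # (1000, 160, 99)

Alternatively, keeping the integer targets and compiling with loss='sparse_categorical_crossentropy' avoids building the large one-hot array altogether.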
answered 2020-04-21T12:32:38.747