
When I call my tf2 model, it does not return the values it should according to how I defined the call() method in my tf Model subclass.

Instead, calling the model returns the tensor I defined in the build() method.

Why is this happening, and how do I fix it?

import numpy as np
import tensorflow as tf

num_items = 1000
emb_dim = 32
lstm_dim = 32

class rnn_model(tf.keras.Model):
    def __init__(self, num_items, emb_dim): 
        super(rnn_model, self).__init__()
        
        self.emb   = tf.keras.layers.Embedding(num_items, emb_dim, name='embedding_layer')
        self.GRU   = tf.keras.layers.LSTM(lstm_dim, name='rnn_layer')
        self.dense = tf.keras.layers.Dense(num_items, activation = 'softmax', name='final_layer')
        
    def call(self, inp, is_training=True):  
        emb = self.emb(inp)
        gru = self.GRU(emb)
        # logits=self.dense(gru)
        
        return gru # (bs, lstm_dim=32)
    
    def build(self, inp_shape):
      x = tf.keras.Input(shape=inp_shape, name='input_layer')
      # return tf.keras.Model(inputs=[x], outputs=self.call(x))
      return tf.keras.Model(inputs=[x], outputs=self.dense(self.call(x)))

maxlen = 10
model = rnn_model(num_items, emb_dim).build((maxlen, ))
model.summary()

inp = np.random.randint(0, num_items, size=(16, maxlen))  # example input batch, bs=16
gru_out = model(inp)
print(gru_out.shape) # should have been (bs=16, lstm_dim=32)

Here is the output I get -

Model: "functional_11"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
input_layer (InputLayer)     [(None, 10)]              0         
_________________________________________________________________
embedding_layer (Embedding)  (None, 10, 32)            32000     
_________________________________________________________________
rnn_layer (LSTM)             (None, 32)                8320      
_________________________________________________________________
final_layer (Dense)          (None, 1000)              33000     
=================================================================
Total params: 73,320
Trainable params: 73,320
Non-trainable params: 0
_________________________________________________________________
(16, 1000)

I intend to use the 'final_layer' (the dense layer) at the end of the model only to feed into a sampled softmax function, together with gru_out, to compute the loss (so that the model can be trained).
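For clarity, here is a minimal sketch of what I mean at training time. It assumes gru_out has the intended shape (bs, lstm_dim), uses tf.nn.sampled_softmax_loss (which expects class weights of shape (num_items, lstm_dim), hence the transpose of the dense kernel), and the labels and num_sampled=64 below are just placeholders for illustration:

# Training-time sketch (assumes gru_out is (bs, lstm_dim), not the dense output)
final_layer = model.get_layer('final_layer')
labels = np.random.randint(0, num_items, size=(16, 1))  # placeholder target item ids, shape (bs, 1)

loss = tf.nn.sampled_softmax_loss(
    weights=tf.transpose(final_layer.kernel),  # (num_items, lstm_dim)
    biases=final_layer.bias,                   # (num_items,)
    labels=labels,                             # (bs, 1)
    inputs=gru_out,                            # (bs, lstm_dim)
    num_sampled=64,                            # placeholder number of negative samples
    num_classes=num_items)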

At test time, I intend to pass gru_out manually through model.get_layer('final_layer') to get the final logits.
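And a sketch of the test-time step I have in mind (again assuming gru_out is (bs, lstm_dim); note that 'final_layer' was defined with activation='softmax', so this returns probabilities rather than raw logits):

# Test-time sketch: reuse the trained dense layer manually
scores = model.get_layer('final_layer')(gru_out)  # (bs, num_items), softmax outputs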

