
I am trying to fine-tune a pre-trained RoBERTa model from Hugging Face, but I keep running into errors; the current one happens while creating the model. Minimal reproducible example:

import tensorflow as tf
from transformers import AutoTokenizer, TFRobertaForSequenceClassification

bert_layer = TFRobertaForSequenceClassification.from_pretrained("keshan/SinhalaBERTo")

def create_model():
    input_word_ids = tf.keras.layers.Input(shape=(max_seq_length,), dtype=tf.int32,
                                           name="input_word_ids")
    input_mask = tf.keras.layers.Input(shape=(max_seq_length,), dtype=tf.int32,
                                       name="input_mask")
    input_type_ids = tf.keras.layers.Input(shape=(max_seq_length,), dtype=tf.int32,
                                           name="input_type_ids")
    
    pooled_output, sequence_output = bert_layer([input_word_ids, input_mask, input_type_ids])
    drop = tf.keras.layers.Dropout(0.4)(pooled_output)
    output = tf.keras.layers.Dense(1, activation='sigmoid', name="output")(drop)
    
    model = tf.keras.Model(
        inputs={
            'input_word_ids': input_word_ids,
            'input_mask': input_mask,
            'input_type_ids': input_type_ids
        },
        outputs=output)
    return model



label_list = [0,1]
max_seq_length = 128
train_batch_size= 32

model = create_model()

I get:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Input In [32], in <module>
----> 1 model = create_model()
      2 model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=2e-5),
      3              loss=tf.keras.losses.BinaryCrossentropy(),
      4              metrics=[tf.keras.activations.BinaryAccuracy()])
      5 model.summary()

Input In [30], in create_model()
      4 input_mask = tf.keras.layers.Input(shape=(max_seq_length,), dtype=tf.int32,
      5                                name="input_mask")
      6 input_type_ids = tf.keras.layers.Input(shape=(max_seq_length,), dtype=tf.int32,
      7                                 name="input_type_ids")
----> 9 pooled_output, sequence_output = bert_layer([input_word_ids, input_mask, input_type_ids])
     11 drop = tf.keras.layers.Dropout(0.4)(pooled_output)
     12 output = tf.keras.layers.Dense(1, activation='sigmoid', name="output")(drop)

ValueError: not enough values to unpack (expected 2, got 1)
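Judging by the "got 1" in the error, the layer seems to return a single value rather than the (pooled_output, sequence_output) pair I expected. A quick eager check on dummy ids (assuming a recent transformers version) appears to confirm this:

import numpy as np

# Eager call on dummy ids: the sequence-classification model returns a
# single output object whose only tensor is `logits`, not the
# (pooled_output, sequence_output) pair the unpacking assumes.
dummy_ids = np.zeros((1, max_seq_length), dtype=np.int32)
print(bert_layer(dummy_ids))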

There is also a possibly related warning when the model is loaded:

"All model checkpoint layers were used when initializing TFRobertaForSequenceClassification.

Some layers of TFRobertaForSequenceClassification were not initialized from the model checkpoint at keshan/SinhalaBERTo and are newly initialized: ['classifier']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference."

I just need to create the model correctly, without errors, whether with the current function or a modified one, so that I can train it on my dataset.
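For reference, this is the kind of rewrite I have in mind, based on my reading of the transformers docs: use the bare TFRobertaModel (no classification head), read its last_hidden_state / pooler_output attributes instead of unpacking, and drop the token type ids, since RoBERTa does not use them. This is only a sketch, so the attribute names and call signature are assumptions on my part:

import tensorflow as tf
from transformers import TFRobertaModel

max_seq_length = 128

# Bare encoder instead of TFRobertaForSequenceClassification, so the
# sequence and pooled outputs are directly available.
roberta_layer = TFRobertaModel.from_pretrained("keshan/SinhalaBERTo")

def create_model():
    input_word_ids = tf.keras.layers.Input(shape=(max_seq_length,), dtype=tf.int32,
                                           name="input_word_ids")
    input_mask = tf.keras.layers.Input(shape=(max_seq_length,), dtype=tf.int32,
                                       name="input_mask")

    # RoBERTa does not use token type ids, so only ids and mask go in.
    outputs = roberta_layer(input_ids=input_word_ids, attention_mask=input_mask)
    sequence_output = outputs.last_hidden_state  # (batch, seq_len, hidden); unused, kept for parity with the original unpacking
    pooled_output = outputs.pooler_output        # (batch, hidden)

    drop = tf.keras.layers.Dropout(0.4)(pooled_output)
    output = tf.keras.layers.Dense(1, activation='sigmoid', name="output")(drop)

    return tf.keras.Model(
        inputs={'input_word_ids': input_word_ids, 'input_mask': input_mask},
        outputs=output)

Is something like this the right direction, or can the current TFRobertaForSequenceClassification approach be fixed directly?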
