I am trying to fine-tune a pretrained RoBERTa model from Hugging Face, but I keep running into errors; the current one occurs when creating the model. Minimal reproducible example:
import tensorflow as tf
from transformers import AutoTokenizer, TFRobertaForSequenceClassification

bert_layer = TFRobertaForSequenceClassification.from_pretrained("keshan/SinhalaBERTo")

def create_model():
    input_word_ids = tf.keras.layers.Input(shape=(max_seq_length,), dtype=tf.int32,
                                           name="input_word_ids")
    input_mask = tf.keras.layers.Input(shape=(max_seq_length,), dtype=tf.int32,
                                       name="input_mask")
    input_type_ids = tf.keras.layers.Input(shape=(max_seq_length,), dtype=tf.int32,
                                           name="input_type_ids")
    pooled_output, sequence_output = bert_layer([input_word_ids, input_mask, input_type_ids])
    drop = tf.keras.layers.Dropout(0.4)(pooled_output)
    output = tf.keras.layers.Dense(1, activation='sigmoid', name="output")(drop)
    model = tf.keras.Model(
        inputs={
            'input_word_ids': input_word_ids,
            'input_mask': input_mask,
            'input_type_ids': input_type_ids
        },
        outputs=output)
    return model

label_list = [0, 1]
max_seq_length = 128
train_batch_size = 32

model = create_model()
I get:
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
Input In [32], in <module>
----> 1 model = create_model()
2 model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=2e-5),
3 loss=tf.keras.losses.BinaryCrossentropy(),
4 metrics=[tf.keras.activations.BinaryAccuracy()])
5 model.summary()
Input In [30], in create_model()
4 input_mask = tf.keras.layers.Input(shape=(max_seq_length,), dtype=tf.int32,
5 name="input_mask")
6 input_type_ids = tf.keras.layers.Input(shape=(max_seq_length,), dtype=tf.int32,
7 name="input_type_ids")
----> 9 pooled_output, sequence_output = bert_layer([input_word_ids, input_mask, input_type_ids])
11 drop = tf.keras.layers.Dropout(0.4)(pooled_output)
12 output = tf.keras.layers.Dense(1, activation='sigmoid', name="output")(drop)
ValueError: not enough values to unpack (expected 2, got 1)
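For what it's worth, the unpacking failure itself is reproducible without the model: `TFRobertaForSequenceClassification` appears to return a single output (the classification logits) rather than a `(pooled_output, sequence_output)` pair, so unpacking it into two names fails. A minimal sketch of just that mechanism, with a plain list standing in for the model output:

```python
# Stand-in for the model's return value: a single-element sequence,
# analogous to a sequence-classification head that only yields its logits.
outputs = ["logits"]

try:
    pooled_output, sequence_output = outputs  # expects two values, gets one
except ValueError as e:
    print(e)  # not enough values to unpack (expected 2, got 1)
```

This matches the traceback above, which points at the `bert_layer([...])` unpacking line.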
There is also a possibly related warning when the model is loaded:
"All model checkpoint layers were used when initializing TFRobertaForSequenceClassification.
Some layers of TFRobertaForSequenceClassification were not initialized from the model checkpoint at keshan/SinhalaBERTo and are newly initialized: ['classifier']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference."
I just need the model to be created correctly, without errors, whether with the current function or a modified one, so that I can train it on my dataset.