I downloaded the model microsoft/Multilingual-MiniLM-L12-H384 (https://huggingface.co/microsoft/Multilingual-MiniLM-L12-H384/tree/main) and am trying to use it.
Transformers version: '4.11.3'
I wrote the following code:
import transformers as tr
import wandb

wandb.login()
%env WANDB_LOG_MODEL=true

# device, train_data, val_data and compute_metrics are defined elsewhere (not shown)
# Load the locally downloaded checkpoint with a 2-class classification head
model = tr.BertForSequenceClassification.from_pretrained("/home/pc/minilm_model", num_labels=2)
model.to(device)
print("hello")

training_args = tr.TrainingArguments(
    report_to='wandb',
    output_dir='/home/pc/proj/results2',  # output directory
    num_train_epochs=10,                  # total number of training epochs
    per_device_train_batch_size=16,       # batch size per device during training
    per_device_eval_batch_size=32,        # batch size for evaluation
    learning_rate=2e-5,
    warmup_steps=1000,                    # number of warmup steps for learning rate scheduler
    weight_decay=0.01,                    # strength of weight decay
    logging_dir='./logs',                 # directory for storing logs
    logging_steps=1000,
    evaluation_strategy="epoch",
    save_strategy="no"
)
print("hello")

trainer = tr.Trainer(
    model=model,                      # the instantiated Transformers model to be trained
    args=training_args,               # training arguments, defined above
    train_dataset=train_data,         # training dataset
    eval_dataset=val_data,            # evaluation dataset
    compute_metrics=compute_metrics
)
After executing this, training gets stuck at this point:
***** Running training *****
Num examples = 12981
Num Epochs = 20
Instantaneous batch size per device = 16
Total train batch size (w. parallel, distributed & accumulation) = 32
Gradient Accumulation steps = 1
Total optimization steps = 8120
Automatic Weights & Biases logging enabled, to disable set os.environ["WANDB_DISABLED"] = "true"
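As a sanity check on the numbers in the log above, the arithmetic can be sketched as follows. This is only a reading of the logged values, not a diagnosis. The variable names are illustrative, and the sketch assumes steps per epoch is the ceiling of examples divided by total batch size, with gradient accumulation of 1 as the log reports.

```python
import math

# Values copied from the log and the TrainingArguments above
num_examples = 12981
per_device_batch = 16
total_batch = 32        # "Total train batch size" in the log
num_epochs = 20         # "Num Epochs" in the log
logging_steps = 1000    # from TrainingArguments

# Assumption: total batch = per-device batch * number of devices (accumulation = 1)
num_devices = total_batch // per_device_batch

# Assumption: the last partial batch of each epoch still counts as one step
steps_per_epoch = math.ceil(num_examples / total_batch)
total_steps = steps_per_epoch * num_epochs

# How many epochs pass before the first training-loss line is logged
epochs_until_first_log = logging_steps / steps_per_epoch

print(num_devices, steps_per_epoch, total_steps)
print(round(epochs_until_first_log, 2))
```

Under these assumptions the result matches the logged 8120 optimization steps. With logging_steps=1000, the first training-loss line would only appear after roughly 2.5 epochs' worth of steps, so a quiet console does not by itself show that the run is hung. The log also reports 20 epochs while the snippet sets num_train_epochs=10, so the log presumably comes from a run with a different setting.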
What could be possible solutions?