from transformers import BertTokenizer, BertForMaskedLM
import torch
tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')
model = BertForMaskedLM.from_pretrained('bert-base-uncased')
input_ids = torch.tensor(tokenizer.encode("Hello, my dog is cute", add_special_tokens=True)).unsqueeze(0) # Batch size 1
outputs = model(input_ids, masked_lm_labels=input_ids)
loss, prediction_scores = outputs[:2]
This code is from the Hugging Face Transformers BERT documentation: https://huggingface.co/transformers/model_doc/bert.html#bertformaskedlm

I can't understand the `masked_lm_labels=input_ids` argument passed to `model`. How does it work? Does it mean that part of the text in `input_ids` is automatically masked when it is passed in?
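My current (possibly wrong) understanding is that the MLM loss only scores positions whose label is not `-100` (PyTorch's default `ignore_index` for `CrossEntropyLoss`), so passing `input_ids` directly as the labels would score *every* position rather than just masked ones. A minimal toy sketch of that assumption (hypothetical vocab size of 5, no Transformers involved):

```python
import torch
from torch.nn import CrossEntropyLoss

torch.manual_seed(0)

# Toy MLM-style logits: (batch=1, seq_len=4, vocab_size=5)
logits = torch.randn(1, 4, 5)

# Labels with -100 at positions that should NOT contribute to the loss
labels = torch.tensor([[2, -100, 3, -100]])

loss_fct = CrossEntropyLoss()  # default ignore_index=-100
loss = loss_fct(logits.view(-1, 5), labels.view(-1))

# Manually averaging the loss over only positions 0 and 2 gives the same value,
# confirming the -100 positions are skipped
manual = (loss_fct(logits[0, 0:1], labels[0, 0:1])
          + loss_fct(logits[0, 2:3], labels[0, 2:3])) / 2
print(torch.allclose(loss, manual))
```

If that understanding is right, then `masked_lm_labels=input_ids` would compute the loss over the whole sequence with nothing masked, which is what confuses me.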