I am reviewing the Hugging Face implementation of ALBERT.
However, I cannot find any code or comments related to SOP (sentence-order prediction).
I can find the NSP (next-sentence prediction) implementation in src/transformers/modeling_bert.py:
if masked_lm_labels is not None and next_sentence_label is not None:
    loss_fct = CrossEntropyLoss()
    masked_lm_loss = loss_fct(prediction_scores.view(-1, self.config.vocab_size), masked_lm_labels.view(-1))
    next_sentence_loss = loss_fct(seq_relationship_score.view(-1, 2), next_sentence_label.view(-1))
    total_loss = masked_lm_loss + next_sentence_loss
    outputs = (total_loss,) + outputs
Is the SOP loss inherited from this code, just with SOP-style labels instead of NSP labels? Or am I missing something?
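For context, my understanding is that SOP would reuse the exact same binary cross-entropy structure as the NSP code above; only the label semantics change (segments in swapped order vs. a random next sentence). A minimal sketch of what I would expect, using hypothetical dummy tensors in place of the model outputs (shapes and names are my assumptions, not the library's API):

```python
import torch
from torch.nn import CrossEntropyLoss

# Hypothetical shapes: batch of 2 sequences, length 8, small vocab.
batch, seq_len, vocab_size = 2, 8, 100

# Stand-ins for model outputs: MLM logits and binary SOP logits.
prediction_scores = torch.randn(batch, seq_len, vocab_size)
seq_relationship_score = torch.randn(batch, 2)

masked_lm_labels = torch.randint(0, vocab_size, (batch, seq_len))
# SOP-style labels: 0 = segments in original order, 1 = segments swapped.
sentence_order_label = torch.tensor([0, 1])

# Same loss pattern as the quoted NSP code, with SOP labels swapped in.
loss_fct = CrossEntropyLoss()
masked_lm_loss = loss_fct(prediction_scores.view(-1, vocab_size), masked_lm_labels.view(-1))
sentence_order_loss = loss_fct(seq_relationship_score.view(-1, 2), sentence_order_label.view(-1))
total_loss = masked_lm_loss + sentence_order_loss
```

If that reading is right, the only pretraining difference would live in how the labels are generated, not in the modeling code itself.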