
I am downloading the model microsoft/Multilingual-MiniLM-L12-H384 from https://huggingface.co/microsoft/Multilingual-MiniLM-L12-H384/tree/main and then using it. I am loading the model with BertForSequenceClassification:

https://huggingface.co/docs/transformers/model_doc/bert#:~:text=sentence%20was%20random-,BertForSequenceClassification,-class%20transformers.BertForSequenceClassification

Transformers version: '4.11.3'

I wrote the following code:

import numpy as np

def compute_metrics(eval_pred):
    logits, labels = eval_pred
    predictions = np.argmax(logits, axis=-1)
    acc = np.sum(predictions == labels) / predictions.shape[0]
    return {"accuracy": acc}
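As a quick sanity check, the metric function can be exercised on a dummy batch (the logits and labels below are made up purely for illustration):

```python
import numpy as np

def compute_metrics(eval_pred):
    logits, labels = eval_pred
    predictions = np.argmax(logits, axis=-1)
    acc = np.sum(predictions == labels) / predictions.shape[0]
    return {"accuracy": acc}

# dummy batch: 4 examples, 2 classes; 3 of the 4 argmax predictions match the labels
logits = np.array([[0.2, 0.8],
                   [0.9, 0.1],
                   [0.3, 0.7],
                   [0.6, 0.4]])
labels = np.array([1, 0, 1, 1])
print(compute_metrics((logits, labels)))  # {'accuracy': 0.75}
```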

model = tr.BertForSequenceClassification.from_pretrained("/home/pc/minilm_model",num_labels=2)
model.to(device)

print("hello")

training_args = tr.TrainingArguments(
    output_dir='/home/pc/proj/results2',          # output directory
    num_train_epochs=10,              # total number of training epochs
    per_device_train_batch_size=16,  # batch size per device during training
    per_device_eval_batch_size=32,   # batch size for evaluation
    learning_rate=2e-5,
    warmup_steps=1000,                # number of warmup steps for learning rate scheduler
    weight_decay=0.01,               # strength of weight decay
    logging_dir='./logs',            # directory for storing logs
    logging_steps=1000,
    evaluation_strategy="epoch",
    save_strategy="no"
)



trainer = tr.Trainer(
    model=model,                         # the instantiated Transformers model to be trained
    args=training_args,                  # training arguments, defined above
    train_dataset=train_data,         # training dataset
    eval_dataset=val_data,             # evaluation dataset
    compute_metrics=compute_metrics
)

After training the model, the output folder is empty.

Can I do binary classification with classes=2?

The model's last layer is a simple linear layer that outputs logit values. How do I interpret them and get probability scores from them? Are logit scores proportional to probabilities?

model = tr.BertForSequenceClassification.from_pretrained("/home/pchhapolika/minilm_model",num_labels=2)

1 Answer


Can I do binary classification with classes=2?

Yes.

The model's last layer is a simple linear layer that outputs logit values. How do I interpret them and get probability scores from them? Are logit scores proportional to probabilities?

There is a direct relationship between them:

probability = softmax(logits, axis=-1)

or vice versa: logits = log(probability) + const

So logits are not proportional to probabilities, but the relationship is monotonic.
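A small NumPy sketch of this relationship (the logit values are made up for illustration): softmax turns each row of logits into probabilities, the argmax is unchanged because softmax is monotonic, and log(probability) differs from the logits only by a per-row constant (the log-sum-exp of that row):

```python
import numpy as np

def softmax(logits, axis=-1):
    # subtract the row max for numerical stability; this does not change the result
    z = logits - np.max(logits, axis=axis, keepdims=True)
    e = np.exp(z)
    return e / np.sum(e, axis=axis, keepdims=True)

# hypothetical logits for a batch of 3 examples, 2 classes
logits = np.array([[2.0, -1.0],
                   [0.5, 0.5],
                   [-3.0, 1.0]])
probs = softmax(logits)

# each row of probabilities sums to 1
assert np.allclose(probs.sum(axis=-1), 1.0)

# monotonic: the argmax of the logits equals the argmax of the probabilities
assert (np.argmax(logits, axis=-1) == np.argmax(probs, axis=-1)).all()

# log(probability) = logits + const, where const is constant within each row
const = np.log(probs) - logits
assert np.allclose(const, const[:, :1])
```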

Answered 2021-12-20T10:34:53.790