
I fine-tuned a Hugging Face transformer using Keras (via ktrain) and then reloaded the model in PyTorch.

I want to access the third-to-last layer (pre_classifier), so I removed the last two layers:

BERT2 = torch.nn.Sequential(*(list(BERT.children())[:-2])) 
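
For reference, listing the child modules shows what the slice keeps; on my setup (a standard DistilBertForSequenceClassification) this prints distilbert, pre_classifier, classifier and dropout, in that order:

for i, child in enumerate(BERT.children()):
    print(i, type(child).__name__)
# 0 DistilBertModel   <- distilbert (the transformer stack)
# 1 Linear            <- pre_classifier
# 2 Linear            <- classifier
# 3 Dropout

so [:-2] should keep the transformer stack plus pre_classifier.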

Running an encoded sentence through it produces the following error message:

AttributeError                            Traceback (most recent call last)
<ipython-input-38-640702475573> in <module>
----> 1 ans2=BERT2(torch.tensor([e1]))
      2 print (ans2)

C:\ProgramData\Anaconda3\lib\site-packages\torch\nn\modules\module.py in __call__(self, *input, **kwargs)
    539             result = self._slow_forward(*input, **kwargs)
    540         else:
--> 541             result = self.forward(*input, **kwargs)
    542         for hook in self._forward_hooks.values():
    543             hook_result = hook(self, input, result)

C:\ProgramData\Anaconda3\lib\site-packages\torch\nn\modules\container.py in forward(self, input)
     90     def forward(self, input):
     91         for module in self._modules.values():
---> 92             input = module(input)
     93         return input
     94 

C:\ProgramData\Anaconda3\lib\site-packages\torch\nn\modules\module.py in __call__(self, *input, **kwargs)
    539             result = self._slow_forward(*input, **kwargs)
    540         else:
--> 541             result = self.forward(*input, **kwargs)
    542         for hook in self._forward_hooks.values():
    543             hook_result = hook(self, input, result)

C:\ProgramData\Anaconda3\lib\site-packages\torch\nn\modules\linear.py in forward(self, input)
     85 
     86     def forward(self, input):
---> 87         return F.linear(input, self.weight, self.bias)
     88 
     89     def extra_repr(self):

C:\ProgramData\Anaconda3\lib\site-packages\torch\nn\functional.py in linear(input, weight, bias)
   1366         - Output: :math:`(N, *, out\_features)`
   1367     """
-> 1368     if input.dim() == 2 and bias is not None:
   1369         # fused op is marginally faster
   1370         ret = torch.addmm(bias, input, weight.t())

AttributeError: 'tuple' object has no attribute 'dim'

Meanwhile, removing the classifier entirely (all three layers)

BERT3 = torch.nn.Sequential(*(list(BERT.children())[:-3])) 

produces the expected tensor with the expected shape ([sentence_num, token_num, 768]), wrapped inside a size-1 tuple.
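
That is, to get the raw transformer output I unpack the tuple first (e1 is the encoded sentence from the traceback above):

out = BERT3(torch.tensor([e1]))  # a 1-tuple: (hidden_states,)
hidden = out[0]                  # tensor of shape [sentence_num, token_num, 768]
print(hidden.shape)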

Why does removing two layers (but not three) break the model, and how can I access the pre_classifier results?

I can't access it by setting output_hidden_states=True in the config, because that flag returns the hidden states of the BERT transformer stack, not those of its downstream classifier layers.
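
(The closest workaround I can see is to register a forward hook on pre_classifier and capture its output during a normal forward pass; a minimal sketch, assuming the submodule is still reachable as BERT.pre_classifier:)

captured = {}
def save_pre_classifier(module, inputs, output):
    captured['pre_classifier'] = output  # activations of shape [batch, 768]

hook = BERT.pre_classifier.register_forward_hook(save_pre_classifier)
_ = BERT(torch.tensor([e1]))  # ordinary forward pass through the full model
hook.remove()
print(captured['pre_classifier'].shape)

But I'd prefer a cleaner way if one exists.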

--

P.S.

The code used to initialize the BERT model:

def collect_data_for_FT():

    from sklearn.datasets import fetch_20newsgroups
    train_data = fetch_20newsgroups(subset='train', shuffle=True, random_state=42)
    test_data =  fetch_20newsgroups(subset='test', shuffle=True, random_state=42)

    print('size of training set: %s' % (len(train_data['data'])))
    print('size of validation set: %s' % (len(test_data['data'])))
    print('classes: %s' % (train_data.target_names))

    x_train = train_data.data
    y_train = train_data.target
    x_test = test_data.data
    y_test = test_data.target

    return(x_train,y_train,x_test,y_test,train_data.target_names) # also return the class names for the ktrain preprocessor


bert_name = 'distilbert-base-uncased'
from transformers import DistilBertForSequenceClassification,AutoConfig,AutoTokenizer
import os
dir_path = os.getcwd()
dir_path=os.path.join(dir_path,'models')

config = AutoConfig.from_pretrained(bert_name,num_labels=20) # change model configuration to access hidden values.

try:
    BERT = DistilBertForSequenceClassification.from_pretrained(dir_path,config=config)
    print ("Finetuned predictor loaded")
except:
    import tensorflow.keras as keras
    print ("No finetuned predictor found.\nTraining.")
    (x_train,y_train,x_test,y_test,class_names)=collect_data_for_FT()
    ####
    # prework:
    import ktrain
    from ktrain import text
    t = text.Transformer(bert_name, maxlen=500, classes=class_names)
    trn = t.preprocess_train(x_train, y_train)
    val = t.preprocess_test(x_test, y_test)
    pre_trained_model = t.get_classifier()
    learner = ktrain.get_learner(pre_trained_model, train_data=trn, val_data=val, batch_size=6)    
    ####

    ####
    # Find best learning rate
    learner.lr_find()
    learner.lr_plot()
    ####

    learner.fit_onecycle(2e-4, 4) # chosen based on the learning rate/loss plot.

    ####
    # prepare and save:
    predictor = ktrain.get_predictor(learner.model, preproc=t)
    predictor.save('my_distilbertbase_predictor')
    predictor.model.save_pretrained(dir_path)
    ####
    BERT = DistilBertForSequenceClassification.from_pretrained(dir_path, from_tf=True, config=config) # reload the TensorFlow weights into a PyTorch model
    BERT.save_pretrained(dir_path) # save as a "full blooded" pytorch model
    BERT = DistilBertForSequenceClassification.from_pretrained(dir_path,config=config)  # re-load
    from tensorflow.keras import backend as K
    K.clear_session() # loading from TensorFlow takes up space on the GPU; this releases it.