
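I am trying to run the "transformers" version of the code for the new pre-trained BERTweet model, but I get an error.

The following lines ran successfully in my Google Colab notebook: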


!pip install fairseq
import fairseq
!pip install fastBPE
import fastBPE

# download the pre-trained BERTweet model zipped file
!wget https://public.vinai.io/BERTweet_base_fairseq.tar.gz

# unzip the pre-trained BERTweet model files
!tar -xzvf BERTweet_base_fairseq.tar.gz

!pip install transformers
import transformers

import torch
import argparse

from transformers import RobertaConfig
from transformers import RobertaModel

from fairseq.data.encoders.fastbpe import fastBPE
from fairseq.data import Dictionary

Then I tried to run the following code:

# Load model
config = RobertaConfig.from_pretrained(
    "/Absolute-path-to/BERTweet_base_transformers/config.json"
)
BERTweet = RobertaModel.from_pretrained(
    "/Absolute-path-to/BERTweet_base_transformers/model.bin",
    config=config
)

...and it fails with this error:

---------------------------------------------------------------------------
OSError                                   Traceback (most recent call last)
/usr/local/lib/python3.6/dist-packages/transformers/configuration_utils.py in get_config_dict(cls, pretrained_model_name_or_path, **kwargs)
    242             if resolved_config_file is None:
--> 243                 raise EnvironmentError
    244             config_dict = cls._dict_from_json_file(resolved_config_file)

OSError: 

During handling of the above exception, another exception occurred:

OSError                                   Traceback (most recent call last)
2 frames
/usr/local/lib/python3.6/dist-packages/transformers/configuration_utils.py in get_config_dict(cls, pretrained_model_name_or_path, **kwargs)
    250                 f"- or '{pretrained_model_name_or_path}' is the correct path to a directory containing a {CONFIG_NAME} file\n\n"
    251             )
--> 252             raise EnvironmentError(msg)
    253 
    254         except json.JSONDecodeError:

OSError: Can't load config for '/Absolute-path-to/BERTweet_base_transformers/config.json'. Make sure that:

- '/Absolute-path-to/BERTweet_base_transformers/config.json' is a correct model identifier listed on 'https://huggingface.co/models'

- or '/Absolute-path-to/BERTweet_base_transformers/config.json' is the correct path to a directory containing a config.json file

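I'm guessing the problem is that I need to replace "/Absolute-path-to" with something else, but if so, what should it be replaced with? This probably has a very simple answer and I feel silly asking, but I need help.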


1 Answer


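First, you have to download the correct package, as described in the GitHub README. Your notebook downloads the fairseq archive (BERTweet_base_fairseq.tar.gz), but the code you are running expects the transformers one: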

!wget https://public.vinai.io/BERTweet_base_transformers.tar.gz

!tar -xzvf BERTweet_base_transformers.tar.gz

After that, you can click the folder icon (on the left side of the screen) to list the downloaded data.
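If you would rather check from a notebook cell than from the file browser, something like the following sketch may help. It only uses the standard library; `list_files` is a hypothetical helper name, and the `/content/BERTweet_base_transformers` location assumes the archive was extracted in Colab's default working directory.

```python
from pathlib import Path


def list_files(root):
    """Return the sorted relative paths of all files found under root."""
    root = Path(root)
    return sorted(str(p.relative_to(root)) for p in root.rglob("*") if p.is_file())


# Assumption: the archive was extracted into Colab's default /content directory.
model_dir = Path("/content/BERTweet_base_transformers")

if model_dir.is_dir():
    for name in list_files(model_dir):
        print(name)
else:
    print(f"{model_dir} not found; check where the archive was extracted")
```

If the directory is reported as missing, the archive was probably extracted somewhere else, or the fairseq archive was downloaded instead of the transformers one.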

Right-click BERTweet_base_transformers, choose "Copy path", and paste the clipboard contents into your code:

config = RobertaConfig.from_pretrained(
    "/content/BERTweet_base_transformers/config.json"
)

BERTweet = RobertaModel.from_pretrained(
    "/content/BERTweet_base_transformers/model.bin",
    config=config
)
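
As a quick sanity check before calling `from_pretrained`, you could confirm that both files are where the code expects them. This is only a sketch: `missing_model_files` is a hypothetical helper, the file names are the two used in the snippet above, and the `/content` path assumes Colab's default working directory.

```python
from pathlib import Path


def missing_model_files(model_dir, required=("config.json", "model.bin")):
    """Return the names from `required` that are not regular files in model_dir."""
    model_dir = Path(model_dir)
    return [name for name in required if not (model_dir / name).is_file()]


# Assumption: this is the path copied from the Colab file browser.
model_dir = "/content/BERTweet_base_transformers"

missing = missing_model_files(model_dir)
if missing:
    print(f"Missing in {model_dir}: {', '.join(missing)}")
else:
    print("Both files found; the paths passed to from_pretrained should resolve.")
```

If anything is listed as missing, the "Can't load config" error from the question is the likely result, so fix the path or download first.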
Answered 2020-06-16T12:15:27.243