Questions tagged "roberta-language-model"

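0 votes

2 answers

200 views

python - FileNotFoundError when downloading a RoBERTa sentence-transformers model

I have already downloaded the "roberta-large-nli-stsb-mean-tokens" model, but it starts downloading again and again. Note: this is not a disk-space problem; the machine has enough space. This error comes up... FileNotFoundError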

2021-05-06T14:37:20.897

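0 votes

0 answers

624 views

nlp - How do I run batch inference with a quantized RoBERTa ONNX model?

I have converted a RoBERTa PyTorch model to an ONNX model and quantized it. I am able to get scores from the ONNX model for a single input data point (one sentence at a time). I want to understand how to do batch prediction with an ONNX Runtime inference session by passing multiple inputs to the session. Below is the example scenario.

Model: roberta-quant.onnx, which is the ONNX-quantized version of the RoBERTa PyTorch model

Code used to convert RoBERTa to ONNX:

Sample input to the ONNX Runtime inference session:

Running the ONNX model over 400 data samples (sentences) with the ONNX Runtime inference session:

In the code above, I loop through the 400 sentences one after another to get the scores "ort_outputs". Please help me understand how to do batch processing with the ONNX model here, so that I can send the inputs_ids and attention_masks for multiple sentences and get the ort_outputs.

Thanks in advance!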
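
One way to batch the calls, sketched under several assumptions that the question does not confirm: the model was exported with a dynamic batch (and sequence) axis, its inputs are named input_ids and attention_mask, and the pad token id is 1 (as in the stock roberta-base vocabulary). pad_batch and run_batch are hypothetical helper names. The idea is to pad every tokenized sentence to the longest one in the batch and pass two 2-D int64 arrays to a single session.run call.

```python
def pad_batch(encoded, pad_id=1):
    """Pad token-id lists to a common length and build matching attention masks.

    `encoded` is a list of token-id lists, one per sentence.
    `pad_id=1` is an assumption (the <pad> id of the stock RoBERTa vocabulary).
    """
    max_len = max(len(ids) for ids in encoded)
    input_ids, attention_mask = [], []
    for ids in encoded:
        n_pad = max_len - len(ids)
        input_ids.append(list(ids) + [pad_id] * n_pad)
        attention_mask.append([1] * len(ids) + [0] * n_pad)
    return input_ids, attention_mask


def run_batch(session, encoded, pad_id=1):
    """Run one forward pass for a whole batch of tokenized sentences.

    `session` is expected to be an onnxruntime.InferenceSession. The input
    names below are assumptions; check session.get_inputs() for the real ones.
    """
    import numpy as np  # imported here so pad_batch stays dependency-free

    input_ids, attention_mask = pad_batch(encoded, pad_id)
    ort_inputs = {
        "input_ids": np.asarray(input_ids, dtype=np.int64),
        "attention_mask": np.asarray(attention_mask, dtype=np.int64),
    }
    return session.run(None, ort_inputs)
```

If the model was exported with a fixed batch size of 1, a call like this would likely fail with a shape error, and the model would need to be re-exported with dynamic axes first.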

nlp batch-processing onnx onnxruntime roberta-language-model

2021-06-11T05:15:14.490

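0 votes

0 answers

83 views

python - Error: AssertionError: Could not compute output Tensor("dense/Softmax:0", shape=(None, 3), dtype=float32)

I am using the transformers library to run a transformer model (roberta-large-mnli):

This is how I start the training process:

training is a TensorFlow dataset (tensorflow.python.data.ops.dataset_ops.PrefetchDataset).

When I run this code, I get the following error:

If I change the line:

to

everything works.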

python tensorflow bert-language-model huggingface-transformers roberta-language-model

2021-06-21T18:31:19.270

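0 votes

1 answer

41 views

python - Unable to use model .compile or .summary with a RoBERTa model

I am using a RoBERTa model for sentiment analysis and cannot call .compile or .summary on the model.

I get these errors: 'RobertaForSequenceClassification' object has no attribute 'summary' and 'RobertaForSequenceClassification' object has no attribute 'compile'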

python keras deep-learning bert-language-model roberta-language-model

2021-06-27T14:43:27.440

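0 votes

1 answer

159 views

tensorflow - Plotting a confusion matrix from a RoBERTa model

I wrote text classification code with two classes using a RoBERTa model, and now I want to plot the confusion matrix. How do I plot a confusion matrix from the RoBERTa model?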
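
A minimal sketch of the counting step, assuming the model's outputs have already been collected as one row of two logits per example and that the true labels are 0 or 1. The helper names and the sample numbers are made up for illustration; the resulting 2x2 table can then be handed to any plotting tool.

```python
def predict_labels(logits):
    """Turn per-example logit rows into predicted class indices (argmax)."""
    return [max(range(len(row)), key=row.__getitem__) for row in logits]


def confusion_matrix(y_true, y_pred, num_classes=2):
    """Return a matrix whose entry [t][p] counts true class t predicted as p."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        matrix[t][p] += 1
    return matrix


# Illustrative data only: four examples, two classes.
logits = [[2.1, -0.3], [0.2, 1.5], [1.0, 0.9], [-0.7, 0.4]]
y_true = [0, 1, 1, 1]
y_pred = predict_labels(logits)
cm = confusion_matrix(y_true, y_pred)
for row in cm:
    print(row)
```

Rows are true classes and columns are predicted classes. If scikit-learn is available, its confusion_matrix function and ConfusionMatrixDisplay class cover the same counting and the plotting.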

tensorflow nlp huggingface-transformers confusion-matrix roberta-language-model

2021-07-04T14:03:53.717

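0 votes

0 answers

186 views

python - How to fix the strict error when loading RoBERTa with PyTorch

Any tips on how to fix this? I tried to follow the basic torch guide here: https://pytorch.org/hub/pytorch_fairseq_roberta/ but ran into this error:

Package versions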

antlr4-python3-runtime 4.8
argon2-cffi 20.1.0
async-generator 1.10
attrs 21.2.0
backcall 0.2.0
bleach 3.3.0
brotlipy 0.7.0
certifi 2021.5.30
cffi 1.14.3
chardet 3.0.4
conda 4.10.3
conda-package-handling 1.7.2
cryptography 3.2.1
Cython 0.29.24
decorator 5.0.9
defusedxml 0.7.1
entrypoints 0.3
hydra-core 1.1.0
idna 2.10
importlib-metadata 3.10.0
importlib-resources 5.2.0
ipykernel 5.3.4
ipython 7.22.0
ipython-genutils 0.2.0
ipywidgets 7.6.3
jedi 0.17.0
Jinja2 3.0.1
jsonschema 3.2.0
jupyter 1.0.0
jupyter-client 6.1.12
jupyter-console 6.4.0
jupyter-core 4.7.1
jupyterlab-pygments 0.1.2
jupyterlab-widgets 1.0.0
MarkupSafe 2.0.1
mistune 0.8.4
nbclient 0.5.3
nbconvert 6.1.0
nbformat 5.1.3
nest-asyncio 1.5.1
notebook 6.4.0
numpy 1.21.0
omegaconf 2.1.0
packaging 21.0
pandas 1.3.0
pandocfilters 1.4.3
parso 0.8.2
pexpect 4.8.0
pickleshare 0.7.5
pip 20.2.4
prometheus-client 0.11.0
prompt-toolkit 3.0.17
ptyprocess 0.7.0
pycosat 0.6.3
pycparser 2.20
Pygments 2.9.0
pyOpenSSL 19.1.0
pyparsing 2.4.7
pyrsistent 0.17.3
PySocks 1.7.1
python-dateutil 2.8.1
pytz 2021.1
PyYAML 5.4.1
pyzmq 20.0.0
qtconsole 5.1.0
QtPy 1.9.0
regex 2021.7.6
requests 2.24.0
ruamel-yaml 0.15.87
Send2Trash 1. 0
setuptools 50.3.1.post20201107
sip 4.19.13
six 1.15.0
terminado 0.9.4
testpath 0.5.0
torch 1.9.0
tornado 6.1
tqdm 4.51.0
traitlets 5.0.5
typing-extensions 3.10.0.0
urllib3 1.25.11
wcwidth 0.2.5
webencodings 0.5.1
wheel 0.35.1
widgetsnbextension 3.5.1
xlrd 2.0.1
zipp 3.5.0

python pytorch roberta-language-model

2021-07-20T00:41:31.537

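0 votes

0 answers

45 views

bert-language-model - RoBERTa: repurposing a fine-tuned model for a different task

I have an xlm-roberta-base model fine-tuned for a binary classification task, as follows:

model = XLMRobertaForSequenceClassification.from_pretrained("xlm-roberta-base", num_labels=2)

I want to retrain the model with masked LM, where both the inputs and the labels are sentences, and then train it again for the binary classification task, but I don't know whether this is possible or what the syntax for it would be. Right now I cannot load my XLMRobertaForSequenceClassification into a masked-LM model.

model = RobertaForMaskedLM.from_pretrained("xlm-roberta-base")
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
print("device is ", device)
model.load_state_dict(torch.load('fine_tuned_model.pt', map_location=torch.device('cpu')))

Any help is appreciated. Thanks.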
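
One possible approach, sketched under assumptions: the classification and masked-LM classes share an encoder whose parameter names start with "roberta.", while the task heads ("classifier." and "lm_head.") differ, so only the encoder weights can carry over. The helper name encoder_weights and the toy keys below are illustrative, and whether fine_tuned_model.pt holds a plain state dict is not confirmed by the question.

```python
def encoder_weights(state_dict, prefix="roberta."):
    """Keep only the entries that belong to the shared encoder.

    The "roberta." prefix is an assumption about how the checkpoint names its
    encoder parameters; inspect state_dict.keys() to confirm it.
    """
    return {
        name: value
        for name, value in state_dict.items()
        if name.startswith(prefix)
    }


# Toy stand-in for a fine-tuned classification checkpoint
# (in a real checkpoint the values would be tensors).
fine_tuned = {
    "roberta.embeddings.word_embeddings.weight": "...",
    "roberta.encoder.layer.0.attention.self.query.weight": "...",
    "classifier.dense.weight": "...",
    "classifier.out_proj.weight": "...",
}
shared = encoder_weights(fine_tuned)
print(sorted(shared))

# With PyTorch and transformers installed, the filtered weights could then be
# loaded into a masked-LM model without requiring the heads to match:
#   mlm = XLMRobertaForMaskedLM.from_pretrained("xlm-roberta-base")
#   state = torch.load("fine_tuned_model.pt", map_location="cpu")
#   result = mlm.load_state_dict(encoder_weights(state), strict=False)
#   print(result.missing_keys)  # should mostly be lm_head.* entries
```

The same filtering would apply in the other direction when going back from the masked-LM model to a fresh classification model, since the classifier head is re-initialised either way.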

bert-language-model huggingface-transformers pre-trained-model roberta-language-model

2021-07-22T11:15:49.500

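0 votes

0 answers

32 views

nlp - Loading a self-trained tokenizer in BertTokenizerFast fails

I trained a tokenizer as follows,

Then I tried to load it:

It fails, but it works if I use:

I don't fully understand what is going on here.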

nlp tokenize bert-language-model roberta-language-model

2021-07-24T05:49:17.820

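0 votes

1 answer

123 views

huggingface-transformers - Why does a BPE encoding trained on English and applied to Bengali not return unknown tokens?

I used the roberta-base tokenizer, which was trained on English data, tokenizer = RobertaTokenizerFast.from_pretrained('roberta-base',add_prefix_space=True), to tokenize Bengali, just to see how it behaves. When I try to encode a Bengali character with tokenizer.encode('বা'), I get [0, 1437, 35861, 11582, 35861, 4726, 2], which means it finds tokens in its vocabulary that match the Bengali character even though it was trained on English. On further exploration I found that these are all special characters: ['<s>', 'Ġ', 'à¦', '¬', 'à¦', '¾', '</s>']. My question is why this happens; shouldn't it output unknown tokens when applied to a new language? Any help is greatly appreciated.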
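
The tokens listed above look like the output of a byte-level BPE, in which text is first turned into UTF-8 bytes and each byte is mapped to a printable character before any merges are applied. Since every possible byte is in the base vocabulary, no input falls outside it. The sketch below reimplements the byte-to-character table in the style of the GPT-2 tokenizer, on which RoBERTa's vocabulary is based; it is an illustration rather than the library's own code, and it skips the merge step.

```python
def bytes_to_unicode():
    """Map each of the 256 byte values to a printable Unicode character.

    Printable bytes map to themselves; the rest (control characters, space,
    and a few others) are shifted to code points from 256 upward.
    """
    keep = (
        list(range(0x21, 0x7F))     # "!" through "~"
        + list(range(0xA1, 0xAD))   # U+00A1 through U+00AC
        + list(range(0xAE, 0x100))  # U+00AE through U+00FF
    )
    mapping = {b: chr(b) for b in keep}
    shift = 0
    for b in range(256):
        if b not in mapping:
            mapping[b] = chr(256 + shift)
            shift += 1
    return mapping


table = bytes_to_unicode()
# A space (mimicking add_prefix_space=True) followed by the Bengali
# syllable from the question, written with escapes.
text = " \u09ac\u09be"
raw = text.encode("utf-8")
mapped = "".join(table[b] for b in raw)

print(list(raw))  # seven byte values
print(mapped)
```

The mapped string is Ġà¦¬à¦¾: the space becomes 'Ġ', and each Bengali code point becomes three byte characters. The merge table then groups these into the pieces reported in the question ('Ġ', 'à¦', '¬', 'à¦', '¾').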

huggingface-transformers huggingface-tokenizers roberta-language-model

2021-09-07T14:14:41.703

0 votes

2 answers

885 views

python - _batch_encode_plus() got an unexpected keyword argument 'return_attention_masks'

I am working on a RoBERTa model to detect emotions in tweets, on Google Colab, following this Kaggle notebook - https://www.kaggle.com/ishivinal/tweet-emotions-analysis-using-lstm-glove-roberta?scriptVersionId=38608295

Code snippet:

In the regular_encode part I get the following error:
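
Going by the error named in the title, the keyword itself is probably the issue: current transformers tokenizers accept return_attention_mask (singular), while return_attention_masks was only accepted by older releases. A hedged sketch of an updated helper follows. The notebook's own regular_encode is not shown here, so the body below is an assumed reconstruction, and the padding and truncation arguments are assumptions about what it intended.

```python
def regular_encode(texts, tokenizer, maxlen=512):
    """Tokenize a list of strings into fixed-length lists of input ids.

    `tokenizer` is expected to be a Hugging Face tokenizer object. This is an
    assumed reconstruction, not the notebook's original function.
    """
    encoded = tokenizer(
        list(texts),
        padding="max_length",         # pad every sequence to `maxlen`
        truncation=True,              # cut sequences longer than `maxlen`
        max_length=maxlen,
        return_attention_mask=False,  # singular, not `return_attention_masks`
        return_token_type_ids=False,
    )
    return encoded["input_ids"]
```

Called as regular_encode(texts, tokenizer, maxlen=128), it returns one list of token ids per input text. Pinning the older transformers release that the notebook was written against would be the other way around the error.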

python nlp google-colaboratory bert-language-model roberta-language-model

2021-09-29T09:02:57.527
