Questions tagged "torchtext"

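0 votes

1 answer

825 views

python - torchtext Field for values already converted to ids raises "an integer is required"

I am following this tutorial: http://www.programmersought.com/article/2609385756/

I am creating a TabularDataset from data that has already been tokenized and converted to ids. I don't want to use or build a vocab, because the data is numeric.

So I defined my field variables as:

Output of train:

I used a BucketIterator:

When I run this code:

I get TypeError: an integer is required (got type list)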

TypeError Traceback (most recent call last)
in ()
----> 1 batch = next(iter(train_iter))

3 frames

/usr/local/lib/python3.6/dist-packages/torchtext/data/iterator.py in __iter__(self)
    155 else:
    156 minibatch.sort(key=self.sort_key, reverse=True)
--> 157 yield Batch(minibatch, self.dataset, self.device)
    158 if not self.repeat:
    159 return

/usr/local/lib/python3.6/dist-packages/torchtext/data/batch.py in __init__(self, data, dataset, device)
     32 if field is not None:
     33 batch = [getattr(x, name) for x in data]
---> 34 setattr(self, name, field.process(batch, device=device))
     35
     36 @classmethod

/usr/local/lib/python3.6/dist-packages/torchtext/data/field.py in process(self, batch, device)
    199 """
    200 padded = self.pad(batch)
--> 201 tensor = self.numericalize(padded, device=device)
    202 return tensor
    203

/usr/local/lib/python3.6/dist-packages/torchtext/data/field.py in numericalize(self, arr, device)
    321 arr = self.postprocessing(arr, None)
    322
--> 323 var = torch.tensor(arr, dtype=self.dtype, device=device)
    324
    325 if self.sequential and not self.batch_first:

TypeError: an integer is required (got type list)
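The traceback ends in torch.tensor(arr, dtype=self.dtype, device=device), which needs a rectangular nested list of numbers. The field definitions are not shown, so the exact cause is unknown. As a rough sketch, a batch of pre-computed ids can be checked and padded in plain Python before it reaches any tensor conversion. The helper name pad_ids and the pad id 0 are assumptions made for illustration; neither is part of torchtext.

```python
def pad_ids(batch, pad_id=0):
    """Pad a batch of integer-id sequences to equal length with an integer pad id.

    Hypothetical helper, not a torchtext function.
    """
    # Reject anything that is not a plain int (for example a nested list),
    # since such items cannot be placed into an integer tensor.
    for seq in batch:
        for item in seq:
            if not isinstance(item, int):
                raise TypeError(
                    f"expected int, got {type(item).__name__}: {item!r}"
                )
    # Pad every sequence on the right up to the longest one in the batch.
    max_len = max((len(seq) for seq in batch), default=0)
    return [list(seq) + [pad_id] * (max_len - len(seq)) for seq in batch]


batch = [[5, 8, 2], [7], [4, 9]]
padded = pad_ids(batch)
print(padded)
```

If the check raises for the real data, the ids are nested one level deeper than the field expects, which would be consistent with the "got type list" part of the message.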

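0 votes

1 answer

323 views

pytorch - torchtext data build_vocab / data_field

I would like to ask a few questions about torchtext.

I have an abstractive text summarization task, and I built a seq2seq model with PyTorch.

I am wondering about the data_field that the build_vocab function in torchtext constructs.

In machine translation, I understand that two data_fields (input, output) are needed.

However, in summarization, the input data and the output data are in the same language.

In this case, should I create two data_fields (full_sentence, abstract_sentence)?

Or is it fine to use just one data_field?

I am afraid that the wrong choice will hurt the model's performance.

Please give me a hint.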

pytorch torchtext
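Because the source text and the summary are in the same language, one option is to share a single vocabulary between them, so a given token gets the same id on both sides. The sketch below shows the idea in plain Python rather than with torchtext's build_vocab. The function name build_shared_vocab, the special tokens and the toy data are all assumptions for illustration.

```python
from collections import Counter


def build_shared_vocab(token_lists,
                       specials=("<unk>", "<pad>", "<sos>", "<eos>"),
                       min_freq=1):
    """Build one token-to-id mapping from both articles and summaries.

    Hypothetical helper; special tokens come first, then tokens ordered by
    descending frequency (ties broken alphabetically).
    """
    counts = Counter(tok for tokens in token_lists for tok in tokens)
    itos = list(specials)
    for tok, freq in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0])):
        if freq >= min_freq and tok not in specials:
            itos.append(tok)
    return {tok: idx for idx, tok in enumerate(itos)}


# Toy data: one tokenized article and its tokenized summary.
articles = [["the", "cat", "sat", "on", "the", "mat"]]
summaries = [["cat", "sat"]]

# Counting over both sides gives a single vocabulary for encoder and decoder.
vocab = build_shared_vocab(articles + summaries)
print(vocab)
```

Whether a shared vocabulary or two separate ones works better is an empirical question for a given model. The sketch only shows what sharing means.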

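0 votes

1 answer

3185 views

python - OverflowError: Python int too large to convert to C long torchtext.datasets.text_classification.DATASETS['AG_NEWS']()

I have a 64-bit Windows 10 OS. I have installed Python 3.6.8, and I installed torch and torchtext using pip. The torch version is 1.2.0.

I am trying to load the AG_NEWS dataset using the following code:

On the last statement of the above code, I get the following error:

I think the problem lies with either Windows or torchtext, because the code below produces the same error.

Can anyone help? Notably, I don't have any large numeric values in the file.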

python windows pandas pytorch torchtext
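One well-known way to hit this OverflowError on Windows without any large numbers in the data is a call such as csv.field_size_limit(sys.maxsize). A C long is 32 bits on Windows even under 64-bit Python, so the value does not fit. Whether that call is what fails inside the loader used here is an assumption, since the code and full traceback are not shown. A minimal sketch of the usual workaround, with a hypothetical helper name:

```python
import csv
import sys


def set_max_csv_field_size():
    """Raise the csv field size limit as far as the platform allows.

    Hypothetical helper: starts from sys.maxsize and shrinks the value
    until csv.field_size_limit accepts it.
    """
    limit = sys.maxsize
    while True:
        try:
            csv.field_size_limit(limit)
            return limit
        except OverflowError:
            # The value does not fit in a C long on this platform; try smaller.
            limit = limit // 10


limit = set_max_csv_field_size()
print(limit)
```

On platforms where a C long is 64 bits the loop succeeds on the first attempt. On Windows it backs off until the value fits.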

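0 votes

2 answers

3303 views

python - Understanding TypeError: '<' not supported between instances of 'Example' and 'Example'

I am working on a text simplification project using a multi-head attention Transformer model. For this, I use torchtext for tokenization and numericalization. The dataset contains two aligned files for training and two aligned files for testing. Of the training files, one contains complex sentences and the other contains the corresponding simplified sentences.

I read the files like this:

Next, I tokenize them like this:

Then I convert them into torchtext's TabularDataset objects.

Then I build the vocabulary:

However, doing so I get this error:

TypeError: '<' not supported between instances of 'Example' and 'Example'

While searching, I came across this solution here, and the error went away. However, I don't understand whether it makes the model use only one instance or the whole dataset. I would like to know the significance of the index [0] so that I can handle it properly for my model.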

python nlp pytorch torchtext
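Python raises this TypeError whenever it has to order objects that define no comparison methods, for instance when a list of records is sorted without a key. The sketch below reproduces that with a stand-in Example class. It also shows that indexing [0] on a one-element tuple of datasets selects the whole dataset, not a single record. The stand-in class, the splits_like function and the assumption that the linked solution indexes such a tuple are all illustrative guesses, since the original code is not shown.

```python
class Example:
    """Stand-in for a dataset record; it defines no ordering methods."""

    def __init__(self, tokens):
        self.tokens = tokens


def splits_like(examples):
    """Mimic an API that returns a tuple of datasets, even for one split."""
    return (list(examples),)


examples = [Example(["a", "b", "c"]), Example(["d"]), Example(["e", "f"])]

# Sorting without a key forces Python to compare Example objects directly.
try:
    sorted(examples)
    sort_failed = False
    message = ""
except TypeError as err:
    sort_failed = True
    message = str(err)

# Supplying a key avoids comparing the objects themselves.
by_length = sorted(examples, key=lambda ex: len(ex.tokens))
lengths = [len(ex.tokens) for ex in by_length]

# Indexing the returned tuple picks the dataset, which still holds every record.
result = splits_like(examples)
train = result[0]

print(sort_failed, message)
print(lengths, len(result), len(train))
```

Under that assumption, [0] does not discard any data. It unwraps the tuple so that the object passed on is the dataset and not a container around it.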

0 votes

2 answers

1174 views

python - How to make predictions with a trained PyTorch and torchtext model?

Generally speaking, after successfully training a text RNN model with PyTorch, using torchtext to handle data loading from the original source, I would like to test it on other data sets (a sort of blind test) that come from different sources but share the same text format.

First, I defined a class to handle the data loading.

Here is the detail of load_data, which loads the data that trained successfully.

Next is my code (load_data_but_error) for loading the other sources, which causes an error.

When I executed the code, I got the error AttributeError: 'Field' object has no attribute 'vocab'. There is a question about it here, but it doesn't match my situation, because here I already have the vocab from load_data and I want to use it for the blind tests.

My question is: what is the correct way to load and feed new data to a trained PyTorch model for testing?

python nlp pytorch torchtext
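The general idea behind reusing a trained model on new text is that the new data must be converted to ids with the same token-to-id mapping that was built during training, not with a freshly created field that has no vocabulary. The sketch below shows that idea in plain Python. The helper names, the "<unk>" token and the JSON round trip are assumptions for illustration; this is not the torchtext API.

```python
import json

UNK = "<unk>"


def build_vocab(token_lists):
    """Assign ids to tokens in order of first appearance; id 0 is reserved for UNK."""
    stoi = {UNK: 0}
    for tokens in token_lists:
        for tok in tokens:
            stoi.setdefault(tok, len(stoi))
    return stoi


def numericalize(tokens, stoi):
    """Map tokens to ids, sending tokens unseen in training to the UNK id."""
    return [stoi.get(tok, stoi[UNK]) for tok in tokens]


# Training time: build the mapping and persist it next to the model weights.
train_tokens = [["good", "movie"], ["bad", "movie"]]
stoi = build_vocab(train_tokens)
saved = json.dumps(stoi)

# Test time: restore the same mapping and apply it to text from a new source.
restored = json.loads(saved)
ids = numericalize(["good", "film"], restored)
print(ids)
```

With torchtext the equivalent step would be to reuse the field object that already had its vocabulary built, or to restore its vocabulary, when constructing the dataset for the new source.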

0 votes

1 answer

1625 views