Questions tagged [attention-model]



339 questions

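0 votes

4 answers

4644 views

python - Keras: how to display attention weights in an LSTM model

I built a text classification model using an LSTM with an attention layer. The model performs well, but I cannot display the attention weights, that is, the importance given to each word in a review (the input text). The code used for this model is: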

2018-09-03T14:50:36.860

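0 votes

0 answers

157 views

image - How to efficiently extract sub-images from a batch of images, with each sub-image at a different location?

I need to extract (m x m) sub-images from a batch of (n x n) images, where:

images.shape = (batch, n, n, n_channels), sub-images.shape = (batch, m, m, n_channels),

and each sub-image sits at a different location for each image in the batch. The closest routine appears to be tf.image.crop_to_bounding_box(), but it applies the same crop parameters to every image in the batch. Is there a simple way to do this? I could always fall back on gather_nd(), but that is cumbersome.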
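The per-image cropping described above can be sketched without any framework, just to pin down the indexing logic. This is a minimal illustration in plain Python on nested lists, not a TensorFlow solution; `crop_batch` and `offsets` are hypothetical names, and the (top, left) convention is an assumption.

```python
def crop_batch(images, offsets, m):
    """Crop one m x m window per image.

    images  : list of images, each a list of n rows of n pixels
              (a pixel may be a scalar or a list of channel values).
    offsets : list of (top, left) pairs, one per image (hypothetical name).
    """
    if len(images) != len(offsets):
        raise ValueError("need exactly one (top, left) offset per image")
    crops = []
    for image, (top, left) in zip(images, offsets):
        n = len(image)
        if not (0 <= top <= n - m and 0 <= left <= n - m):
            raise ValueError("window falls outside the image")
        # Slice rows first, then columns, using this image's own offset.
        crops.append([row[left:left + m] for row in image[top:top + m]])
    return crops


# Two 4x4 single-channel images whose pixel value encodes (image, row, col).
batch = [[[100 * b + 10 * r + c for c in range(4)] for r in range(4)]
         for b in range(2)]
crops = crop_batch(batch, offsets=[(0, 0), (2, 1)], m=2)
```

Each image is sliced with its own offset, which is the behaviour the question asks for. A tensor version would express the same row and column index arithmetic with gather-style indexing.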

image tensorflow image-processing attention-model

2018-09-25T23:44:32.217

0 votes

5 answers

10436 views

pytorch - RuntimeError: "exp" not implemented for 'torch.LongTensor'

I am following this tutorial: http://nlp.seas.harvard.edu/2018/04/03/attention.html to implement the Transformer model from the "Attention Is All You Need" paper.

However, I am getting the following error: RuntimeError: "exp" not implemented for 'torch.LongTensor'

This is the line, in the PositionalEncoding class, that is causing the error:

When it is being constructed here:

Any ideas? I've already tried converting this to a float tensor type, but that has not worked.

I've even downloaded the whole notebook with accompanying files and the error seems to persist in the original tutorial.

Any ideas what may be causing this error?

Thanks!
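The message itself says that `exp` was applied to an integer (Long) tensor. Since the failing line is not shown, the following is only a framework-free sketch of the sinusoidal positional-encoding arithmetic from the paper, with the exponent kept in floating point. `positional_encoding` is a hypothetical helper, not the tutorial's code.

```python
import math


def positional_encoding(max_len, d_model):
    """Sinusoidal position table: max_len rows of d_model floats."""
    if d_model % 2 != 0:
        raise ValueError("this sketch assumes an even d_model")
    # One frequency per sin/cos pair. Plain Python's math.exp accepts ints,
    # so float(i) is purely illustrative: it mirrors the conversion an
    # integer tensor would need before exponentiation.
    div_term = [math.exp(float(i) * -(math.log(10000.0) / d_model))
                for i in range(0, d_model, 2)]
    table = []
    for pos in range(max_len):
        row = []
        for freq in div_term:
            row.append(math.sin(pos * freq))  # even dimension
            row.append(math.cos(pos * freq))  # odd dimension
        table.append(row)
    return table


pe = positional_encoding(max_len=4, d_model=6)
```

If the error does come from an exponent built out of integer indices, the analogous step in PyTorch would be to create or convert that index tensor as a float (for example with `.float()`) before calling `exp`. That is an assumption about the missing line, not a confirmed diagnosis.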

pytorch tensor attention-model

2018-10-22T04:32:00.977

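0 votes

2 answers

13669 views

python - Keras: adding an attention mechanism to an LSTM model

With the following code:

I am trying to understand how to add an attention mechanism before the first LSTM layer. I found the following GitHub repository: keras-attention-mechanism by Philippe Rémy, but could not figure out how to use it with my code.

I would like to visualize the attention mechanism and see which features the model focuses on.

Any help would be much appreciated, especially code modifications. Thanks :)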

python machine-learning keras lstm attention-model

2018-11-05T09:03:40.167

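0 votes

2 answers

461 views

machine-learning - What is used to train a self-attention mechanism?

I have been trying to understand self-attention, but nothing I have found explains the concept very well.

Suppose we use self-attention in an NLP task, so our input is a sentence.

Self-attention can then be used to measure the "importance" of each word in the sentence with respect to every other word.

The problem is that I do not understand how this "importance" is measured. Important for what?

What exactly is the target vector against which the weights in the self-attention algorithm are trained?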
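A small worked example may help with the question above. In the standard scaled dot-product formulation from "Attention Is All You Need", the attention weights have no target of their own. They are computed from the input through projection matrices, and those matrices are trained by backpropagating the loss of whatever task the model is solving. The sketch below is plain Python; the identity matrices are a stand-in for learned projections, and all names are hypothetical.

```python
import math


def matvec(matrix, vector):
    """Multiply a (rows x cols) matrix by a vector of length cols."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(tokens, w_q, w_k, w_v):
    """Return (outputs, weights) for one sentence of token vectors."""
    queries = [matvec(w_q, t) for t in tokens]
    keys = [matvec(w_k, t) for t in tokens]
    values = [matvec(w_v, t) for t in tokens]
    scale = math.sqrt(len(keys[0]))
    weights, outputs = [], []
    for q in queries:
        # "Importance" of every token for the token that issued this query.
        scores = [sum(a * b for a, b in zip(q, k)) / scale for k in keys]
        row = softmax(scores)
        weights.append(row)
        # The output is the weight-averaged value vector.
        outputs.append([sum(w * v[d] for w, v in zip(row, values))
                        for d in range(len(values[0]))])
    return outputs, weights


identity = [[1.0, 0.0], [0.0, 1.0]]  # stand-in for learned projections
sentence = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # three toy token embeddings
outputs, weights = self_attention(sentence, identity, identity, identity)
```

Each row of `weights` sums to 1 and says how much one token draws on every other token. In this formulation "importance" means usefulness for producing the output that the task loss rewards, rather than a separately labelled quantity.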

machine-learning nlp artificial-intelligence attention-model

2018-11-06T13:05:09.853

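0 votes

0 answers

496 views

tensorflow - How to create an LSTM without inputs in TensorFlow?

I am implementing an attention mechanism that uses an LSTM cell as an encoder, where the LSTM takes no input (only the hidden state). How do I specify an LSTM without inputs in TensorFlow?

Essentially, the hidden state would be concatenated with an empty input at the "entry point" of the LSTM cell.

Thanks!

Edit: a related question I asked is "Attention mechanism, LSTM does not take any input"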

tensorflow lstm attention-model

2018-11-13T13:16:56.930

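0 votes

1 answer

757 views

keras - How to add an attention layer for a weighted sum in Keras

I have the following network architecture (only the relevant part of the network is shown below)

The problem here is that I do not want to pass the weights as an input; I want to "learn" them, as in an attention layer.

I know an attention layer can be created this way

where h is the dimension equal to the input length, which I set to 5. However, doing the above gives the following error: TypeError: unsupported operand type(s) for +: 'NoneType' and 'int'

I think this has to do with the dimensions of the input parameters, but I am not sure how to fix it.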

keras embedding attention-model

2018-11-27T10:45:12.153

0 votes

0 answers

95 views

python - ValueError with attention Dimension1 in both shape must be equal

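Hi everyone, I have a problem. I am using Python 3.6.5 and TensorFlow 1.8.0. My input is 1000 max_textlength * 64 embedding * 4 steps and 3 protocols = 64007, number of neurons = 10

A plain RNN works, but I want to improve it with

attentioncellwrapper(neurons, 2, state_is_tuple = True)

I get the following message:

Why does this happen? Has anyone else run into this problem?

I also experimented with state_is_tuple = False; it gave no error message, but Python suddenly crashed :(

By the way, when I change the attention length from 2 to 3 or 4, this changes to

It seems the attention length is multiplied into the shape. Thank you very much for your help!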

python tensorflow error-handling valueerror attention-model

2018-12-13T19:45:31.347

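0 votes

1 answer

395 views

python - Error when checking input: expected lstm_28_input to have shape (5739, 8) but got array with shape (1, 8)

I am getting a Keras dimension error

The input shape is as follows

Result

The model is as follows

Result

But I get an error at the fit step

Error

But the input shape I print is (5739, 8), and I do not understand where (1, 8) comes from or how to fix it.

Is the problem the input shape of test_X, test_Y, or the training data? How should I fix it?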

python keras lstm keras-layer attention-model

2018-12-17T12:48:49.317

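0 votes

2 answers

10569 views

keras - How to visualize attention weights?

Using this implementation, I have added attention to my RNN (which classifies input sequences into two classes), as shown below.

I have trained the model and saved the weights to the weights.best.hdf5 file.

I am working on a binary classification problem, and the input to my model is one-hot vectors (character-based).

How can I visualize the attention weights for specific test cases in the current implementation?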
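Once per-character attention weights are available for a test case, displaying them can be as simple as pairing each character with its weight. The sketch below assumes the weights have already been extracted into a plain list; `render_attention` is a hypothetical helper and the numbers are made up for illustration.

```python
def render_attention(tokens, weights, width=20):
    """Return one text line per token with a bar proportional to its weight."""
    if len(tokens) != len(weights):
        raise ValueError("need one weight per token")
    peak = max(weights)
    if peak <= 0:
        raise ValueError("weights must contain a positive value")
    lines = []
    for token, weight in zip(tokens, weights):
        # Scale relative to the largest weight so the top token fills the bar.
        bar = "#" * round(width * (weight / peak))
        lines.append(f"{token!r:>5} {weight:.3f} {bar}")
    return lines


# Hypothetical weights for a five-character input; real values would come
# from the trained model's attention layer.
chars = list("hello")
example_weights = [0.05, 0.40, 0.10, 0.10, 0.35]
for line in render_attention(chars, example_weights):
    print(line)
```

One common way to obtain the weights in Keras is to build a second model that shares the trained layers and outputs the attention tensor, then call `predict` on the test case. That assumes the attention layer exposes its weights as a tensor, which depends on the implementation in use.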

keras deep-learning nlp rnn attention-model

2018-12-20T11:00:13.087

