python - Generating a sequence by sampling at each generation step with CNTK

Question

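In a seq2seq model with an encoder and a decoder, the softmax layer outputs a distribution over the entire vocabulary at each generation step. In CNTK, a greedy decoder is easy to implement with the C.hardmax function. It looks like this.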

def create_model_greedy(s2smodel):
    # model used in (greedy) decoding (history is decoder's own output)
    @C.Function
    @C.layers.Signature(InputSequence[C.layers.Tensor[input_vocab_dim]])
    def model_greedy(input): # (input*) --> (word_sequence*)
        # Decoding is an unfold() operation starting from sentence_start.
        # We must transform s2smodel (history*, input* -> word_logp*) into a generator (history* -> output*)
        # which holds 'input' in its closure.
        unfold = C.layers.UnfoldFrom(lambda history: s2smodel(history, input) >> C.hardmax,
                                     # stop once sentence_end_index was max-scoring output
                                     until_predicate=lambda w: w[...,sentence_end_index],
                                     length_increase=length_increase)
        return unfold(initial_state=sentence_start, dynamic_axes_like=input)
    return model_greedy

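However, I don't want to output the highest-probability token at every step. Instead, I want a stochastic decoder that generates each token by sampling from the probability distribution over the vocabulary.

How can I do this? Any help is appreciated. Thanks.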

score 3 · Accepted Answer

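You can add noise to the output before taking the hardmax. In particular, you can use C.random.gumbel or C.random.gumbel_like to sample in proportion to exp(output). This is known as the Gumbel-max trick. The cntk.random module contains other distributions as well, but if you have log probabilities, you most likely want to add Gumbel noise before the hardmax. Some code: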

@C.Function
def randomized_hardmax(x):
    noisy_x = x + C.random.gumbel_like(x)
    return C.hardmax(noisy_x)

Then replace hardmax with randomized_hardmax.
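To see why adding Gumbel noise before the hardmax samples in proportion to exp(output), here is a minimal sketch in plain Python with no CNTK dependency. The helper names (softmax, gumbel_noise, gumbel_max_sample, empirical_frequencies) are made up for this illustration and are not part of any library. It draws many samples with the Gumbel-max trick and compares the empirical frequencies with the softmax of the same scores.

```python
import math
import random


def softmax(logits):
    """Return exp(logits) normalized to sum to 1 (numerically stabilized)."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def gumbel_noise(rng):
    """Draw one standard Gumbel sample as -log(-log(U)), U ~ Uniform(0, 1)."""
    u = rng.random()
    while u == 0.0:  # random() may return exactly 0.0, where log is undefined
        u = rng.random()
    return -math.log(-math.log(u))


def gumbel_max_sample(logits, rng):
    """Add independent Gumbel noise to each score and return the argmax index."""
    noisy = [v + gumbel_noise(rng) for v in logits]
    return max(range(len(noisy)), key=noisy.__getitem__)


def empirical_frequencies(logits, n_samples, seed=0):
    """Estimate how often each index is picked by the Gumbel-max trick."""
    rng = random.Random(seed)
    counts = [0] * len(logits)
    for _ in range(n_samples):
        counts[gumbel_max_sample(logits, rng)] += 1
    return [c / n_samples for c in counts]


if __name__ == "__main__":
    # Hypothetical log probabilities for a 3-word vocabulary.
    log_probs = [math.log(0.5), math.log(0.3), math.log(0.2)]
    print("softmax  :", softmax(log_probs))
    print("empirical:", empirical_frequencies(log_probs, 20000))
```

In the CNTK answer above, C.random.gumbel_like(x) plays the role of gumbel_noise applied to every element of x, and C.hardmax plays the role of the argmax, returning a one-hot vector instead of an index.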

score 0 · Accepted Answer

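Many thanks to Nikos Karampatziakis.

If you want a stochastic sampling decoder that generates a sequence of the same length as the target sequence, the following code works.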

@C.Function
def sampling(x):
    noisy_x = x + C.random.gumbel_like(x)
    return C.hardmax(noisy_x)

def create_model_sampling(s2smodel):
    @C.Function
    @C.layers.Signature(input=InputSequence[C.layers.Tensor[input_vocab_dim]],
                        labels=LabelSequence[C.layers.Tensor[label_vocab_dim]])
    def model_sampling(input, labels): # (input*, labels*) --> (word_sequence*)
        unfold = C.layers.UnfoldFrom(lambda history: s2smodel(history, input) >> sampling,
                                     length_increase=1)
        return unfold(initial_state=sentence_start, dynamic_axes_like=labels)
    return model_sampling
