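Suppose I want to update a pre-trained word-embedding matrix during training. Is there a way to update only a subset of the word-embedding matrix?
I looked at the TensorFlow API page and found this: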
# Create an optimizer.
opt = GradientDescentOptimizer(learning_rate=0.1)
# Compute the gradients for a list of variables.
grads_and_vars = opt.compute_gradients(loss, <list of variables>)
# grads_and_vars is a list of tuples (gradient, variable). Do whatever you
# need to the 'gradient' part, for example cap them, etc.
capped_grads_and_vars = [(MyCapper(gv[0]), gv[1]) for gv in grads_and_vars]
# Ask the optimizer to apply the capped gradients.
opt.apply_gradients(capped_grads_and_vars)
But how do I apply this to the word-embedding matrix? Suppose I do this:
word_emb = tf.Variable(0.2 * tf.random_uniform([syn0.shape[0], s['es']], minval=-1.0, maxval=1.0, dtype=tf.float32), name='word_emb', trainable=False)
gather_emb = tf.gather(word_emb, indices)  # assuming that I pass some indices as a placeholder through feed_dict
opt = tf.train.AdamOptimizer(1e-4)
grad = opt.compute_gradients(loss,gather_emb)
How do I then use opt.apply_gradients and tf.scatter_update to update the original embedding matrix?
(Also, TensorFlow raises an error if the second argument of compute_gradients is not a tf.Variable.)
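To make the goal concrete, here is a framework-free sketch of the update I am after, written in plain Python rather than TensorFlow. The names `sparse_row_update`, `row_grads`, and `lr` are hypothetical, and a plain gradient-descent step stands in for Adam. It only illustrates the intended behavior under those assumptions and is not a TensorFlow solution.

```python
def sparse_row_update(emb, indices, row_grads, lr=0.1):
    """Apply one gradient-descent step to only the selected rows of `emb`, in place.

    emb:       list of rows (each a list of floats), the full embedding matrix
    indices:   row numbers that were gathered for this batch
    row_grads: one gradient row per entry of `indices`, in the same order
    """
    for idx, grad in zip(indices, row_grads):
        row = emb[idx]
        for j, g in enumerate(grad):
            # A repeated index accumulates, because each occurrence is applied in turn.
            row[j] -= lr * g
    return emb


# Hypothetical 4 x 2 embedding matrix; only rows 0 and 2 should change.
emb = [[1.0, 1.0], [2.0, 2.0], [3.0, 3.0], [4.0, 4.0]]
indices = [0, 2]
row_grads = [[1.0, 0.0], [0.0, 2.0]]
sparse_row_update(emb, indices, row_grads, lr=0.5)
```

Rows 1 and 3 are left untouched here, which is the behavior I want for the embedding rows that are not in `indices`.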