I want to know whether tf.stop_gradient only stops the gradient computation for a given op, or whether it also stops the update of the tf.Variable that feeds it. My problem is the following: during the forward pass on MNIST, I want to apply a chain of operations to the weights (say W to W*) and then matmul the result with the input. However, I want to exclude these operations from the backward pass; during backpropagation I only want dE/dW to be computed. The code I wrote instead prevents W from being updated at all. Can you help me understand why? If these were variables, I know I could set their trainable attribute to False, but these are operations applied to the weights. If stop_gradient cannot be used for this, how do I build two graphs, one for the forward path and another for backpropagation?
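To make the behavior I am asking about concrete, here is a minimal sketch (toy values, names are mine): when tf.stop_gradient sits on the only path from the loss back to a variable, tf.gradients returns None for that variable, so the optimizer has no update to apply.

import tensorflow as tf

w = tf.Variable(2.0, name='w')
y_blocked = tf.stop_gradient(3.0 * w)  # gradient is cut at this node
y_open = 3.0 * w                       # gradient flows normally

loss_blocked = tf.square(y_blocked)
loss_open = tf.square(y_open)

print(tf.gradients(loss_blocked, w))  # [None] -> the optimizer never updates w
print(tf.gradients(loss_open, w))     # [<tf.Tensor ...>] -> w is updated

Here is my layer-building code: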
import math
import tensorflow as tf

def build_layer(inputs, fmap, nscope, layer_size1, layer_size2, faulty_training):
    with tf.name_scope(nscope):
        if faulty_training:
            ## Trainable weight.
            weights_i = tf.Variable(
                tf.truncated_normal([layer_size1, layer_size2],
                                    stddev=1.0 / math.sqrt(float(layer_size1))),
                name='weights_i')
            ## Operations on the weight whose gradients should not be computed
            ## during backpropagation: scale to fixed point, flip bits with fmap,
            ## then scale back.
            weights_fx_t = tf.stop_gradient(tf.multiply(268435456.0, weights_i))
            weights_fx = tf.stop_gradient(tf.cast(weights_fx_t, tf.int32))
            weights_fx_fault = tf.stop_gradient(tf.bitwise.bitwise_xor(weights_fx, fmap))
            weights_fl = tf.stop_gradient(tf.cast(weights_fx_fault, tf.float32))
            weights = tf.stop_gradient(tf.multiply(1.0 / 268435456.0, weights_fl))
            ##### end transformation
        else:
            weights = tf.Variable(
                tf.truncated_normal([layer_size1, layer_size2],
                                    stddev=1.0 / math.sqrt(float(layer_size1))),
                name='weights')
        biases = tf.Variable(tf.zeros([layer_size2]), name='biases')
        hidden = tf.nn.relu(tf.matmul(inputs, weights) + biases)
    return weights, hidden
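If tf.stop_gradient cannot give me this split directly, I have seen the "straight-through" pattern suggested for exactly this kind of forward/backward mismatch: use the transformed value in the forward pass, but let the gradient pass through as if the transformation were the identity. A sketch of what I mean (the helper name is mine; transform stands for the fault-injection chain above):

def straight_through(weights_i, transform):
    # Forward value: transform(weights_i). Backward: identity, i.e. dE/dW.
    return weights_i + tf.stop_gradient(transform(weights_i) - weights_i)

Is this the right way to get dE/dW while still using the transformed weights in the forward pass?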
I am training with TensorFlow's gradient descent optimizer:
optimizer = tf.train.GradientDescentOptimizer(learning_rate)
global_step = tf.Variable(0, name='global_step', trainable=False)
train_op = optimizer.minimize(loss, global_step=global_step)
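For what it is worth, inspecting the gradients seems to confirm the problem (a sketch using the same loss and the weights built above): optimizer.compute_gradients reports None for weights_i, and apply_gradients skips variables whose gradient is None, so the weight never moves.

grads_and_vars = optimizer.compute_gradients(loss)
for grad, var in grads_and_vars:
    print(var.op.name, 'None' if grad is None else 'has gradient')
# With the stop_gradient chain above, weights_i prints 'None', so
# minimize() builds no update op for it and W stays at its initial value.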