我在我的深度神经网络中成功使用了 tensorflow 批量归一化。我正在按照以下方式进行操作:
if apply_bn:
with tf.variable_scope('bn'):
beta = tf.Variable(tf.constant(0.0, shape=[out_size]), name='beta', trainable=True)
gamma = tf.Variable(tf.constant(1.0, shape=[out_size]), name='gamma', trainable=True)
batch_mean, batch_var = tf.nn.moments(z, [0], name='moments')
ema = tf.train.ExponentialMovingAverage(decay=0.5)
def mean_var_with_update():
ema_apply_op = ema.apply([batch_mean, batch_var])
with tf.control_dependencies([ema_apply_op]):
return tf.identity(batch_mean), tf.identity(batch_var)
mean, var = tf.cond(self.phase_train,
mean_var_with_update,
lambda: (ema.average(batch_mean), ema.average(batch_var)))
self.z_prebn.append(z)
z = tf.nn.batch_normalization(z, mean, var, beta, gamma, 1e-3)
self.z.append(z)
self.bn.append((mean, var, beta, gamma))
它适用于训练和测试阶段。但是,当我尝试在另一个项目中使用计算出的神经网络参数时遇到问题,我需要自己计算所有矩阵乘法和东西。问题是我无法重现该tf.nn.batch_normalization
函数的行为:
feed_dict = {
self.tf_x: np.array([range(self.x_cnt)]) / 100,
self.keep_prob: 1,
self.phase_train: False
}
for i in range(len(self.z)):
# print 0 layer's 1 value of arrays
print(self.sess.run([
self.z_prebn[i][0][1], # before bn
self.bn[i][0][1], # mean
self.bn[i][1][1], # var
self.bn[i][2][1], # offset
self.bn[i][3][1], # scale
self.z[i][0][1], # after bn
], feed_dict=feed_dict))
# prints
# [-0.077417567, -0.089603029, 0.000436493, -0.016652612, 1.0055743, 0.30664611]
根据页面https://www.tensorflow.org/versions/r1.2/api_docs/python/tf/nn/batch_normalization上的公式:
bn = scale * (x - mean) / (sqrt(var) + 1e-3) + offset
但正如我们所见,
1.0055743 * (-0.077417567 - -0.089603029)/(0.000436493^0.5 + 1e-3) + -0.016652612
= 0.543057
这与0.30664611
由 Tensorflow 本身计算的 value 不同。那么我在这里做错了什么,为什么我不能自己计算批量标准化值?
提前致谢!