I'm trying to add batch normalization to my CNN and have read plenty of posts on how to do it, but my implementation still produces an array of NaNs when I set training to False.
Even if I set training to True at test time, the results are not NaN, but if I test on the training images the results are worse than at training time.
I used a decay of 0.9 and trained for 15,000 iterations.
Here is how I build my graph, adding the update ops as a dependency as suggested in the tf.layers.batch_normalization documentation, and then running it with a session:
extra_update_ops = tf.get_collection(tf.GraphKeys.UPDATE_OPS)
with tf.control_dependencies(extra_update_ops):
    phase_train = tf.placeholder(tf.bool, name='phase_train')

    ###### Other placeholders and variables declarations ######

    # Build a Graph that computes the logits predictions from the inference model.
    loss, eval_prediction = inference(train_data_node, train_labels_node, batch_size, phase_train, dropout_out_keep_prob)

    # Build a Graph that trains the model with one batch of examples and updates the model parameters.
    ###### Should I rather put the dependency here ? ######
    train_op = train(loss, global_step)

    saver = tf.train.Saver(tf.global_variables())

    with tf.Session() as sess:
        init = tf.global_variables_initializer()
        sess.run(init)

        # Start the queue runners.
        coord = tf.train.Coordinator()
        threads = tf.train.start_queue_runners(sess=sess, coord=coord)

        for step in range(startstep, startstep + max_steps):
            feed_dict = fill_feed_dict(train_labels_node, train_data_node, dropout_out_keep_prob, phase_train, batch_size, phase_train_val=True, drop_out_keep_prob_val=1.)
            _, loss_value = sess.run([train_op, loss], feed_dict=feed_dict)
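For reference, the snippet from the tf.layers.batch_normalization documentation that I was trying to follow collects the update ops only after the whole graph (and hence every batch-norm layer) has been built, and wraps only the optimizer step; optimizer below is just a placeholder for whatever my train() function uses internally:

# Pattern from the tf.layers.batch_normalization docs (optimizer is a placeholder here):
# build the full inference graph first, so the batch-norm update ops already exist,
# then collect them and make the train op depend on them.
update_ops = tf.get_collection(tf.GraphKeys.UPDATE_OPS)
with tf.control_dependencies(update_ops):
    train_op = optimizer.minimize(loss, global_step=global_step)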
Here is my batch_norm function call:
def batch_norm_layer(inputT, is_training, scope):
    return tf.layers.batch_normalization(inputT, training=is_training, center=False, reuse=None, momentum=0.9)
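To show how I mean it to be used, a call inside inference() would look roughly like this (a hypothetical sketch, not my exact code; the layer sizes and names are illustrative, and is_training is fed from the phase_train placeholder):

# Hypothetical call site inside inference(); names and sizes are illustrative only.
conv = tf.layers.conv2d(inputs, filters=64, kernel_size=3, padding='same')
bn = batch_norm_layer(conv, is_training=phase_train, scope='conv1_bn')
act = tf.nn.relu(bn)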
And now here is how I restore the model for testing:
phase_train = tf.placeholder(tf.bool, name='phase_train')

###### Other placeholder definitions ######

loss, logits = inference(test_data_node, test_labels_node, batch_size, phase_train, drop_out_keep_prob)
pred = tf.argmax(logits, dimension=3)

saver = tf.train.Saver()

with tf.Session() as sess:
    saver.restore(sess, test_ckpt)
    threads = tf.train.start_queue_runners(sess=sess)

    feed_dict = fill_feed_dict(test_labels_node, test_data_node, drop_out_keep_prob, phase_train, batch_size=1, phase_train_val=False, drop_out_keep_prob_val=1.)
    pred_loss, dense_prediction, predicted_image = sess.run([loss, logits, pred], feed_dict=feed_dict)
Here dense_prediction comes out as an array of NaNs, and predicted_image is therefore all zeros. Is there an error in my construction? How can I fix it, or at least diagnose it?
Any help is welcome. I have read many tutorials that use a hand-rolled batch norm, but I can't find a good guide on how to use the official one, I guess because it is supposed to be obvious, but it isn't working for me!
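One check I thought of (a sketch only, meant to run right after saver.restore inside the test session; it just relies on tf.layers.batch_normalization naming its variables moving_mean / moving_variance) is to print the restored moving statistics and see whether they already contain NaNs:

import numpy as np

# Sketch: inspect the restored batch-norm moving statistics after saver.restore(sess, test_ckpt).
# Variable names like '.../batch_normalization/moving_mean:0' follow the layer's default naming.
moving_stats = [v for v in tf.global_variables()
                if 'moving_mean' in v.name or 'moving_variance' in v.name]
for var, value in zip(moving_stats, sess.run(moving_stats)):
    print(var.name, 'contains NaN:', np.any(np.isnan(value)))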