Hi, I am trying to fine-tune an Inception network with a custom loss function, namely a triplet loss.
The function comes from facenet.py:
import tensorflow as tf

def triplet_loss(value, alpha):
    """Calculate the triplet loss according to the FaceNet paper.

    Args:
      value: the embeddings for the anchor, positive, and negative images,
        stacked along the batch axis.
      alpha: the margin between the positive and negative distances.

    Returns:
      The triplet loss according to the FaceNet paper as a float tensor.
    """
    # tf.split requires that the batch is evenly divisible into three parts
    anchor, positive, negative = tf.split(value, num_or_size_splits=3, axis=0)
    with tf.variable_scope('triplet_loss'):
        pos_dist = tf.reduce_sum(tf.square(tf.subtract(anchor, positive)), 1)
        neg_dist = tf.reduce_sum(tf.square(tf.subtract(anchor, negative)), 1)
        basic_loss = tf.add(tf.subtract(pos_dist, neg_dist), alpha)
        loss = tf.reduce_mean(tf.maximum(basic_loss, 0.0), 0)
        # TODO: added by me
        tf.add_to_collection('losses', loss)
    return loss
Note: the value argument is the output of the logits layer, i.e. the layer just before the softmax.
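For context, this is roughly how I wire the loss into the graph (a minimal sketch; prelogits, the embedding size, and the optimizer setup are my own names and choices, not from facenet.py):

# "prelogits" stands in for the pre-softmax output of the Inception network;
# here it is just a placeholder so the snippet stands on its own.
prelogits = tf.placeholder(tf.float32, shape=[None, 128], name='prelogits')

# The batch is stacked as [anchors; positives; negatives] along axis 0,
# so the batch size must be divisible by 3.
loss = triplet_loss(prelogits, alpha=0.2)

optimizer = tf.train.AdamOptimizer(learning_rate=1e-4)
train_op = optimizer.minimize(loss, var_list=tf.trainable_variables())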
When I compute the gradients, I find that BatchNorm/moving_mean and BatchNorm/moving_variance receive no gradient. Why do they come back with a gradient value of None?
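This is roughly how I check for the missing gradients (a sketch; passing tf.global_variables() is my own choice of variable list):

# List every variable whose gradient w.r.t. the loss is None
all_vars = tf.global_variables()
grads = tf.gradients(loss, all_vars)
for var, grad in zip(all_vars, grads):
    if grad is None:
        print('no gradient for', var.op.name)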
In the graph visualization I also see that there is no data flow from the loss into the BatchNorm scope. Why do the weights have data flow coming from the loss node while BatchNorm does not?
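By visualization I mean the TensorBoard graph, which I write out roughly like this (a sketch; the log directory is arbitrary):

# Dump the graph definition so the data-flow edges can be inspected in TensorBoard
writer = tf.summary.FileWriter('/tmp/triplet_logs', tf.get_default_graph())
writer.flush()
writer.close()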