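My neural network has to learn image similarity using a custom triplet loss. The positive image is similar to the anchor, while the negative image is not.
My task is to predict, for unseen triplets, whether the second or the third image is more similar to the anchor.
The triplets are given for both the training set and the test set, so I don't have to mine them or generate them randomly: they are fixed in my task.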
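The decision rule for a single triplet can be sketched in plain Python, assuming each image has already been mapped to an embedding vector. The helper names `squared_distance`, `second_is_closer`, and `triplet_accuracy` are hypothetical and not part of my code; the toy numbers are made up.

```python
def squared_distance(u, v):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def second_is_closer(anchor, second, third):
    """Return True if the second image's embedding is closer to the anchor."""
    return squared_distance(anchor, second) < squared_distance(anchor, third)


def triplet_accuracy(triplets):
    """Fraction of (anchor, positive, negative) triplets ranked correctly."""
    correct = sum(second_is_closer(a, p, n) for a, p, n in triplets)
    return correct / len(triplets)


# Toy embeddings (made-up numbers, for illustration only).
toy_triplets = [
    ([0.0, 1.0], [0.1, 0.9], [1.0, 0.0]),  # positive is closer: correct
    ([1.0, 0.0], [0.0, 1.0], [0.9, 0.1]),  # negative is closer: wrong
]
print(triplet_accuracy(toy_triplets))
```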
---> Idea: To improve my model, I tried feature learning with frozen Xception layers and a Dense layer added on top.
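Problem:
When I train the model below with the Xception layers frozen, after 1-2 epochs it learns to place every positive image at a very low distance from the anchor and every negative image at a very high distance. The result is 100% validation accuracy.
I immediately thought of overfitting, but I only have one fully connected layer to train. How can I combat this? Or is my triplet loss defined incorrectly somehow?
I don't use data augmentation, so could that help?
Somehow this only happens when I use a pretrained model. When I use a simple model, I get realistic accuracy...
What am I missing here?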
My triplet loss:
# Assumed import (TF 2.x with tf.keras); adjust if you use standalone Keras.
from tensorflow.keras import backend as K

def triplet_loss(y_true, y_pred, alpha=0.4):
    """
    Implementation of the triplet loss function

    Arguments:
    y_true -- true labels, required when you define a loss in Keras, not used in this function.
    y_pred -- tensor of shape (batch, 3 * d) holding three concatenated encodings:
        anchor -- the encodings for the anchor data
        positive -- the encodings for the positive data (similar to anchor)
        negative -- the encodings for the negative data (different from anchor)

    Returns:
    loss -- tensor of shape (batch,), the loss value for each triplet
    """
    # Split the concatenated vector back into its three equal parts.
    total_length = y_pred.shape.as_list()[-1]
    size = total_length // 3
    anchor = y_pred[:, 0:size]
    positive = y_pred[:, size:2 * size]
    negative = y_pred[:, 2 * size:3 * size]

    # squared distance between the anchor and the positive
    pos_dist = K.sum(K.square(anchor - positive), axis=1)

    # squared distance between the anchor and the negative
    neg_dist = K.sum(K.square(anchor - negative), axis=1)

    # compute loss: zero once the negative is at least alpha farther away
    basic_loss = pos_dist - neg_dist + alpha
    loss = K.maximum(basic_loss, 0.0)
    return loss
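To sanity-check the arithmetic, here is the same formula evaluated by hand on a single triplet in plain Python. The function name `triplet_loss_single` and the toy vectors are assumptions for illustration only. With L2-normalized embeddings the squared distance lies between 0 and 4, so with alpha = 0.4 the loss for one triplet lies between 0 and 4.4.

```python
def triplet_loss_single(anchor, positive, negative, alpha=0.4):
    """max(||a - p||^2 - ||a - n||^2 + alpha, 0) for one triplet."""
    pos_dist = sum((a - p) ** 2 for a, p in zip(anchor, positive))
    neg_dist = sum((a - n) ** 2 for a, n in zip(anchor, negative))
    return max(pos_dist - neg_dist + alpha, 0.0)


anchor = [1.0, 0.0]

# Easy triplet: positive equals the anchor, negative is orthogonal.
# pos_dist = 0, neg_dist = 2, so the margin is satisfied and the loss is 0.
easy = triplet_loss_single(anchor, [1.0, 0.0], [0.0, 1.0])

# Hard triplet: the roles are swapped.
# pos_dist = 2, neg_dist = 0, so the loss is 2 + 0.4 = 2.4.
hard = triplet_loss_single(anchor, [0.0, 1.0], [1.0, 0.0])

print(easy, hard)
```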
Then my model:
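# Assumed imports (TF 2.x with tf.keras); adjust if you use standalone Keras.
from tensorflow.keras import backend as K
from tensorflow.keras.applications import Xception
from tensorflow.keras.layers import Input, Flatten, Dense, Lambda, concatenate
from tensorflow.keras.models import Model
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.regularizers import l2

def baseline_model():
    # One input per image of the triplet: anchor, positive, negative.
    input_1 = Input(shape=(256, 256, 3))
    input_2 = Input(shape=(256, 256, 3))
    input_3 = Input(shape=(256, 256, 3))

    # Frozen, ImageNet-pretrained feature extractor shared by all three inputs.
    pretrained_model = Xception(include_top=False, weights="imagenet")
    for layer in pretrained_model.layers:
        layer.trainable = False

    x1 = pretrained_model(input_1)
    x2 = pretrained_model(input_2)
    x3 = pretrained_model(input_3)

    x1 = Flatten(name='flatten1')(x1)
    x2 = Flatten(name='flatten2')(x2)
    x3 = Flatten(name='flatten3')(x3)

    # Trainable projection to a 128-dimensional embedding.
    x1 = Dense(128, activation=None, kernel_regularizer=l2(0.01))(x1)
    x2 = Dense(128, activation=None, kernel_regularizer=l2(0.01))(x2)
    x3 = Dense(128, activation=None, kernel_regularizer=l2(0.01))(x3)

    # L2-normalize each embedding to unit length.
    x1 = Lambda(lambda x: K.l2_normalize(x, axis=-1))(x1)
    x2 = Lambda(lambda x: K.l2_normalize(x, axis=-1))(x2)
    x3 = Lambda(lambda x: K.l2_normalize(x, axis=-1))(x3)

    # Concatenate so the loss receives all three embeddings as one y_pred.
    concat_vector = concatenate([x1, x2, x3], axis=-1, name='concat')

    model = Model([input_1, input_2, input_3], concat_vector)
    # `accuracy` is my custom metric (not shown here).
    model.compile(loss=triplet_loss, optimizer=Adam(0.00001), metrics=[accuracy])
    model.summary()
    return model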
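One thing that may be worth ruling out: in the model above, each `Dense(128, ...)` call creates a new layer, so the anchor, positive, and negative branches have separate trainable weights. The plain-Python sketch below is an assumption about what could go wrong, not a confirmed diagnosis. It shows that heads which ignore their input and depend only on the branch position (the hypothetical `position_only_embed`) already reach zero loss and 100% accuracy, as long as the triplets always arrive in (anchor, positive, negative) order.

```python
def squared_distance(u, v):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def position_only_embed(image, position):
    """Hypothetical degenerate head: ignores the image, uses only the slot."""
    # Anchor and positive slots map to the same unit vector,
    # the negative slot maps to the opposite one.
    return [1.0, 0.0] if position in ("anchor", "positive") else [-1.0, 0.0]


def loss_and_hit(triplet, alpha=0.4):
    """Triplet loss and whether the positive is ranked closer than the negative."""
    a = position_only_embed(triplet[0], "anchor")
    p = position_only_embed(triplet[1], "positive")
    n = position_only_embed(triplet[2], "negative")
    pos_dist, neg_dist = squared_distance(a, p), squared_distance(a, n)
    return max(pos_dist - neg_dist + alpha, 0.0), pos_dist < neg_dist


# The image contents are irrelevant here, so placeholder strings are enough.
triplets = [("img_a", "img_b", "img_c"), ("img_d", "img_e", "img_f")]
results = [loss_and_hit(t) for t in triplets]
print(results)  # every triplet: loss 0.0 and a "correct" prediction
```

If this applies, creating a single `Dense` layer and calling that same instance on `x1`, `x2`, and `x3` would share the weights across branches. Evaluating with the second and third image swapped would also expose a position-only solution, since its accuracy would then drop instead of staying at 100%.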
Fitting my model:
model.fit(
    gen(X_train, batch_size=batch_size),
    steps_per_epoch=13281 // batch_size,
    epochs=10,
    validation_data=gen(X_val, batch_size=batch_size),
    validation_steps=1666 // batch_size,
    verbose=1,
    callbacks=callbacks_list
)
model.save_weights('try_6.h5')
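Since `//` rounds down, a few samples at the end of each pass do not count toward the epoch. A small check of that arithmetic, assuming a hypothetical `batch_size = 32` (the actual value is not shown here); `steps_and_leftover` is an illustrative helper, not part of my code.

```python
import math


def steps_and_leftover(num_samples, batch_size):
    """Full batches per epoch and the samples dropped by floor division."""
    return num_samples // batch_size, num_samples % batch_size


batch_size = 32  # assumption for illustration only

print(steps_and_leftover(13281, batch_size))  # (415, 1)
print(steps_and_leftover(1666, batch_size))   # (52, 2)

# Rounding up instead covers every sample, provided the generator also
# yields the final, smaller batch.
print(math.ceil(13281 / batch_size), math.ceil(1666 / batch_size))  # 416 53
```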