I created the following siamese neural network with categorical labels:
EXAMPLES=10000
FEATURES=30
LEFT=np.random.random((EXAMPLES,FEATURES))
RIGHT=np.random.random((EXAMPLES,FEATURES))
LABELS=[]
for i in range(EXAMPLES):
    LABELS.append(np.random.randint(0, 2))
LABELS=np.asarray(LABELS)
LABELSOFTMAX=to_categorical(LABELS)
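As an aside, the label-building loop above can be done in one vectorized call; a sketch, using `np.eye(2)[LABELS]` as a NumPy stand-in for `to_categorical`:

```python
import numpy as np

EXAMPLES = 10000

# Vectorized equivalent of the append loop: one randint call.
LABELS = np.random.randint(0, 2, size=EXAMPLES)

# One-hot encoding; np.eye(2)[LABELS] stands in for to_categorical(LABELS).
LABELSOFTMAX = np.eye(2)[LABELS]

print(LABELS.shape)        # (10000,)
print(LABELSOFTMAX.shape)  # (10000, 2)
```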
def cosine_distance(vecs):
    # I'm not sure about this function either
    y_true, y_pred = vecs
    y_true = K.l2_normalize(y_true, axis=-1)
    y_pred = K.l2_normalize(y_pred, axis=-1)
    return K.mean(1 - K.sum((y_true * y_pred), axis=-1))
def cosine_dist_output_shape(shapes):
    shape1, shape2 = shapes
    print((shape1[0], 1))
    return (shape1[0], 1)
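For what it's worth, the shape behaviour of `cosine_distance` can be checked in plain NumPy: the `K.mean(...)` averages over the batch, so the function returns a single scalar rather than the `(batch_size, 1)` tensor that `cosine_dist_output_shape` declares. A small sketch with a hypothetical toy batch:

```python
import numpy as np

# Hypothetical toy batch: 4 pairs of 10-dimensional vectors.
a = np.random.random((4, 10))
b = np.random.random((4, 10))

# L2-normalize each row, mirroring K.l2_normalize(..., axis=-1).
a_n = a / np.linalg.norm(a, axis=-1, keepdims=True)
b_n = b / np.linalg.norm(b, axis=-1, keepdims=True)

# One cosine distance per pair: shape (4, 1).
per_pair = 1 - np.sum(a_n * b_n, axis=-1, keepdims=True)

# What K.mean(...) in cosine_distance computes: one scalar
# averaged over the whole batch, shape ().
batch_mean = np.mean(1 - np.sum(a_n * b_n, axis=-1))

print(per_pair.shape)        # (4, 1)
print(np.shape(batch_mean))  # ()
```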
inputShape=Input(shape=(FEATURES,))
left_input = Input(shape=(FEATURES,))
right_input = Input(shape=(FEATURES,))
First implementation
model = Sequential()
model.add(Dense(20, activation='relu', input_shape=(30,)))
model.add(BatchNormalization())
model.add(Dense(10, activation='relu'))
encoded_l = model(left_input)
encoded_r = model(right_input)
L1_Distance = Lambda(cosine_distance, output_shape=cosine_dist_output_shape)([encoded_l, encoded_r])
siamese_net = Model([left_input, right_input], L1_Distance)
siamese_net.summary()
siamese_net.compile(loss="binary_crossentropy",optimizer=Adam(lr=0.0001))
siamese_net.fit(x=[LEFT,RIGHT],y=LABELS,batch_size=64,epochs=100)
First model summary
(None, 1)
Model: "model_28"
__________________________________________________________________________________________________
Layer (type) Output Shape Param # Connected to
==================================================================================================
input_132 (InputLayer) (None, 30) 0
__________________________________________________________________________________________________
input_133 (InputLayer) (None, 30) 0
__________________________________________________________________________________________________
sequential_44 (Sequential) (None, 10) 910 input_132[0][0]
input_133[0][0]
__________________________________________________________________________________________________
lambda_37 (Lambda) (None, 1) 0 sequential_44[1][0]
sequential_44[2][0]
==================================================================================================
Total params: 910
Trainable params: 870
Non-trainable params: 40
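As a sanity check, the parameter counts in this summary can be reproduced by hand (Dense: in × out weights plus out biases; BatchNormalization: 4 parameters per feature, of which only gamma and beta are trainable):

```python
dense1 = 30 * 20 + 20   # Dense(20) on 30 features: 620 params
bn = 4 * 20             # BatchNormalization(20): gamma, beta, moving mean/var = 80
dense2 = 20 * 10 + 10   # Dense(10) on 20 features: 210 params

total = dense1 + bn + dense2
trainable = dense1 + 2 * 20 + dense2   # moving statistics are non-trainable

print(total, trainable, total - trainable)  # 910 870 40
```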
Second implementation
model = Sequential()
model.add(Dense(20, activation='relu', input_shape=(30,)))
model.add(BatchNormalization())
model.add(Dense(10, activation='relu'))
#model.add(Dense(30, activation='relu'))
encoded_l = model(left_input)
encoded_r = model(right_input)
L1_Layer = Lambda(cosine_distance, output_shape=cosine_dist_output_shape)
L1_Distance = L1_Layer([encoded_l, encoded_r])
prediction = Dense(2, activation='softmax')(L1_Distance)
siamese_net = Model([left_input, right_input], prediction)
siamese_net.compile(loss="binary_crossentropy",optimizer=Adam(lr=0.001))
siamese_net.summary()
Second model summary
Model: "model_29"
__________________________________________________________________________________________________
Layer (type) Output Shape Param # Connected to
==================================================================================================
input_135 (InputLayer) (None, 30) 0
__________________________________________________________________________________________________
input_136 (InputLayer) (None, 30) 0
__________________________________________________________________________________________________
sequential_45 (Sequential) (None, 10) 910 input_135[0][0]
input_136[0][0]
__________________________________________________________________________________________________
lambda_19 (Lambda) multiple 0 sequential_45[1][0]
sequential_45[2][0]
__________________________________________________________________________________________________
dense_140 (Dense) (None, 2) 22 lambda_19[10][0]
==================================================================================================
Total params: 932
Trainable params: 892
Non-trainable params: 40
So my question is: which of the two is the correct and better implementation? Both run fine, but is there some subtle problem with either of them? I believe the cosine-similarity layer should yield only a scalar tensor, which is what confuses me here.