python - How to build a character-level Siamese network with Keras

Question

I am trying to build a character-level Siamese neural network with Keras to learn whether two names are similar.

My two inputs, X1 and X2, are each a 3-D array:
X[number_of_cases, max_length_of_name, total_number_of_chars_in_DB]

In my actual case:

number_of_cases = 5000
max_length_of_name = 50
total_number_of_chars_in_DB = 38

I also have a binary output vector y of shape [number_of_cases].

For example, print(X1[:3, :2]) gives the following output:

[[[0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
   0. 0. 0. 0. 0. 0. 0. 1. 0. 0. 0. 0. 0. 0. 0.]
  [0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
   0. 0. 0. 1. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]]

 [[0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
   0. 0. 0. 0. 0. 0. 0. 1. 0. 0. 0. 0. 0. 0. 0.]
  [0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
   0. 0. 0. 1. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]]

 [[0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 1. 0. 0. 0.
   0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
  [0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 1. 0. 0. 0. 0. 0. 0.
   0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]]]
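
As a minimal sketch of how an array with this layout could be produced, assume each name is one-hot encoded character by character and zero-padded to a fixed length. The helper `one_hot_names` and the 38-symbol `ALPHABET` are hypothetical and only illustrate the shape; the real character set in the database is not given in the post.

```python
import numpy as np

def one_hot_names(names, alphabet, max_length):
    """Encode names as an array of shape (len(names), max_length, len(alphabet))."""
    char_to_index = {ch: i for i, ch in enumerate(alphabet)}
    X = np.zeros((len(names), max_length, len(alphabet)), dtype=np.float32)
    for row, name in enumerate(names):
        # Names longer than max_length are truncated; shorter ones stay zero-padded.
        for pos, ch in enumerate(name[:max_length]):
            idx = char_to_index.get(ch)
            if idx is not None:  # characters outside the alphabet remain all-zero
                X[row, pos, idx] = 1.0
    return X

# Hypothetical alphabet: 26 letters, 10 digits, space and hyphen (38 symbols).
ALPHABET = "abcdefghijklmnopqrstuvwxyz0123456789 -"
X1 = one_hot_names(["anna", "john smith"], ALPHABET, max_length=50)
print(X1.shape)  # (2, 50, 38)
```

Each time step then holds at most a single 1, which matches the rows printed above.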

I use the following code to build my model:

from keras.models import Model
from keras.layers import Input, Dense, LSTM, Bidirectional, Lambda
from keras import backend as k

input_1 = Input(shape=(X1.shape[1], X1.shape[2],))
input_2 = Input(shape=(X2.shape[1], X2.shape[2],))

lstm1 = Bidirectional(LSTM(256, input_shape=(X1.shape[1], X1.shape[2],), return_sequences=False))
lstm2 = Bidirectional(LSTM(256, input_shape=(X1.shape[1], X1.shape[2],), return_sequences=False))

l1_norm = lambda x: 1 - k.abs(x[0] - x[1])

merged = Lambda(function=l1_norm, output_shape=lambda x: x[0], name='L1_distance')([lstm1, lstm2])

predictions = Dense(1, activation = 'sigmoid', name='classification_layer')(merged)

model = Model([input_1, input_2], predictions)
model.compile(loss = 'binary_crossentropy', optimizer="adam", metrics=["accuracy"])

model.fit([X1, X2], validation_split=0.1, epochs = 20,shuffle=True, batch_size = 256)

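I get the following error:

Layer L1_distance was called with an input that isn't a symbolic tensor.

I think the problem is that I need to tell the L1_distance layer to use the outputs of the two preceding LSTM layers, but I don't know how to do that.

My second question: even in a character-level network, do I have to add an embedding layer before the LSTM?

Thank you.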

score 1 · Accepted Answer

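Your model's inputs are [input_1, input_2] and its output is predictions. However, input_1 and input_2 are never connected to lstm1 and lstm2, so the model's input layers are not connected to its output layer. In your code, lstm1 and lstm2 are layer objects rather than tensors, so the Lambda layer receives layers instead of symbolic tensors, which is why you get the error.

Try this: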

from keras.models import Model
from keras.layers import Input, Dense, LSTM, Bidirectional, Lambda
from keras import backend as k

# One input per name, each of shape (max_length_of_name, total_number_of_chars_in_DB)
input_1 = Input(shape=(X1.shape[1], X1.shape[2],))
input_2 = Input(shape=(X2.shape[1], X2.shape[2],))

# Call each layer on its input so that lstm1 and lstm2 are tensors, not layer objects
lstm1 = Bidirectional(LSTM(256, return_sequences=False))(input_1)
lstm2 = Bidirectional(LSTM(256, return_sequences=False))(input_2)

# Element-wise similarity between the two encodings
l1_norm = lambda x: 1 - k.abs(x[0] - x[1])

merged = Lambda(function=l1_norm, output_shape=lambda x: x[0], name='L1_distance')([lstm1, lstm2])

predictions = Dense(1, activation='sigmoid', name='classification_layer')(merged)

model = Model([input_1, input_2], predictions)
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])

# y is the binary label vector from the question; fit needs it as the target
model.fit([X1, X2], y, validation_split=0.1, epochs=20, shuffle=True, batch_size=256)
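
Note that the code above creates two separate Bidirectional LSTM layers, so each branch learns its own weights. A Siamese network usually shares one encoder between both inputs, which in Keras can be done by calling the same layer instance twice. The following is a minimal sketch of that variant, not part of the original answer. It assumes TensorFlow's bundled Keras (`tensorflow.keras`) and uses `tf.abs` instead of the Keras backend. The function name `build_siamese`, the layer name `shared_encoder`, and the small `units` value are illustrative choices.

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras import Model
from tensorflow.keras.layers import Input, LSTM, Bidirectional, Dense, Lambda, Subtract

MAX_LEN, N_CHARS = 50, 38  # shapes taken from the question

def build_siamese(units=32):
    input_1 = Input(shape=(MAX_LEN, N_CHARS))
    input_2 = Input(shape=(MAX_LEN, N_CHARS))

    # One encoder instance called twice: both branches use the same weights.
    encoder = Bidirectional(LSTM(units, return_sequences=False), name="shared_encoder")
    encoded_1 = encoder(input_1)
    encoded_2 = encoder(input_2)

    # Element-wise similarity 1 - |a - b|, the same formula as l1_norm above.
    diff = Subtract()([encoded_1, encoded_2])
    merged = Lambda(lambda d: 1.0 - tf.abs(d), name="L1_distance")(diff)

    predictions = Dense(1, activation="sigmoid", name="classification_layer")(merged)
    model = Model([input_1, input_2], predictions)
    model.compile(loss="binary_crossentropy", optimizer="adam", metrics=["accuracy"])
    return model

model = build_siamese()
```

Because the encoder is shared and the absolute difference is symmetric, swapping the two names should give the same prediction. Training would then proceed as before, for example with model.fit([X1, X2], y, ...).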
