I am using doc2vec feature vectors to train an MLP, but the validation accuracy is 0 in every epoch. The MLP model is:
def MySimpleMLP(X_train=None):
    lengths = sorted([len(X) for X in X_train])
    percentile = 0.90
    seq_cutoff = lengths[int(len(lengths) * percentile)]
    vocab = 50
    EMBEDDING_DIM = 50
    N = 256
    size = 3
    seq_indices = Input(shape=(seq_cutoff,), name='seq_input')
    seq_embedded = Embedding(input_dim=vocab + 1, output_dim=EMBEDDING_DIM,
                             input_length=seq_cutoff)(seq_indices)
    seq_conv = Conv1D(N, size, activation='relu')(Dropout(0.2)(seq_embedded))
    max_conv = GlobalMaxPooling1D()(seq_conv)
    hidden_repr = Dense(N, activation='relu')(max_conv)
    sentiment = Dense(1, activation='sigmoid')(Dropout(0.2)(hidden_repr))
    model = Model(inputs=[seq_indices], outputs=[sentiment])
    model.summary()
    model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
    return model
The training data is:
Doc2VecArrayFilePath = "../data/mpk/tools/Doc2VecArray.pkl"
with open(Doc2VecArrayFilePath, "rb+") as f:
    X_train, Y_train = pickle.load(f)
X_train has shape (25000, 50):
[[ 1.54979062 0.99996233 0.10931063 ... -1.12303877 -1.30322146
-0.57274193]
[ 0.90919989 -1.39264524 -1.69380188 ... -0.35270166 1.00891471
1.25304925]
[-0.66494519 0.76236057 1.37783039 ... 0.69574219 1.99134898
-0.38097638]
...
[ 1.08792138 0.00841406 -0.27354664 ... -0.18176237 0.76443428
0.67993295]
[ 0.78027207 -0.80181849 -1.21321726 ... -0.14031847 0.55475223
-0.01875231]
[ 0.59591568 -0.57823026 -0.91873246 ... -0.22376266 1.16658998
-0.02456926]]
25000
50
The MLP training output is:
Epoch 2/10
- 2s - loss: 0.6586 - accuracy: 0.6250 - val_loss: 0.9066 - val_accuracy: 0.0000e+00
Epoch 3/10
- 2s - loss: 0.6588 - accuracy: 0.6250 - val_loss: 0.8222 - val_accuracy: 0.0000e+00
Epoch 4/10
- 2s - loss: 0.6582 - accuracy: 0.6250 - val_loss: 0.9356 - val_accuracy: 0.0000e+00
Epoch 5/10
- 2s - loss: 0.6563 - accuracy: 0.6250 - val_loss: 0.8692 - val_accuracy: 0.0000e+00
The rest of the code is:
mlp = MySimpleMLP(X_train=X_train)
mlp.fit(np.array(X_train, dtype='int32'), Y_train, validation_split=0.2, epochs=10, batch_size=64, verbose=2)
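A note on the `fit` call above: `np.array(X_train, dtype='int32')` casts the float doc2vec values to integers, which truncates them toward zero. The small check below is only a sketch; `sample` is a hypothetical name holding a few values copied from the array printed earlier in this post, not the real 25000 x 50 data.

```python
import numpy as np

# A few values copied from the first and last rows of the printed X_train.
# "sample" is a hypothetical stand-in, not the full doc2vec array.
sample = np.array([[1.54979062, 0.99996233, 0.10931063, -1.12303877],
                   [0.59591568, -0.57823026, -0.91873246, -0.02456926]])

# Same cast as in the fit call: floats are truncated toward zero.
as_int = np.array(sample, dtype="int32")

print(as_int)
# Almost every value collapses to -1, 0, or 1, so most of the
# information in the vectors is lost before the model sees it.
print("min index:", as_int.min(), "max index:", as_int.max())
```

If this reading is right, the `Embedding` layer receives a handful of small integers (some of them negative, which are not valid indices) instead of the doc2vec features.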
How should I modify the MLP model so that it correctly accepts doc2vec input? Any help would be appreciated.
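One possible direction, sketched under assumptions rather than tested on the real data: a doc2vec vector is already a dense 50-dimensional feature, while `Embedding` looks up integer token indices, so the vectors could go straight into `Dense` layers as floats. The sketch also shuffles before `fit`, because Keras takes the `validation_split` fraction from the end of the arrays before any shuffling; if the pickle is ordered by label (an assumption, the post does not say), the validation set would contain a single class. `build_dense_mlp`, `X_demo`, and `y_demo` are hypothetical names, the data is synthetic, and the imports assume the Keras bundled with TensorFlow 2.

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers


def build_dense_mlp(input_dim=50, hidden_units=256, dropout_rate=0.2):
    """Plain MLP that takes dense float vectors, with no Embedding or Conv1D."""
    inputs = keras.Input(shape=(input_dim,), name="doc2vec_input")
    x = layers.Dense(hidden_units, activation="relu")(inputs)
    x = layers.Dropout(dropout_rate)(x)
    outputs = layers.Dense(1, activation="sigmoid")(x)
    model = keras.Model(inputs=inputs, outputs=outputs)
    model.compile(optimizer="adam", loss="binary_crossentropy",
                  metrics=["accuracy"])
    return model


# Synthetic stand-in for (X_train, Y_train): 200 vectors of size 50,
# with labels deliberately sorted to mimic a label-ordered file.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(200, 50)).astype("float32")
y_demo = np.concatenate([np.zeros(100), np.ones(100)]).astype("float32")

# Shuffle features and labels together so the last 20% used for
# validation contains both classes.
perm = rng.permutation(len(X_demo))
X_demo, y_demo = X_demo[perm], y_demo[perm]

model = build_dense_mlp(input_dim=X_demo.shape[1])
history = model.fit(X_demo, y_demo, validation_split=0.2,
                    epochs=2, batch_size=64, verbose=0)
```

With the real data, the same idea would mean passing `np.asarray(X_train, dtype='float32')` instead of the `int32` cast, after shuffling `X_train` and `Y_train` with one shared permutation.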