I have been using ELMo to embed 250 sentences from the IMDB dataset. After applying ELMo I get an array of shape (250, 1024). The code that generates the ELMo matrix is:
import tensorflow as tf

def getElmo(elmo, x):
    # Run the TF-Hub ELMo module on a batch of sentences and mean-pool
    # over the time axis, giving one 1024-d vector per sentence.
    embed = elmo(x, signature="default", as_dict=True)["elmo"]
    with tf.compat.v1.Session() as ses:
        ses.run(tf.compat.v1.global_variables_initializer())
        ses.run(tf.compat.v1.tables_initializer())
        return ses.run(tf.reduce_mean(embed, 1))

# Split the sentences into batches of 10.
listTrain = [X_train[i:i+10] for i in range(0, X_train.shape[0], 10)]
listTrainT = [X_trainT[i:i+10] for i in range(0, X_trainT.shape[0], 10)]
elmoTrain = [getElmo(elmo, i) for i in listTrain]
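Each element of elmoTrain is a (10, 1024) array, so I stack the batches into one matrix (a sketch of what I do; embedM is the matrix used further below):

import numpy as np

# Each getElmo call returns one mean-pooled 1024-d vector per sentence,
# so stacking the 25 batches gives the full (250, 1024) matrix.
embedM = np.vstack(elmoTrain)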
I saved the matrix to a pickle file, and when I load it back it contains the following data and shape:
[[ 0.01701645 -0.08691402 0.01426436 ... 0.01268931 0.09302917
0.00627212]
[-0.01594155 -0.11892342 0.03353928 ... 0.00920583 0.07130951
-0.00491649]
[ 0.02554347 -0.04609346 0.03476872 ... -0.019207 0.13730985
0.01841145]
...
[ 0.23489136 -0.24797124 0.03176903 ... -0.16736303 0.46603483
0.07271077]
[-0.03589417 -0.10203484 -0.07184037 ... -0.05782426 0.21744442
0.0481869 ]
[ 0.06803051 -0.11667343 0.00658324 ... -0.07491366 0.12236159
0.00994192]]
(250, 1024)
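For reference, the pickle round trip looks roughly like this (a sketch; the file name elmo_train.pkl is a placeholder):

import pickle

# Sketch of the save/load round trip; "elmo_train.pkl" is a placeholder name.
with open("elmo_train.pkl", "wb") as f:
    pickle.dump(embedM, f)

with open("elmo_train.pkl", "rb") as f:
    embedM = pickle.load(f)

print(embedM)
print(embedM.shape)  # (250, 1024)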
I used 250 sentences. My LSTM model is as follows:
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Embedding, LSTM, Dense

model = Sequential()
maxlen = 1024
# vocab_size equals 250
embedding_layer = Embedding(vocab_size, 1024, weights=[embedM],
                            input_length=maxlen, trainable=False)
model.add(embedding_layer)
model.add(LSTM(128))
model.add(Dense(1, activation="sigmoid"))
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["acc"])
print(model.summary())
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
embedding_32 (Embedding)     (None, 1024, 1024)        256000
_________________________________________________________________
lstm_17 (LSTM)               (None, 128)               590336
_________________________________________________________________
dense_17 (Dense)             (None, 1)                 129
=================================================================
Total params: 846,465
Trainable params: 590,465
Non-trainable params: 256,000
_________________________________________________________________
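The error is raised when I fit the model on the embedding matrix (a sketch of the call; y_train is assumed to be my vector of 250 binary labels):

# Sketch of the training call that triggers the error below;
# y_train is assumed to be the (250,) array of 0/1 sentiment labels.
model.fit(embedM, y_train, epochs=5, batch_size=128)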
The error I get is:
InvalidArgumentError: indices[117,181] = -1 is not in [0, 250)
[[{{node embedding_32/embedding_lookup}}]]
Why am I getting this error?