I built the following very simple model:
import tensorflow as tf

inp = tf.keras.layers.Input((32, 32, 3))
x = tf.keras.layers.Conv2D(filters=1, kernel_size=3, strides=2, padding='same')(inp)
x = tf.nn.relu(x)
x = tf.keras.layers.GlobalAveragePooling2D()(x)
x = tf.keras.layers.Dense(units=10, activation='linear')(x)
outp = x
model = tf.keras.models.Model(inp, outp)
I want to insert a dropout layer after the ReLU layer, so I followed the approach described in the answer to this post.
Here is my code:
import re
from keras.models import Model

def insert_layer_nonseq(model, layer_regex, insert_layer_factory,
                        insert_layer_name=None, position='after'):

    # Auxiliary dictionary to describe the network graph
    network_dict = {'input_layers_of': {}, 'new_output_tensor_of': {}}

    # Set the input layers of each layer
    for layer in model.layers:
        for node in layer._outbound_nodes:
            layer_name = node.outbound_layer.name
            if layer_name not in network_dict['input_layers_of']:
                network_dict['input_layers_of'].update(
                    {layer_name: [layer.name]})
            else:
                network_dict['input_layers_of'][layer_name].append(layer.name)

    # Set the output tensor of the input layer
    network_dict['new_output_tensor_of'].update(
        {model.layers[0].name: model.input})

    # Iterate over all layers after the input
    model_outputs = []
    count = 0
    for layer in model.layers[1:]:
        count += 1

        # Determine input tensors
        layer_input = [network_dict['new_output_tensor_of'][layer_aux]
                       for layer_aux in network_dict['input_layers_of'][layer.name]]
        if len(layer_input) == 1:
            layer_input = layer_input[0]

        # Insert layer if name matches the regular expression
        if re.match(layer_regex, layer.name):
            if position == 'replace':
                x = layer_input
            elif position == 'after':
                x = layer(layer_input)
            elif position == 'before':
                pass
            else:
                raise ValueError('position must be: before, after or replace')

            new_layer = insert_layer_factory()
            x = new_layer(x)
            print('New layer: {} Old layer: {} Type: {}'.format(new_layer.name,
                                                            layer.name, position))
            if position == 'before':
                x = layer(x)
        else:
            x = layer(layer_input)

        # Set new output tensor (the original one, or the one of the inserted
        # layer)
        network_dict['new_output_tensor_of'].update({layer.name: x})

        # Save tensor in output list if it is output in initial model
        if layer_name in model.output_names:
            model_outputs.append(x)

    return Model(inputs=model.inputs, outputs=model_outputs)

clone_model = tf.keras.models.clone_model(model)

def dropout_layer_factory():
    return tf.keras.layers.Dropout(rate=0.2, name='dropout')

nm = insert_layer_nonseq(clone_model, '.*relu.*', dropout_layer_factory)

# Fix possible problems with new model
nm.save('temp.h5')
nm = tf.keras.models.load_model('temp.h5')
Here is the summary of the resulting model (nm):
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
input_19 (InputLayer)        [(None, 32, 32, 3)]       0
_________________________________________________________________
conv2d_25 (Conv2D)           (None, 16, 16, 1)         28
_________________________________________________________________
tf.nn.relu_25 (TFOpLambda)   (None, 16, 16, 1)         0
_________________________________________________________________
dropout (Dropout)            (None, 16, 16, 1)         0
_________________________________________________________________
global_average_pooling2d_25  (None, 1)                 0
_________________________________________________________________
dense_25 (Dense)             (None, 10)                20
=================================================================
Total params: 48
Trainable params: 48
Non-trainable params: 0
_________________________________________________________________
Everything looks fine to me. However, when I try to train the model, I get the following error:
InvalidArgumentError: logits and labels must have the same first dimension, got logits shape [8192,1] and labels shape [32]
[[node sparse_categorical_crossentropy/SparseSoftmaxCrossEntropyWithLogits/SparseSoftmaxCrossEntropyWithLogits (defined at <ipython-input-123-9e6a1b98c0a1>:7) ]] [Op:__inference_train_function_111449]
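As a quick sanity check on the shapes in this error, here is a small arithmetic sketch. It assumes a batch size of 32 (suggested by the labels shape [32]) and assumes the loss collapses every axis except the last one before comparing logits with labels. The helper name `flattened_logits_shape` is made up for illustration. Under those assumptions, [8192, 1] is what a tensor shaped like the Conv2D/ReLU/Dropout output would produce, not the Dense output.

```python
from math import prod

batch_size = 32  # assumption: taken from the labels shape [32] in the error

# Per-batch shapes taken from the model summary, with None replaced by batch_size.
conv_like_output_shape = (batch_size, 16, 16, 1)  # Conv2D / ReLU / Dropout output
dense_output_shape = (batch_size, 10)             # final Dense output


def flattened_logits_shape(shape):
    """Hypothetical helper: collapse all axes except the last into one."""
    return (prod(shape[:-1]), shape[-1])


print(flattened_logits_shape(conv_like_output_shape))
print(flattened_logits_shape(dense_output_shape))
```

If this reading is right, the loss may be getting applied to a 4-D intermediate tensor in addition to (or instead of) the final (32, 10) logits.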
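Unlike most logits/labels errors, the loss function is not the problem here. When I train the original model with exactly the same code, it works perfectly. Somehow, inserting the dropout layer introduced a bug that prevents the new model from training.
Does anyone know why this happens? Thanks!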
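One thing that may be worth checking in `insert_layer_nonseq`: the final output test reads `layer_name`, a variable last assigned in the earlier loop over `_outbound_nodes`, rather than `layer.name`. The pure-Python sketch below uses hypothetical layer names and no Keras at all. It only illustrates how a variable left over from a finished loop can make such a test pass on every iteration, which would collect every intermediate tensor as a model output.

```python
# Hypothetical stand-ins for model.layers and model.output_names; no Keras involved.
layer_names = ['input', 'conv', 'relu', 'pool', 'dense']
output_names = ['dense']

# First loop: mimics the graph-building loop, where layer_name is assigned the
# name of each layer's successor. Python keeps the last value after the loop ends.
for current, successor in zip(layer_names, layer_names[1:]):
    layer_name = successor


def collect_outputs(use_stale_variable):
    """Return the layers whose tensors would be appended to model_outputs."""
    collected = []
    for name in layer_names[1:]:
        # The stale variable never changes inside this loop; `name` does.
        key = layer_name if use_stale_variable else name
        if key in output_names:
            collected.append(name)
    return collected


print(collect_outputs(use_stale_variable=True))
print(collect_outputs(use_stale_variable=False))
```

If the same thing happens in the real function, `len(nm.outputs)` would be greater than 1 even though the summary looks correct, and comparing against `layer.name` instead would be the change to try.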