
I built the following very simple model:

import tensorflow as tf

inp = tf.keras.layers.Input((32, 32, 3))
x = tf.keras.layers.Conv2D(filters=1, kernel_size=3, strides=2, padding='same')(inp)
x = tf.nn.relu(x)
x = tf.keras.layers.GlobalAveragePooling2D()(x)
x = tf.keras.layers.Dense(units=10, activation='linear')(x)
outp = x
model = tf.keras.models.Model(inp, outp)

I want to insert a dropout layer after the ReLU layer, so I followed the approach described in the answer to this post.

Here is my code:

import re
from tensorflow.keras.models import Model, load_model

def insert_layer_nonseq(model, layer_regex, insert_layer_factory,
                        insert_layer_name=None, position='after'):

    # Auxiliary dictionary to describe the network graph
    network_dict = {'input_layers_of': {}, 'new_output_tensor_of': {}}

    # Set the input layers of each layer
    for layer in model.layers:
        for node in layer._outbound_nodes:
            layer_name = node.outbound_layer.name
            if layer_name not in network_dict['input_layers_of']:
                network_dict['input_layers_of'].update(
                        {layer_name: [layer.name]})
            else:
                network_dict['input_layers_of'][layer_name].append(layer.name)

    # Set the output tensor of the input layer
    network_dict['new_output_tensor_of'].update(
            {model.layers[0].name: model.input})

    # Iterate over all layers after the input
    model_outputs = []
    for layer in model.layers[1:]:

        # Determine input tensors
        layer_input = [network_dict['new_output_tensor_of'][layer_aux] 
                for layer_aux in network_dict['input_layers_of'][layer.name]]
        if len(layer_input) == 1:
            layer_input = layer_input[0]

        # Insert layer if name matches the regular expression
        if re.match(layer_regex, layer.name):
            if position == 'replace':
                x = layer_input
            elif position == 'after':
                x = layer(layer_input)
            elif position == 'before':
                pass
            else:
                raise ValueError('position must be: before, after or replace')

            new_layer = insert_layer_factory()
            x = new_layer(x)
            print('New layer: {} Old layer: {} Type: {}'.format(new_layer.name,
                                                            layer.name, position))
            if position == 'before':
                x = layer(x)
        else:
            x = layer(layer_input)

        # Set new output tensor (the original one, or the one of the inserted
        # layer)
        network_dict['new_output_tensor_of'].update({layer.name: x})

        # Save tensor in output list if it is output in initial model
        if layer_name in model.output_names:
            model_outputs.append(x)

    return Model(inputs=model.inputs, outputs=model_outputs)



clone_model = tf.keras.models.clone_model(model)

def dropout_layer_factory():
    return tf.keras.layers.Dropout(rate=0.2, name='dropout')
nm = insert_layer_nonseq(clone_model, '.*relu.*', dropout_layer_factory)

# Fix possible problems with new model
nm.save('temp.h5')
nm = load_model('temp.h5')

Here is the summary of the resulting model (nm):

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
input_19 (InputLayer)        [(None, 32, 32, 3)]       0         
_________________________________________________________________
conv2d_25 (Conv2D)           (None, 16, 16, 1)         28        
_________________________________________________________________
tf.nn.relu_25 (TFOpLambda)   (None, 16, 16, 1)         0         
_________________________________________________________________
dropout (Dropout)            (None, 16, 16, 1)         0         
_________________________________________________________________
global_average_pooling2d_25  (None, 1)                 0         
_________________________________________________________________
dense_25 (Dense)             (None, 10)                20        
=================================================================
Total params: 48
Trainable params: 48
Non-trainable params: 0
_________________________________________________________________

To me, everything looks fine. However, when I try to train the model, I get the following error:

InvalidArgumentError:  logits and labels must have the same first dimension, got logits shape [8192,1] and labels shape [32]
     [[node sparse_categorical_crossentropy/SparseSoftmaxCrossEntropyWithLogits/SparseSoftmaxCrossEntropyWithLogits (defined at <ipython-input-123-9e6a1b98c0a1>:7) ]] [Op:__inference_train_function_111449]
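The shapes in this message can be worked out by hand. To my understanding, the sparse softmax cross-entropy op flattens the logits to two dimensions, keeping only the last axis as the class axis. Under that assumption, 8192 = 32 × 16 × 16: a batch of 32 combined with the (None, 16, 16, 1) shape that conv2d_25, tf.nn.relu_25 and dropout report in the summary, rather than the (None, 10) shape of dense_25. The helper name `flattened_logits_shape` below is hypothetical and only illustrates the arithmetic.

```python
from math import prod


def flattened_logits_shape(logits_shape):
    """Collapse every axis except the last (class) axis into one batch axis."""
    *leading, num_classes = logits_shape
    return (prod(leading), num_classes)


# Batch of 32 passed through the Dense head: one row of logits per example.
print(flattened_logits_shape((32, 10)))         # (32, 10)

# Batch of 32 with the (16, 16, 1) feature-map shape from the summary.
print(flattened_logits_shape((32, 16, 16, 1)))  # (8192, 1)
```

If this reading is right, the loss is being evaluated against a feature-map tensor, not against the output of the Dense layer.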

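Unlike most logits/labels errors, the loss function is not the problem here. When I train the original model with exactly the same code, it works perfectly. Somehow, inserting the dropout layer introduces a bug that prevents the new model from training.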

Does anyone know why this happens? Thanks!
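One possible cause, offered as a hypothesis rather than a confirmed diagnosis: the output check in insert_layer_nonseq tests `layer_name`, a variable left over from the first loop, instead of `layer.name` for the layer currently being processed. After the first loop, `layer_name` holds the name of the last outbound layer seen, which here would be the Dense layer, so the check would pass on every iteration and every intermediate tensor would be appended to model_outputs. The sketch below reproduces that pattern with plain Python data; the layer names and the `outbound` mapping are simplified stand-ins, not real Keras objects.

```python
# Toy stand-in for the model graph: layer order, outbound edges, output names.
layers = ["input", "conv2d", "relu", "pool", "dense"]
outbound = {
    "input": ["conv2d"],
    "conv2d": ["relu"],
    "relu": ["pool"],
    "pool": ["dense"],
    "dense": [],
}
output_names = ["dense"]

# First loop, mirroring insert_layer_nonseq: `layer_name` keeps its last value.
for name in layers:
    for target in outbound[name]:
        layer_name = target

# Second loop with the stale variable: the test no longer depends on `name`.
flagged_with_stale_name = [
    name for name in layers[1:] if layer_name in output_names
]

# Second loop testing the layer that is actually being processed.
flagged_with_current_name = [
    name for name in layers[1:] if name in output_names
]

print(flagged_with_stale_name)    # ['conv2d', 'relu', 'pool', 'dense']
print(flagged_with_current_name)  # ['dense']
```

Printing `nm.outputs` should show whether the rebuilt model really has more than one output. If it does, changing the check to `if layer.name in model.output_names:` is the one-line change worth trying.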
