
I created a CNN that classifies input images of size 39 x 39 into three classes. I am using Optuna to optimize the network parameters. For Optuna, I defined the following parameters to optimize:

num_blocks = trial.suggest_int('num_blocks', 1, 4)
num_filters = trial.suggest_categorical('num_filters', [32, 64, 128, 256])
kernel_size = trial.suggest_int('kernel_size', 2, 7)
num_dense_nodes = trial.suggest_categorical('num_dense_nodes', [64, 128, 256, 512, 1024])
dense_nodes_divisor = trial.suggest_categorical('dense_nodes_divisor', [1, 2, 4, 8])
batch_size = trial.suggest_categorical('batch_size', [16, 32, 64, 128])
drop_out = trial.suggest_discrete_uniform('drop_out', 0.05, 0.5, 0.05)
lr = trial.suggest_loguniform('lr', 1e-6, 1e-1)

dict_params = {'num_blocks': num_blocks,
 'num_filters': num_filters,
 'kernel_size': kernel_size,
 'num_dense_nodes': num_dense_nodes,
 'dense_nodes_divisor': dense_nodes_divisor,
 'batch_size': batch_size,
 'drop_out': drop_out,
 'lr': lr}

My network looks like this:

input_tensor = Input(shape=(39,39,3))

# 1st cnn block
x = Conv2D(filters=dict_params['num_filters'],
 kernel_size=dict_params['kernel_size'],
 strides=1, padding='same')(input_tensor)
x = BatchNormalization()(x, training=training)
x = Activation('relu')(x)
x = MaxPooling2D(padding='same')(x)
x = Dropout(dict_params['drop_out'])(x)

# additional cnn blocks
for i in range(1, dict_params['num_blocks']):
    x = Conv2D(filters=dict_params['num_filters']*(2**i), kernel_size=dict_params['kernel_size'], strides=1, padding='same')(x)
    x = BatchNormalization()(x, training=training)
    x = Activation('relu')(x)
    x = MaxPooling2D(padding='same')(x)
    x = Dropout(dict_params['drop_out'])(x)

# mlp
x = Flatten()(x)
x = Dense(dict_params['num_dense_nodes'], activation='relu')(x)
x = Dropout(dict_params['drop_out'])(x)
x = Dense(dict_params['num_dense_nodes'] // dict_params['dense_nodes_divisor'], activation='relu')(x)
output_tensor = Dense(self.number_of_classes, activation='softmax')(x)

# instantiate and compile model
cnn_model = Model(inputs=input_tensor, outputs=output_tensor)
opt = Adam(lr=dict_params['lr'])
loss = 'categorical_crossentropy'
cnn_model.compile(loss=loss, optimizer=opt, metrics=['accuracy',  tf.keras.metrics.AUC()])

I am using Optuna to optimize (minimize) the validation loss. The network has up to 4 blocks, and the number of filters doubles in each block: for example, 64 in the first block, 128 in the second, 256 in the third, and so on. There are two problems. First, if we start with, say, 256 filters and a total of 4 blocks, the last block ends up with 2048 filters, which is far too many.

Is it possible to make the num_filters parameter depend on the num_blocks parameter? That is, if there are more blocks, the starting filter count should be smaller. So, for example, if num_blocks is chosen as 4, num_filters should only be sampled from 32, 64 and 128.
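One possible approach (a sketch, not from the original post): sample num_blocks first, then compute the allowed starting filter counts from it. The helper name `filter_choices`, the cap `max_filters=256`, and the per-num_blocks parameter naming are my assumptions, not part of the question's code:

```python
# Hypothetical helper: restrict the starting filter count so that the
# deepest block, after doubling per block, stays within a cap.

def filter_choices(num_blocks, base_choices=(32, 64, 128, 256), max_filters=256):
    """Starting filter counts whose value in the last block
    (start * 2**(num_blocks - 1)) does not exceed max_filters."""
    return [f for f in base_choices if f * 2 ** (num_blocks - 1) <= max_filters]

# Inside the Optuna objective this could look like (names illustrative):
#
#   num_blocks = trial.suggest_int('num_blocks', 1, 4)
#   num_filters = trial.suggest_categorical(
#       f'num_filters_{num_blocks}_blocks', filter_choices(num_blocks))
#
# Using a distinct parameter name per num_blocks value keeps each
# categorical distribution's choice set fixed across trials, which
# Optuna requires for categorical parameters.
```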

Second, I think doubling the filter count before each max-pooling layer (as in VGG) is common, but there are also networks with a constant filter count, networks with two convolutions (with the same number of filters) per block, and so on. Is it possible to adapt the Optuna optimization to cover all these variations?
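One way to cover such variations (again a sketch under my own assumptions, not the post's code) is to add categorical parameters for the growth scheme and for the number of convolutions per block, and make the block loop consult them. The `growth` and `convs_per_block` parameter names and the `block_filters` helper are illustrative:

```python
# Hypothetical helper: filter count for a given block under a chosen
# filter-growth scheme.

def block_filters(start_filters, block_index, growth):
    """Filters for block `block_index` (0-based) under a growth scheme."""
    if growth == 'double':    # VGG-style: double the filters each block
        return start_filters * 2 ** block_index
    if growth == 'constant':  # same filter count in every block
        return start_filters
    raise ValueError(f'unknown growth scheme: {growth}')

# Inside the objective these could be sampled as (names illustrative):
#
#   growth = trial.suggest_categorical('growth', ['double', 'constant'])
#   convs_per_block = trial.suggest_int('convs_per_block', 1, 2)
#
# and the block loop would become something like:
#
#   for i in range(dict_params['num_blocks']):
#       for _ in range(convs_per_block):
#           x = Conv2D(filters=block_filters(num_filters, i, growth),
#                      kernel_size=kernel_size, padding='same')(x)
#           x = BatchNormalization()(x)
#           x = Activation('relu')(x)
#       x = MaxPooling2D(padding='same')(x)
#       x = Dropout(drop_out)(x)
```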
