keras - Theoretical question about DNN layers with batch normalization in Keras

Question

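I am having trouble understanding DNN models that use batch normalization, specifically in Keras. Can someone explain the structure and content of each layer in this model I built?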

import time

from keras.models import Sequential
from keras.layers import Dense, BatchNormalization, Activation, Dropout

# X_2, y_2, X_test, y_test and num_classes are defined elsewhere (not shown)
modelbatch = Sequential()
modelbatch.add(Dense(512, input_dim=1120))
modelbatch.add(BatchNormalization())
modelbatch.add(Activation('relu'))
modelbatch.add(Dropout(0.5))

modelbatch.add(Dense(256))
modelbatch.add(BatchNormalization())
modelbatch.add(Activation('relu'))
modelbatch.add(Dropout(0.5))

modelbatch.add(Dense(num_classes))
modelbatch.add(BatchNormalization())
modelbatch.add(Activation('softmax'))
# Compile model
modelbatch.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
# Train the model
start = time.time()
model_info = modelbatch.fit(X_2, y_2, batch_size=500,
                            epochs=20, verbose=2, validation_data=(X_test, y_test))
end = time.time()

I think these are all the layers of my model:

print(modelbatch.layers[0].get_weights()[0].shape)
(1120, 512)
print(modelbatch.layers[0].get_weights()[1].shape)
(512,)
print(modelbatch.layers[1].get_weights()[0].shape)
(512,)
print(modelbatch.layers[1].get_weights()[1].shape)
(512,)
print(modelbatch.layers[1].get_weights()[2].shape)
(512,)
print(modelbatch.layers[1].get_weights()[3].shape)
(512,)
print(modelbatch.layers[4].get_weights()[0].shape)
(512, 256)
print(modelbatch.layers[4].get_weights()[1].shape)
(256,)
print(modelbatch.layers[5].get_weights()[0].shape)
(256,)
print(modelbatch.layers[5].get_weights()[1].shape)
(256,)
print(modelbatch.layers[5].get_weights()[2].shape)
(256,)
print(modelbatch.layers[5].get_weights()[3].shape)
(256,)
print(modelbatch.layers[8].get_weights()[0].shape)
(256, 38)
print(modelbatch.layers[8].get_weights()[1].shape)
(38,)
print(modelbatch.layers[9].get_weights()[0].shape)
(38,)
print(modelbatch.layers[9].get_weights()[1].shape)
(38,)
print(modelbatch.layers[9].get_weights()[2].shape)
(38,)
print(modelbatch.layers[9].get_weights()[3].shape)
(38,)

I would appreciate your help. Thanks in advance.

score 1 · Accepted Answer

Let's look at your model:

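You have an input layer of dimension 1120. Connected to it is your first hidden layer with 512 neurons, followed by a batch normalization layer, then the activation function, then a dropout layer. Note that you can call model.summary() (here, modelbatch.summary()) to print an overview of your model.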
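As a rough cross-check of what such a summary should report, the parameter counts can be worked out by hand. The sketch below is plain Python, not Keras; the helper names `dense_params` and `batchnorm_params` are made up for illustration, and the widths come from the shapes printed in the question.

```python
# Layer widths taken from the question: 1120 inputs, then 512, 256 and 38 units.
widths = [1120, 512, 256, 38]

def dense_params(n_in, n_out):
    # Kernel of shape (n_in, n_out) plus one bias per output unit.
    return n_in * n_out + n_out

def batchnorm_params(n):
    # gamma and beta are trained; the moving mean and moving variance
    # are updated from the data, not by the optimizer.
    return {"trainable": 2 * n, "non_trainable": 2 * n}

trainable = 0
non_trainable = 0
for n_in, n_out in zip(widths, widths[1:]):
    d = dense_params(n_in, n_out)
    bn = batchnorm_params(n_out)
    bn_total = bn["trainable"] + bn["non_trainable"]
    print(f"Dense({n_out}): {d} params, BatchNormalization: {bn_total} params")
    trainable += d + bn["trainable"]
    non_trainable += bn["non_trainable"]

print("trainable:", trainable)
print("non-trainable:", non_trainable)
print("total:", trainable + non_trainable)
```

Activation and Dropout layers contribute no parameters, which is why they are left out of the loop.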

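In theory, you can (and should) think of these layers as a single layer that applies the following transformations: batch normalization, activation, and dropout. In practice, Keras implements each of them as a separate layer because that keeps the implementation modular: instead of coding every possible layer design, the user chooses whether to add batch normalization or dropout to a layer. To see a modular implementation, I recommend looking at http://cs231n.stanford.edu/slides/2017/cs231n_2017_lecture4.pdf, and at http://cs231n.stanford.edu/syllabus.html if you want to go deeper.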

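For the batch normalization layer, you will notice 4 parameters: two trainable ones, gamma and beta, and two that are set by the data (a moving mean and a moving variance; Keras stores the variance rather than the standard deviation). To understand what they are, see the Stanford course; you can also find the details in the original batch normalization paper, https://arxiv.org/abs/1502.03167. It is just a trick that speeds up learning and improves accuracy by normalizing the data at each layer, the same way you would in a preprocessing step on the input data.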
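To make the role of the four parameters concrete, here is a minimal sketch of batch normalization for a single feature in plain Python. It is not the Keras implementation; the function names are hypothetical, and the defaults `momentum=0.99` and `eps=1e-3` are assumed to match the Keras `BatchNormalization` defaults.

```python
import math

def batch_norm_train(x, gamma, beta, moving_mean, moving_var,
                     momentum=0.99, eps=1e-3):
    """Normalize one feature over a batch and update the moving statistics."""
    n = len(x)
    mean = sum(x) / n
    var = sum((v - mean) ** 2 for v in x) / n
    # Normalize with the batch statistics, then scale by gamma and shift by beta.
    out = [gamma * (v - mean) / math.sqrt(var + eps) + beta for v in x]
    # The moving statistics are exponential averages of the batch statistics.
    new_mean = momentum * moving_mean + (1 - momentum) * mean
    new_var = momentum * moving_var + (1 - momentum) * var
    return out, new_mean, new_var

def batch_norm_infer(x, gamma, beta, moving_mean, moving_var, eps=1e-3):
    """At inference time the stored moving statistics replace the batch ones."""
    return [gamma * (v - moving_mean) / math.sqrt(moving_var + eps) + beta
            for v in x]

batch = [1.0, 2.0, 3.0, 4.0]
out, mm, mv = batch_norm_train(batch, gamma=1.0, beta=0.0,
                               moving_mean=0.0, moving_var=1.0)
print(out, mm, mv)
print(batch_norm_infer(batch, 1.0, 0.0, mm, mv))
```

In the layer itself there is one such (gamma, beta, moving mean, moving variance) quadruple per unit, which is why the question shows four arrays of shape (512,) for the first batch normalization layer.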

From what I have said, you can deduce the rest of the model.
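One way to carry out that deduction is to list the layers in the order they were added and note which ones hold weights. The sketch below does this in plain Python; `layer_plan` is a hypothetical helper, not a Keras function, and it only mirrors the structure of the model in the question.

```python
def layer_plan(input_dim, hidden, num_classes):
    """Return (layer name, list of weight shapes) in the order layers were added."""
    plan = []
    sizes = hidden + [num_classes]
    n_in = input_dim
    for i, n_out in enumerate(sizes):
        is_last = i == len(sizes) - 1
        # Dense: kernel and bias.
        plan.append(("Dense", [(n_in, n_out), (n_out,)]))
        # BatchNormalization: gamma, beta, moving mean, moving variance.
        plan.append(("BatchNormalization", [(n_out,)] * 4))
        # Activation and Dropout have no weights.
        plan.append(("Activation", []))
        if not is_last:
            plan.append(("Dropout", []))
        n_in = n_out
    return plan

plan = layer_plan(1120, [512, 256], 38)
for index, (name, shapes) in enumerate(plan):
    print(index, name, shapes)
```

Only indices 0, 1, 4, 5, 8 and 9 carry weights, which matches the indices printed in the question; the indices in between are the activation and dropout layers.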

Note: I would not use a batch normalization layer in the last step, right before the softmax.

Is that clearer?
