I am currently training several linear classifiers with TensorFlow, and I noticed something strange: the smaller the batch_size, the better my results (the model learns faster). I am working on FashionMNIST.
import tensorflow as tf
from tensorflow.keras.layers import Dense
from tensorflow.keras.optimizers import Adam

epochs = 300
batch_size = 5000

# Create and fit the model: a 1-unit linear bottleneck followed by a softmax layer
model = tf.keras.Sequential()
model.add(Dense(1, activation="linear", input_dim=28*28))
model.add(Dense(10, activation="softmax"))
model.compile(optimizer=Adam(),
              loss="categorical_crossentropy",
              metrics=["categorical_accuracy"])
model.fit(x_train, y_one_hot_train,
          validation_data=(x_val, y_one_hot_val),
          epochs=epochs, batch_size=batch_size)
Results

Batch size 20000, 200 epochs:
loss: 2.7494 - categorical_accuracy: 0.2201 - val_loss: 2.8695 - val_categorical_accuracy: 0.2281
Batch size 10000, 200 epochs:
loss: 1.7487 - categorical_accuracy: 0.3336 - val_loss: 1.8268 - val_categorical_accuracy: 0.3331
Batch size 2000, 200 epochs:
loss: 1.2906 - categorical_accuracy: 0.5123 - val_loss: 1.3247 - val_categorical_accuracy: 0.5113
Batch size 1000, 200 epochs:
loss: 1.1080 - categorical_accuracy: 0.5246 - val_loss: 1.1261 - val_categorical_accuracy: 0.5273
Do you know why I am getting these results?
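One thing I noticed while comparing the runs: with the number of epochs fixed, a smaller batch size means many more optimizer updates per run, which alone could explain faster learning. A quick back-of-the-envelope calculation (assuming the standard FashionMNIST training split of 60000 samples):

```python
import math

n_train = 60000  # assumed FashionMNIST training set size
epochs = 200

for batch_size in (20000, 10000, 2000, 1000):
    # Keras performs one gradient update per batch, ceil(n/batch) batches per epoch
    updates = epochs * math.ceil(n_train / batch_size)
    print(f"batch_size={batch_size}: {updates} gradient updates")
```

So the batch-size-1000 run performs 20 times as many Adam updates as the batch-size-20000 run, even though both train for 200 epochs. Is that the whole story, or is something else going on?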