python - keras - Per-pixel, unnormalized softmax loss for semantic segmentation

Question

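I'm troubleshooting a Keras/TensorFlow U-Net for semantic segmentation. One thing I keep running into is this claim:

"Now the problem is using the softmax in your case, as Keras doesn't support softmax on each pixel."

This is stated here (and in many other places): Cross Entropy Loss for Semantic Segmentation Keras

The typical solution offered is to flatten the rows and columns into one dimension, so that the 4D output tensor of shape (batch, rows, cols, n_classes) becomes a 3D tensor of shape (batch, rows*cols, n_classes), and then to apply a dense softmax to it.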
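For reference, here is a minimal NumPy sketch (not Keras) of what that flattening workaround amounts to numerically. The `softmax` helper and the tensor shapes are assumptions chosen for illustration. It suggests that a softmax taken over the class axis gives the same values whether or not rows and columns are flattened first.

```python
import numpy as np

def softmax(x, axis=-1):
    # Subtract the max along the axis for numerical stability.
    shifted = x - x.max(axis=axis, keepdims=True)
    e = np.exp(shifted)
    return e / e.sum(axis=axis, keepdims=True)

rng = np.random.default_rng(0)
logits = rng.normal(size=(2, 3, 3, 4))  # (batch, rows, cols, n_classes)

# Route 1: softmax directly over the class axis of the 4D tensor.
direct = softmax(logits, axis=-1)

# Route 2: flatten rows and cols, apply softmax, then restore the shape.
flat = logits.reshape(2, 3 * 3, 4)
via_flatten = softmax(flat, axis=-1).reshape(2, 3, 3, 4)

same = np.allclose(direct, via_flatten)
print(same)
```

The reshape only regroups pixels; it never mixes values across the class axis, so the two routes should agree.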

But this dummy example makes me think that a Conv2D layer with a 1x1 kernel and 'softmax' activation does in fact perform a per-pixel softmax.

Am I wrong?

Example:

import numpy as np
np.random.seed(345)

from keras.layers import Input, Conv2D
from keras.models import Model
from keras.initializers import RandomUniform

In this case, simulate a 3x3 field with 5 filters as the input to a per-pixel softmax classifier

np.random.seed(234)
dummy_input = np.random.random(45).astype('float32')
dummy_input = dummy_input.reshape((1,3,3,5))

Build 2 networks with the same random weights (one linear, one softmax)

conv_input = Input(shape=(3,3,5,), dtype='float32')

pred_layer_softmax = Conv2D(filters=2, kernel_size=(1,1), strides=(1,1), kernel_initializer=RandomUniform(minval=0., maxval=1.), 
                            padding='valid', data_format='channels_last', activation='softmax')(conv_input)

pred_layer_linear = Conv2D(filters=2, kernel_size=(1,1), strides=(1,1),
                           padding='valid', data_format='channels_last', activation='linear')(conv_input)

Models

m_softmax = Model(conv_input, pred_layer_softmax)
m_linear = Model(conv_input, pred_layer_linear)

# keep weights the same for both networks
m_linear.set_weights(m_softmax.get_weights())

Predict

pred     = m_softmax.predict(dummy_input)
pred_lin = m_linear.predict(dummy_input)

Do the two networks make the same predictions? Yes.

print('\nsoftmax pred')
print(pred.argmax(axis=3))
print('\nlinear_pred')
print(pred_lin.argmax(axis=3))

softmax pred
[[[1 1 0]
  [1 1 0]
  [1 0 1]]]

linear_pred
[[[1 1 0]
  [1 1 0]
  [1 0 1]]]
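As a side check, the matching labels would be expected regardless of the weights, because softmax is monotonic along the class axis and so preserves the argmax. A small NumPy sketch, with a hypothetical `softmax` helper and random scores standing in for the linear layer's output:

```python
import numpy as np

def softmax(x, axis=-1):
    # Subtract the max along the axis for numerical stability.
    shifted = x - x.max(axis=axis, keepdims=True)
    e = np.exp(shifted)
    return e / e.sum(axis=axis, keepdims=True)

rng = np.random.default_rng(1)
# Stand-in for per-pixel linear scores: (batch, rows, cols, n_classes).
scores = rng.uniform(0.0, 3.0, size=(1, 3, 3, 2))

labels_linear = scores.argmax(axis=3)
labels_softmax = softmax(scores, axis=3).argmax(axis=3)

labels_match = np.array_equal(labels_linear, labels_softmax)
print(labels_match)
```

So identical argmax maps alone do not show that softmax was applied per pixel; the per-pixel sums are the more telling check.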

For softmax, do the per-pixel class probabilities sum to 1.0? Yes.

print('\nsoftmax - sum class probs')
print(pred.sum(axis=3))
print('\nlinear - sum class probs')
print(pred_lin.sum(axis=3))

softmax - sum class probs
[[[ 1.  1.  1.]
  [ 1.  1.  1.]
  [ 1.  1.  1.]]]

linear - sum class probs
[[[ 1.88952112  2.50639653  2.06084657]
  [ 1.81122136  2.21819067  2.01038122]
  [ 1.92753291  1.85993922  2.27295876]]]
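To cross-check the same behaviour without Keras, a 1x1 convolution can be written as one matrix multiply applied independently at every pixel, followed by a softmax over the channel axis. The sketch below uses plain NumPy; `kernel` and `bias` are hypothetical values, not the weights from the models above.

```python
import numpy as np

rng = np.random.default_rng(2)
x = rng.random((1, 3, 3, 5)).astype("float32")  # same shape as dummy_input
kernel = rng.random((5, 2)).astype("float32")   # hypothetical: in_channels x filters
bias = np.zeros(2, dtype="float32")             # hypothetical zero bias

# A 1x1 convolution applies the same (5 -> 2) linear map at each pixel.
linear_out = np.einsum("brci,if->brcf", x, kernel) + bias

# Softmax over the last (class) axis, computed separately for each pixel.
shifted = linear_out - linear_out.max(axis=-1, keepdims=True)
exp = np.exp(shifted)
softmax_out = exp / exp.sum(axis=-1, keepdims=True)

pixel_sums = softmax_out.sum(axis=-1)
print(pixel_sums)
```

If the Keras layer normalizes over the last axis in the same way, each entry of `pixel_sums` being 1 corresponds to the all-ones matrix printed for the softmax model.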

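What am I missing? This looks like the per-pixel softmax is working correctly, right?

Is there something fundamental that I don't understand?

Thanks in advance.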
