tensorflow - Why create a new axis when only the first element is needed?

Question

First of all, sorry for the vague title.

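Since I am interested in learning more about TensorFlow and image segmentation, I am following their tutorial ( https://www.tensorflow.org/tutorials/images/segmentation ). However, I noticed something that I cannot quite grasp, even after searching on Google.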

In this part:

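import tensorflow as tf

def create_mask(pred_mask):
    pred_mask = tf.argmax(pred_mask, axis=-1)  # most likely class per pixel
    pred_mask = pred_mask[..., tf.newaxis]     # add a trailing channel axis
    return pred_mask[0]                        # mask of the first image in the batch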

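What is the reason for first creating a new axis on the pred_mask tensor and then selecting only the first element? Why not do it the way I would have expected, as shown below: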

def create_mask(pred_mask):
    pred_mask = tf.argmax(pred_mask, axis=-1)
    return pred_mask

score 3 · Accepted Answer

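It is just to keep the image a 3D tensor. For example, if you have a prediction of shape (1, 256, 256, 10) (a batch of 256x256 images with 10 classes), then after tf.argmax() you receive a tensor of shape (1, 256, 256) (a batch of 256x256 images without a channel axis). But it is usually more convenient if the image is in HWC format (Height, Width, Channel) rather than (Height, Width). For example, if you use matplotlib or OpenCV, you usually need HWC images.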
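The shape change described above can be checked on a tiny example. This is a minimal sketch that uses NumPy as a stand-in for TensorFlow (an assumption made so that it runs without TensorFlow installed; tf.argmax and indexing with tf.newaxis follow the same shape rules). The scores array is made up for illustration.

```python
import numpy as np

# Hypothetical per-class scores for one 2x2 image with 3 classes: shape (1, 2, 2, 3).
scores = np.array([[[[0.1, 0.7, 0.2], [0.9, 0.05, 0.05]],
                    [[0.2, 0.2, 0.6], [0.3, 0.4, 0.3]]]])

labels = np.argmax(scores, axis=-1)  # class axis is gone: shape (1, 2, 2)
hwc = labels[..., np.newaxis][0]     # first image with a channel axis: shape (2, 2, 1)

print(labels.shape)  # shape returned by the shorter version from the question
print(hwc.shape)     # shape returned by the tutorial's version
print(hwc[..., 0])   # the class index chosen for each pixel
```

The shorter version from the question would return the (1, 2, 2) array, which still carries the batch axis and has no channel axis.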

score 0 · Accepted Answer

Calling tf.argmax with axis=-1 makes the tensor lose its last axis, the channel axis. It is then added back as a singleton channel via tf.newaxis.

Then the first element of the batch is returned. In short:

(batch_size, height, width, channels)  # original tensor shape
(batch_size, height, width)            # after argmax
(batch_size, height, width, 1)         # after unsqueeze
(height, width, 1)                     # this is what you are returning
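The same sequence of shapes can be reproduced with a whole batch. The sketch below again assumes NumPy in place of TensorFlow; create_mask_np is a hypothetical name for a NumPy mirror of the tutorial's function, and the random batch is made-up data. It also suggests that the order of the last two steps does not matter: selecting the first image before adding the axis gives the same mask.

```python
import numpy as np

def create_mask_np(pred_mask):
    """NumPy stand-in for the tutorial's create_mask (hypothetical helper)."""
    pred_mask = np.argmax(pred_mask, axis=-1)  # (batch_size, height, width)
    pred_mask = pred_mask[..., np.newaxis]     # (batch_size, height, width, 1)
    return pred_mask[0]                        # (height, width, 1)

rng = np.random.default_rng(0)
batch = rng.random((4, 128, 128, 3))  # made-up predictions: 4 images, 3 classes

mask = create_mask_np(batch)
# Alternative order: pick the first image, then add the channel axis.
alt = np.argmax(batch[0], axis=-1)[..., np.newaxis]

print(mask.shape)
print(np.array_equal(mask, alt))
```

Only the first image of the batch survives either way, so the function is meant for displaying a single prediction rather than a whole batch.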
