I am using U-Net for semantic image segmentation, and I am confused by the final pixel-classification layer. The U-Net code looks like this:
...
reshape = Reshape((n_classes, self.img_rows * self.img_cols))(conv9)
permute = Permute((2, 1))(reshape)
activation = Activation('softmax')(permute)
model = Model(inputs=inputs, outputs=activation)  # Keras 2 keyword arguments
return model
...
Can I do the reshape directly, without the Permute, like this?
reshape = Reshape((self.img_rows * self.img_cols, n_classes))(conv9)
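Whether the two formulations are equivalent depends on element ordering. A minimal NumPy sketch with toy shapes (the sizes are assumptions, not the real network; NumPy's row-major reshape behaves like Keras' Reshape layer) comparing the two paths on a channels_last feature map:

```python
import numpy as np

H, W, C = 2, 2, 3  # toy spatial size and class count (assumed values)
x = np.arange(H * W * C).reshape(H, W, C)  # channels_last feature map

# Path A: Reshape((C, H*W)) followed by Permute((2, 1)), as in the U-Net code
path_a = x.reshape(C, H * W).T

# Path B: direct Reshape((H*W, C))
path_b = x.reshape(H * W, C)

print(np.array_equal(path_a, path_b))      # False: the element orderings differ
print(np.array_equal(path_b[0], x[0, 0]))  # True: path B keeps per-pixel class vectors
```

With a channels_last tensor, the direct reshape is the one that keeps each pixel's class scores contiguous; the Reshape-plus-Permute path only does so when the incoming tensor is channels_first.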
Update:
When I use the direct reshape, the training does not behave correctly:
reshape = Reshape((self.img_rows * self.img_cols, n_classes))(conv9)  # the loss does not converge
My ground truth is generated like this:
import cv2
import numpy as np

X = []
Y = []
im = cv2.imread(impath)
X.append(im)
seg_labels = np.zeros((height, width, n_classes))
# one mask file per class; c indexes the class channel
for c, spath in enumerate(segpaths):
    mask = cv2.imread(spath, 0)  # read mask as single-channel grayscale
    seg_labels[:, :, c] += mask
Y.append(seg_labels.reshape(width * height, n_classes))
Why doesn't the direct reshape work?
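One thing worth checking is the backend's image_data_format. A minimal NumPy sketch of a suspected failure mode, assuming (this is an assumption, not confirmed from the code above) that conv9 comes out channels_first:

```python
import numpy as np

H, W, C = 2, 2, 3  # toy sizes (assumed)
x_cf = np.arange(C * H * W).reshape(C, H, W)  # hypothetical channels_first conv output

# The ground truth is built as (H*W, C): each row is one pixel's class vector.
# A direct Reshape((H*W, C)) on a channels_first tensor mixes values from
# different pixels and channels into each row:
direct = x_cf.reshape(H * W, C)
per_pixel = x_cf[:, 0, 0]  # true class vector of pixel (0, 0)
print(np.array_equal(direct[0], per_pixel))    # False: rows are scrambled

# Reshape((C, H*W)) + Permute((2, 1)) recovers per-pixel vectors instead:
permuted = x_cf.reshape(C, H * W).T
print(np.array_equal(permuted[0], per_pixel))  # True
```

If that is what is happening, the softmax and cross-entropy would be computed over scrambled rows, which would explain a non-converging loss; with a channels_last conv9, the direct reshape should line up with labels built this way.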