ios - Transforming an MPSNNImageNode with Metal Performance Shaders

Question

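I am currently reproducing YOLOv2 (the full model, not the tiny variant) on iOS (Swift 4) using MPS.

One problem is that I am having trouble implementing the space_to_depth function (https://www.tensorflow.org/api_docs/python/tf/space_to_depth) and the concatenation of two convolution results (13x13x256 + 13x13x1024 -> 13x13x1280). Could you give me some advice on how to build these parts? My code is below.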

...

let conv19 = MPSCNNConvolutionNode(source: conv18.resultImage,
                                   weights: DataSource("conv19", 3, 3, 1024, 1024))

let conv20 = MPSCNNConvolutionNode(source: conv19.resultImage,
                                   weights: DataSource("conv20", 3, 3, 1024, 1024))

let conv21 = MPSCNNConvolutionNode(source: conv13.resultImage,
                                   weights: DataSource("conv21", 1, 1, 512, 64))

/*****
    1. space_to_depth with conv21
    2. concatenate the result of conv20(13x13x1024) to the result of 1 (13x13x256)
    I need your help to implement this part!
******/

score 0 · Accepted Answer

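I believe space_to_depth can be expressed as a convolution. For example, for an input of shape [1,2,2,1], use four 2x2 convolution kernels applied with a stride of 2, each of which writes one input value to its own output channel, i.e. [[1,0],[0,0]] [[0,1],[0,0]] [[0,0],[1,0]] [[0,0],[0,1]]. This should move all the input values from the spatial dimensions into the depth dimension.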
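The equivalence described above can be checked on the CPU before any MPS code is written. The following is a minimal sketch in plain Swift, not MPS code: `FeatureMap`, `spaceToDepth` and `spaceToDepthAsConvolution` are hypothetical names introduced here for illustration, and the output channel order `(dy * blockSize + dx) * channels + c` is assumed to follow the TensorFlow definition linked in the question.

```swift
// Hypothetical helper type: a feature map stored in HWC order,
// i.e. index = (y * width + x) * channels + c.
struct FeatureMap {
    let height: Int
    let width: Int
    let channels: Int
    var data: [Float]

    init(height: Int, width: Int, channels: Int, data: [Float]? = nil) {
        self.height = height
        self.width = width
        self.channels = channels
        self.data = data ?? [Float](repeating: 0, count: height * width * channels)
        precondition(self.data.count == height * width * channels, "data size mismatch")
    }

    subscript(y: Int, x: Int, c: Int) -> Float {
        get { return data[(y * width + x) * channels + c] }
        set { data[(y * width + x) * channels + c] = newValue }
    }
}

// Direct rearrangement: each b x b spatial block becomes b * b * C channels.
func spaceToDepth(_ input: FeatureMap, blockSize b: Int) -> FeatureMap {
    precondition(input.height % b == 0 && input.width % b == 0, "size must be divisible by block size")
    var out = FeatureMap(height: input.height / b, width: input.width / b,
                         channels: input.channels * b * b)
    for y in 0..<out.height {
        for x in 0..<out.width {
            for dy in 0..<b {
                for dx in 0..<b {
                    for c in 0..<input.channels {
                        out[y, x, (dy * b + dx) * input.channels + c] = input[y * b + dy, x * b + dx, c]
                    }
                }
            }
        }
    }
    return out
}

// The same result as a b x b convolution with stride b and one-hot kernels.
// Weights are laid out as [outputChannel][kernelY][kernelX][inputChannel].
func spaceToDepthAsConvolution(_ input: FeatureMap, blockSize b: Int) -> FeatureMap {
    precondition(input.height % b == 0 && input.width % b == 0, "size must be divisible by block size")
    let cIn = input.channels
    let cOut = cIn * b * b
    var weights = [Float](repeating: 0, count: cOut * b * b * cIn)
    for o in 0..<cOut {
        // Decode which (dy, dx, c) input position this output channel copies.
        let c = o % cIn
        let dx = (o / cIn) % b
        let dy = o / (cIn * b)
        weights[((o * b + dy) * b + dx) * cIn + c] = 1
    }
    var out = FeatureMap(height: input.height / b, width: input.width / b, channels: cOut)
    for y in 0..<out.height {
        for x in 0..<out.width {
            for o in 0..<cOut {
                var sum: Float = 0
                for ky in 0..<b {
                    for kx in 0..<b {
                        for ci in 0..<cIn {
                            sum += weights[((o * b + ky) * b + kx) * cIn + ci] * input[y * b + ky, x * b + kx, ci]
                        }
                    }
                }
                out[y, x, o] = sum
            }
        }
    }
    return out
}

// The [1,2,2,1] example from the answer: four values move into four channels.
let block = FeatureMap(height: 2, width: 2, channels: 1, data: [1, 2, 3, 4])
print(spaceToDepth(block, blockSize: 2).data)  // [1.0, 2.0, 3.0, 4.0]
```

With a block size of 2, a 26x26x64 input would become 13x13x256, which matches the shape the question expects from the conv21 branch. The one-hot weight array built in `spaceToDepthAsConvolution` is the kind of data a convolution data source could supply, provided the channel order agrees with what the trained model expects.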
MPS actually has a concat node. See here: https://developer.apple.com/documentation/metalperformanceshaders/mpsnnconcatenationnode

You can use it like this: concatNode = [[MPSNNConcatenationNode alloc] initWithSources:@[layerA.resultImage, layerB.resultImage]];
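Since the question is written in Swift, the same call might look roughly like the sketch below. The function name `concatenatePassthrough` and its parameter labels are hypothetical; only `MPSNNConcatenationNode(sources:)` and `resultImage` come from the MPS graph API.

```swift
import MetalPerformanceShaders

/// Joins two graph images along the feature-channel axis.
/// `reorganized` is assumed to be the 13x13x256 space_to_depth output and
/// `main` the 13x13x1024 conv20 output, giving 13x13x1280 in that order.
func concatenatePassthrough(reorganized: MPSNNImageNode,
                            main: MPSNNImageNode) -> MPSNNImageNode {
    // Sources are concatenated in array order, so the first source
    // occupies the lowest channel indices of the result.
    let concat = MPSNNConcatenationNode(sources: [reorganized, main])
    return concat.resultImage
}
```

The returned image node can then be passed as the `source:` of the next `MPSCNNConvolutionNode`. The order of the sources matters: it has to match the channel order the following layer's weights were trained with.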

score 0 · Accepted Answer

If you are working with the high-level interface and MPSNNGraph, you should just use MPSNNConcatenationNode, as Tianyu Liu describes above.

If you are working with the low-level interface and encoding the MPSKernels yourself, it can be done as follows:

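1. Create a 1280-channel destination image to hold the result.
2. Run the first filter as normal to produce the first 256 channels of the result.
3. Run the second filter to produce the remaining channels, with destinationFeatureChannelOffset set to 256.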
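The three steps above could be sketched as follows. This is an illustration under stated assumptions rather than a drop-in implementation: the function name `encodeConcatenated`, its parameters, and the `.float16` channel format are choices made here, and the two kernels are assumed to be already configured to produce 256 and 1024 channels respectively.

```swift
import MetalPerformanceShaders

/// Encodes two kernels into one shared destination image so that their
/// outputs sit side by side along the feature-channel axis.
func encodeConcatenated(commandBuffer: MTLCommandBuffer,
                        firstKernel: MPSCNNKernel, firstSource: MPSImage,
                        secondKernel: MPSCNNKernel, secondSource: MPSImage,
                        firstChannels: Int = 256,
                        totalChannels: Int = 1280,
                        width: Int = 13,
                        height: Int = 13) -> MPSImage {
    // Step 1: one destination image large enough for both results.
    // The .float16 format is an assumption; use whatever the network uses.
    let descriptor = MPSImageDescriptor(channelFormat: .float16,
                                        width: width,
                                        height: height,
                                        featureChannels: totalChannels)
    let destination = MPSImage(device: commandBuffer.device,
                               imageDescriptor: descriptor)

    // Step 2: the first kernel writes channels 0..<firstChannels.
    firstKernel.destinationFeatureChannelOffset = 0
    firstKernel.encode(commandBuffer: commandBuffer,
                       sourceImage: firstSource,
                       destinationImage: destination)

    // Step 3: the second kernel skips the channels already written.
    // MPS documents that this offset must be a multiple of 4; 256 is.
    secondKernel.destinationFeatureChannelOffset = firstChannels
    secondKernel.encode(commandBuffer: commandBuffer,
                        sourceImage: secondSource,
                        destinationImage: destination)

    return destination
}
```

Because both kernels write directly into the shared image, no separate copy pass is needed. The caller still has to commit the command buffer before reading `destination`.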

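That should be sufficient in all cases, except when the data is not the product of an MPSKernel. In that case you will need to copy it in yourself, or use something like a linear neuron (a=1, b=0) to do it.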
