
I can't get flexible shapes working for an MLModel converted from ONNX using coremltools 4.0. The source model comes from PyTorch, but I can't use the new unified converter because coremltools does not currently support the reflection_pad2d layer used in the model.

coremltools compiles the model without any warnings or errors, and the resulting spec shows that flexible shapes are supported:

input {
  name: "input"
  type {
    imageType {
      width: 1024
      height: 1024
      colorSpace: BGR
      imageSizeRange {
        widthRange {
          lowerBound: 256
          upperBound: -1
        }
        heightRange {
          lowerBound: 256
          upperBound: -1
        }
      }
    }
  }
}
output {
  name: "output"
  type {
    imageType {
      width: 1024
      height: 1024
      colorSpace: RGB
      imageSizeRange {
        widthRange {
          lowerBound: 256
          upperBound: -1
        }
        heightRange {
          lowerBound: 256
          upperBound: -1
        }
      }
    }
  }
}

But running a prediction with the model fails with the following messages:

MyApp[5773:4974761] [espresso] [Espresso::handle_ex_plan] exception=Invalid X-dimension 1/814 status=-7
MyApp[5773:4974761] [coreml] Error binding image input buffer input: -7
MyApp[5773:4974761] [coreml] Failure in bindInputsAndOutputs.
prediction error: Error Domain=com.apple.CoreML Code=0 "Error binding image input buffer input." UserInfo={NSLocalizedDescription=Error binding image input buffer input.}

Enumerated shapes do work with the model, but I would need 10k+ of them to cover my use case, so that doesn't seem like a viable solution.
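For a sense of scale, here is a rough count of how many shapes would have to be enumerated to cover every pixel size in the same 256–4096 range used in the RangeDim attempts below (my own back-of-the-envelope sketch, not anything from coremltools):

```python
# Every side length from 256 to 4096 px, one enumerated shape per
# (width, height) combination.
sides = range(256, 4097)        # 3841 possible values per dimension
num_shapes = len(sides) ** 2    # every (width, height) pair
print(num_shapes)               # 14753281 -- far beyond 10k
```

Even restricting both sides to multiples of some stride would still leave far more shapes than is practical to enumerate.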

The model is a fully convolutional network, it doesn't appear to use any fixed shapes (see the spec output below), and it works with varying input sizes in PyTorch, so it seems it should be possible to get flexible shapes working somehow.

I tried to use flexible input shapes with image inputs/outputs:

import coremltools as ct
from coremltools.converters.onnx import convert

input_names = ['input']
output_names = ['output']
channels = 3
input_shape = ct.Shape(shape=(channels, ct.RangeDim(), ct.RangeDim()))
# also tried:
input_shape = ct.Shape(shape=(channels, ct.RangeDim(256, 4096), ct.RangeDim(256, 4096)))
# and:
input_shape = ct.Shape(shape=(channels, ct.RangeDim(256, -1), ct.RangeDim(256, -1)))

model_input = ct.TensorType(shape=input_shape)
mlmodel = convert('torch_model.onnx',
            [model_input], 
            image_input_names=input_names,
            image_output_names=output_names,
            ...
)

spec = mlmodel.get_spec()

from coremltools.models.neural_network import flexible_shape_utils

def add_flexible_shapes(spec):
    img_size_ranges = flexible_shape_utils.NeuralNetworkImageSizeRange(height_range=(256, -1), width_range=(256, -1))
    # also tried:
    # img_size_ranges = flexible_shape_utils.NeuralNetworkImageSizeRange(height_range=(256, 4096), width_range=(256, 4096))
    flexible_shape_utils.update_image_size_range(spec, feature_name=input_names[0], size_range=img_size_ranges)
    flexible_shape_utils.update_image_size_range(spec, feature_name=output_names[0], size_range=img_size_ranges)
    return spec

# tried with and without adding flexible shapes
spec = add_flexible_shapes(spec)

I also tried converting the model with multi-array inputs/outputs first, then changing those to images, and then adding the flexible shapes:

torch.onnx.export(torch_model, example_input, 'torch_model.onnx', input_names=input_names, output_names=output_names, verbose=True)
mlmodel = ct.converters.onnx.convert(model='torch_model.onnx',
                                     ...
)
spec = mlmodel.get_spec()

# ft = coremltools.proto.FeatureTypes_pb2
input = spec.description.input[0]
input.type.imageType.colorSpace = ft.ImageFeatureType.RGB
input.type.imageType.height = 1024
input.type.imageType.width = 1024

output = spec.description.output[0]
output.type.imageType.colorSpace = ft.ImageFeatureType.RGB
output.type.imageType.height = 1024
output.type.imageType.width = 1024

spec = add_flexible_shapes(spec)

I looked at all the layers in the spec and don't see any that use a fixed shape (apart from the input/output descriptions):

specificationVersion: 4
description {
  input {
    name: "input"
    type {
      imageType {
        width: 1024
        height: 1024
        colorSpace: RGB
      }
    }
  }
  output {
    name: "output"
    type {
      imageType {
        width: 1024
        height: 1024
        colorSpace: RGB
      }
    }
  }
  metadata {
    userDefined {
      key: "com.github.apple.coremltools.source"
      value: "onnx==1.7.0"
    }
    userDefined {
      key: "com.github.apple.coremltools.version"
      value: "4.0"
    }
  }
}
neuralNetwork {
  layers {
    name: "Pad_0"
    input: "input"
    output: "63"
    padding {
      reflection {
      }
      paddingAmounts {
        borderAmounts {
          startEdgeSize: 4
          endEdgeSize: 4
        }
        borderAmounts {
          startEdgeSize: 4
          endEdgeSize: 4
        }
      }
    }
  }
  layers {
    name: "Conv_1"
    input: "63"
    output: "64"
    convolution {
      outputChannels: 16
      kernelChannels: 3
      nGroups: 1
      kernelSize: 9
      kernelSize: 9
      stride: 1
      stride: 1
      dilationFactor: 1
      dilationFactor: 1
      valid {
        paddingAmounts {
          borderAmounts {
          }
          borderAmounts {
          }
        }
      }
      hasBias: true
      weights {
      }
      bias {
      }
    }
  }
  layers {
    name: "InstanceNormalization_2"
    input: "64"
    output: "65"
    batchnorm {
      channels: 16
      computeMeanVar: true
      instanceNormalization: true
      epsilon: 9.999999747378752e-06
      gamma {
      }
      beta {
      }
    }
  }
  layers {
    name: "Relu_3"
    input: "65"
    output: "66"
    activation {
      ReLU {
      }
    }
  }
  layers {
    name: "Pad_4"
    input: "66"
    output: "67"
    padding {
      reflection {
      }
      paddingAmounts {
        borderAmounts {
          startEdgeSize: 1
          endEdgeSize: 1
        }
        borderAmounts {
          startEdgeSize: 1
          endEdgeSize: 1
        }
      }
    }
  }
  layers {
    name: "Conv_5"
    input: "67"
    output: "68"
    convolution {
      outputChannels: 32
      kernelChannels: 16
      nGroups: 1
      kernelSize: 3
      kernelSize: 3
      stride: 2
      stride: 2
      dilationFactor: 1
      dilationFactor: 1
      valid {
        paddingAmounts {
          borderAmounts {
          }
          borderAmounts {
          }
        }
      }
      hasBias: true
      weights {
      }
      bias {
      }
    }
  }
  layers {
    name: "InstanceNormalization_6"
    input: "68"
    output: "69"
    batchnorm {
      channels: 32
      computeMeanVar: true
      instanceNormalization: true
      epsilon: 9.999999747378752e-06
      gamma {
      }
      beta {
      }
    }
  }
  layers {
    name: "Relu_7"
    input: "69"
    output: "70"
    activation {
      ReLU {
      }
    }
  }
  layers {
    name: "Pad_8"
    input: "70"
    output: "71"
    padding {
      reflection {
      }
      paddingAmounts {
        borderAmounts {
          startEdgeSize: 1
          endEdgeSize: 1
        }
        borderAmounts {
          startEdgeSize: 1
          endEdgeSize: 1
        }
      }
    }
  }
  layers {
    name: "Conv_9"
    input: "71"
    output: "72"
    convolution {
      outputChannels: 64
      kernelChannels: 32
      nGroups: 1
      kernelSize: 3
      kernelSize: 3
      stride: 2
      stride: 2
      dilationFactor: 1
      dilationFactor: 1
      valid {
        paddingAmounts {
          borderAmounts {
          }
          borderAmounts {
          }
        }
      }
      hasBias: true
      weights {
      }
      bias {
      }
    }
  }
  layers {
    name: "InstanceNormalization_10"
    input: "72"
    output: "73"
    batchnorm {
      channels: 64
      computeMeanVar: true
      instanceNormalization: true
      epsilon: 9.999999747378752e-06
      gamma {
      }
      beta {
      }
    }
  }
  layers {
    name: "Relu_11"
    input: "73"
    output: "74"
    activation {
      ReLU {
      }
    }
  }
  layers {
    name: "Pad_12"
    input: "74"
    output: "75"
    padding {
      reflection {
      }
      paddingAmounts {
        borderAmounts {
          startEdgeSize: 1
          endEdgeSize: 1
        }
        borderAmounts {
          startEdgeSize: 1
          endEdgeSize: 1
        }
      }
    }
  }
  layers {
    name: "Conv_13"
    input: "75"
    output: "76"
    convolution {
      outputChannels: 64
      kernelChannels: 64
      nGroups: 1
      kernelSize: 3
      kernelSize: 3
      stride: 1
      stride: 1
      dilationFactor: 1
      dilationFactor: 1
      valid {
        paddingAmounts {
          borderAmounts {
          }
          borderAmounts {
          }
        }
      }
      hasBias: true
      weights {
      }
      bias {
      }
    }
  }
  layers {
    name: "InstanceNormalization_14"
    input: "76"
    output: "77"
    batchnorm {
      channels: 64
      computeMeanVar: true
      instanceNormalization: true
      epsilon: 9.999999747378752e-06
      gamma {
      }
      beta {
      }
    }
  }
  layers {
    name: "Relu_15"
    input: "77"
    output: "78"
    activation {
      ReLU {
      }
    }
  }
  layers {
    name: "Pad_16"
    input: "78"
    output: "79"
    padding {
      reflection {
      }
      paddingAmounts {
        borderAmounts {
          startEdgeSize: 1
          endEdgeSize: 1
        }
        borderAmounts {
          startEdgeSize: 1
          endEdgeSize: 1
        }
      }
    }
  }
  layers {
    name: "Conv_17"
    input: "79"
    output: "80"
    convolution {
      outputChannels: 64
      kernelChannels: 64
      nGroups: 1
      kernelSize: 3
      kernelSize: 3
      stride: 1
      stride: 1
      dilationFactor: 1
      dilationFactor: 1
      valid {
        paddingAmounts {
          borderAmounts {
          }
          borderAmounts {
          }
        }
      }
      hasBias: true
      weights {
      }
      bias {
      }
    }
  }
  layers {
    name: "InstanceNormalization_18"
    input: "80"
    output: "81"
    batchnorm {
      channels: 64
      computeMeanVar: true
      instanceNormalization: true
      epsilon: 9.999999747378752e-06
      gamma {
      }
      beta {
      }
    }
  }
  layers {
    name: "Add_19"
    input: "81"
    input: "74"
    output: "82"
    addBroadcastable {
    }
  }
  layers {
    name: "Pad_20"
    input: "82"
    output: "83"
    padding {
      reflection {
      }
      paddingAmounts {
        borderAmounts {
          startEdgeSize: 1
          endEdgeSize: 1
        }
        borderAmounts {
          startEdgeSize: 1
          endEdgeSize: 1
        }
      }
    }
  }
  layers {
    name: "Conv_21"
    input: "83"
    output: "84"
    convolution {
      outputChannels: 64
      kernelChannels: 64
      nGroups: 1
      kernelSize: 3
      kernelSize: 3
      stride: 1
      stride: 1
      dilationFactor: 1
      dilationFactor: 1
      valid {
        paddingAmounts {
          borderAmounts {
          }
          borderAmounts {
          }
        }
      }
      hasBias: true
      weights {
      }
      bias {
      }
    }
  }
  layers {
    name: "InstanceNormalization_22"
    input: "84"
    output: "85"
    batchnorm {
      channels: 64
      computeMeanVar: true
      instanceNormalization: true
      epsilon: 9.999999747378752e-06
      gamma {
      }
      beta {
      }
    }
  }
  layers {
    name: "Relu_23"
    input: "85"
    output: "86"
    activation {
      ReLU {
      }
    }
  }
  layers {
    name: "Pad_24"
    input: "86"
    output: "87"
    padding {
      reflection {
      }
      paddingAmounts {
        borderAmounts {
          startEdgeSize: 1
          endEdgeSize: 1
        }
        borderAmounts {
          startEdgeSize: 1
          endEdgeSize: 1
        }
      }
    }
  }
  layers {
    name: "Conv_25"
    input: "87"
    output: "88"
    convolution {
      outputChannels: 64
      kernelChannels: 64
      nGroups: 1
      kernelSize: 3
      kernelSize: 3
      stride: 1
      stride: 1
      dilationFactor: 1
      dilationFactor: 1
      valid {
        paddingAmounts {
          borderAmounts {
          }
          borderAmounts {
          }
        }
      }
      hasBias: true
      weights {
      }
      bias {
      }
    }
  }
  layers {
    name: "InstanceNormalization_26"
    input: "88"
    output: "89"
    batchnorm {
      channels: 64
      computeMeanVar: true
      instanceNormalization: true
      epsilon: 9.999999747378752e-06
      gamma {
      }
      beta {
      }
    }
  }
  layers {
    name: "Add_27"
    input: "89"
    input: "82"
    output: "90"
    addBroadcastable {
    }
  }
  layers {
    name: "Pad_28"
    input: "90"
    output: "91"
    padding {
      reflection {
      }
      paddingAmounts {
        borderAmounts {
          startEdgeSize: 1
          endEdgeSize: 1
        }
        borderAmounts {
          startEdgeSize: 1
          endEdgeSize: 1
        }
      }
    }
  }
  layers {
    name: "Conv_29"
    input: "91"
    output: "92"
    convolution {
      outputChannels: 64
      kernelChannels: 64
      nGroups: 1
      kernelSize: 3
      kernelSize: 3
      stride: 1
      stride: 1
      dilationFactor: 1
      dilationFactor: 1
      valid {
        paddingAmounts {
          borderAmounts {
          }
          borderAmounts {
          }
        }
      }
      hasBias: true
      weights {
      }
      bias {
      }
    }
  }
  layers {
    name: "InstanceNormalization_30"
    input: "92"
    output: "93"
    batchnorm {
      channels: 64
      computeMeanVar: true
      instanceNormalization: true
      epsilon: 9.999999747378752e-06
      gamma {
      }
      beta {
      }
    }
  }
  layers {
    name: "Relu_31"
    input: "93"
    output: "94"
    activation {
      ReLU {
      }
    }
  }
  layers {
    name: "Pad_32"
    input: "94"
    output: "95"
    padding {
      reflection {
      }
      paddingAmounts {
        borderAmounts {
          startEdgeSize: 1
          endEdgeSize: 1
        }
        borderAmounts {
          startEdgeSize: 1
          endEdgeSize: 1
        }
      }
    }
  }
  layers {
    name: "Conv_33"
    input: "95"
    output: "96"
    convolution {
      outputChannels: 64
      kernelChannels: 64
      nGroups: 1
      kernelSize: 3
      kernelSize: 3
      stride: 1
      stride: 1
      dilationFactor: 1
      dilationFactor: 1
      valid {
        paddingAmounts {
          borderAmounts {
          }
          borderAmounts {
          }
        }
      }
      hasBias: true
      weights {
      }
      bias {
      }
    }
  }
  layers {
    name: "InstanceNormalization_34"
    input: "96"
    output: "97"
    batchnorm {
      channels: 64
      computeMeanVar: true
      instanceNormalization: true
      epsilon: 9.999999747378752e-06
      gamma {
      }
      beta {
      }
    }
  }
  layers {
    name: "Add_35"
    input: "97"
    input: "90"
    output: "98"
    addBroadcastable {
    }
  }
  layers {
    name: "Pad_36"
    input: "98"
    output: "99"
    padding {
      reflection {
      }
      paddingAmounts {
        borderAmounts {
          startEdgeSize: 1
          endEdgeSize: 1
        }
        borderAmounts {
          startEdgeSize: 1
          endEdgeSize: 1
        }
      }
    }
  }
  layers {
    name: "Conv_37"
    input: "99"
    output: "100"
    convolution {
      outputChannels: 64
      kernelChannels: 64
      nGroups: 1
      kernelSize: 3
      kernelSize: 3
      stride: 1
      stride: 1
      dilationFactor: 1
      dilationFactor: 1
      valid {
        paddingAmounts {
          borderAmounts {
          }
          borderAmounts {
          }
        }
      }
      hasBias: true
      weights {
      }
      bias {
      }
    }
  }
  layers {
    name: "InstanceNormalization_38"
    input: "100"
    output: "101"
    batchnorm {
      channels: 64
      computeMeanVar: true
      instanceNormalization: true
      epsilon: 9.999999747378752e-06
      gamma {
      }
      beta {
      }
    }
  }
  layers {
    name: "Relu_39"
    input: "101"
    output: "102"
    activation {
      ReLU {
      }
    }
  }
  layers {
    name: "Pad_40"
    input: "102"
    output: "103"
    padding {
      reflection {
      }
      paddingAmounts {
        borderAmounts {
          startEdgeSize: 1
          endEdgeSize: 1
        }
        borderAmounts {
          startEdgeSize: 1
          endEdgeSize: 1
        }
      }
    }
  }
  layers {
    name: "Conv_41"
    input: "103"
    output: "104"
    convolution {
      outputChannels: 64
      kernelChannels: 64
      nGroups: 1
      kernelSize: 3
      kernelSize: 3
      stride: 1
      stride: 1
      dilationFactor: 1
      dilationFactor: 1
      valid {
        paddingAmounts {
          borderAmounts {
          }
          borderAmounts {
          }
        }
      }
      hasBias: true
      weights {
      }
      bias {
      }
    }
  }
  layers {
    name: "InstanceNormalization_42"
    input: "104"
    output: "105"
    batchnorm {
      channels: 64
      computeMeanVar: true
      instanceNormalization: true
      epsilon: 9.999999747378752e-06
      gamma {
      }
      beta {
      }
    }
  }
  layers {
    name: "Add_43"
    input: "105"
    input: "98"
    output: "106"
    addBroadcastable {
    }
  }
  layers {
    name: "Pad_44"
    input: "106"
    output: "107"
    padding {
      reflection {
      }
      paddingAmounts {
        borderAmounts {
          startEdgeSize: 1
          endEdgeSize: 1
        }
        borderAmounts {
          startEdgeSize: 1
          endEdgeSize: 1
        }
      }
    }
  }
  layers {
    name: "Conv_45"
    input: "107"
    output: "108"
    convolution {
      outputChannels: 64
      kernelChannels: 64
      nGroups: 1
      kernelSize: 3
      kernelSize: 3
      stride: 1
      stride: 1
      dilationFactor: 1
      dilationFactor: 1
      valid {
        paddingAmounts {
          borderAmounts {
          }
          borderAmounts {
          }
        }
      }
      hasBias: true
      weights {
      }
      bias {
      }
    }
  }
  layers {
    name: "InstanceNormalization_46"
    input: "108"
    output: "109"
    batchnorm {
      channels: 64
      computeMeanVar: true
      instanceNormalization: true
      epsilon: 9.999999747378752e-06
      gamma {
      }
      beta {
      }
    }
  }
  layers {
    name: "Relu_47"
    input: "109"
    output: "110"
    activation {
      ReLU {
      }
    }
  }
  layers {
    name: "Pad_48"
    input: "110"
    output: "111"
    padding {
      reflection {
      }
      paddingAmounts {
        borderAmounts {
          startEdgeSize: 1
          endEdgeSize: 1
        }
        borderAmounts {
          startEdgeSize: 1
          endEdgeSize: 1
        }
      }
    }
  }
  layers {
    name: "Conv_49"
    input: "111"
    output: "112"
    convolution {
      outputChannels: 64
      kernelChannels: 64
      nGroups: 1
      kernelSize: 3
      kernelSize: 3
      stride: 1
      stride: 1
      dilationFactor: 1
      dilationFactor: 1
      valid {
        paddingAmounts {
          borderAmounts {
          }
          borderAmounts {
          }
        }
      }
      hasBias: true
      weights {
      }
      bias {
      }
    }
  }
  layers {
    name: "InstanceNormalization_50"
    input: "112"
    output: "113"
    batchnorm {
      channels: 64
      computeMeanVar: true
      instanceNormalization: true
      epsilon: 9.999999747378752e-06
      gamma {
      }
      beta {
      }
    }
  }
  layers {
    name: "Add_51"
    input: "113"
    input: "106"
    output: "114"
    addBroadcastable {
    }
  }
  layers {
    name: "Upsample_52"
    input: "114"
    output: "123"
    upsample {
      scalingFactor: 4
      scalingFactor: 4
    }
  }
  layers {
    name: "Pad_53"
    input: "123"
    output: "124"
    padding {
      reflection {
      }
      paddingAmounts {
        borderAmounts {
          startEdgeSize: 1
          endEdgeSize: 1
        }
        borderAmounts {
          startEdgeSize: 1
          endEdgeSize: 1
        }
      }
    }
  }
  layers {
    name: "Conv_54"
    input: "124"
    output: "125"
    convolution {
      outputChannels: 32
      kernelChannels: 64
      nGroups: 1
      kernelSize: 3
      kernelSize: 3
      stride: 2
      stride: 2
      dilationFactor: 1
      dilationFactor: 1
      valid {
        paddingAmounts {
          borderAmounts {
          }
          borderAmounts {
          }
        }
      }
      hasBias: true
      weights {
      }
      bias {
      }
    }
  }
  layers {
    name: "InstanceNormalization_55"
    input: "125"
    output: "126"
    batchnorm {
      channels: 32
      computeMeanVar: true
      instanceNormalization: true
      epsilon: 9.999999747378752e-06
      gamma {
      }
      beta {
      }
    }
  }
  layers {
    name: "Relu_56"
    input: "126"
    output: "127"
    activation {
      ReLU {
      }
    }
  }
  layers {
    name: "Upsample_57"
    input: "127"
    output: "136"
    upsample {
      scalingFactor: 4
      scalingFactor: 4
      mode: BILINEAR
    }
  }
  layers {
    name: "Pad_58"
    input: "136"
    output: "137"
    padding {
      reflection {
      }
      paddingAmounts {
        borderAmounts {
          startEdgeSize: 1
          endEdgeSize: 1
        }
        borderAmounts {
          startEdgeSize: 1
          endEdgeSize: 1
        }
      }
    }
  }
  layers {
    name: "Conv_59"
    input: "137"
    output: "138"
    convolution {
      outputChannels: 16
      kernelChannels: 32
      nGroups: 1
      kernelSize: 3
      kernelSize: 3
      stride: 2
      stride: 2
      dilationFactor: 1
      dilationFactor: 1
      valid {
        paddingAmounts {
          borderAmounts {
          }
          borderAmounts {
          }
        }
      }
      hasBias: true
      weights {
      }
      bias {
      }
    }
  }
  layers {
    name: "InstanceNormalization_60"
    input: "138"
    output: "139"
    batchnorm {
      channels: 16
      computeMeanVar: true
      instanceNormalization: true
      epsilon: 9.999999747378752e-06
      gamma {
      }
      beta {
      }
    }
  }
  layers {
    name: "Relu_61"
    input: "139"
    output: "140"
    activation {
      ReLU {
      }
    }
  }
  layers {
    name: "Pad_62"
    input: "140"
    output: "141"
    padding {
      reflection {
      }
      paddingAmounts {
        borderAmounts {
          startEdgeSize: 4
          endEdgeSize: 4
        }
        borderAmounts {
          startEdgeSize: 4
          endEdgeSize: 4
        }
      }
    }
  }
  layers {
    name: "Conv_63"
    input: "141"
    output: "output"
    convolution {
      outputChannels: 3
      kernelChannels: 16
      nGroups: 1
      kernelSize: 9
      kernelSize: 9
      stride: 1
      stride: 1
      dilationFactor: 1
      dilationFactor: 1
      valid {
        paddingAmounts {
          borderAmounts {
          }
          borderAmounts {
          }
        }
      }
      hasBias: true
      weights {
      }
      bias {
      }
    }
  }
  arrayInputShapeMapping: EXACT_ARRAY_MAPPING
  imageInputShapeMapping: RANK4_IMAGE_MAPPING
}
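As a sanity check on the spec above, here is a quick trace of one spatial dimension (height or width) through the pad/conv/upsample layers, using standard VALID-convolution arithmetic (`out = (in - k) // s + 1`). This is my own sketch of the layer list, not anything produced by coremltools:

```python
def spatial_size(n):
    """Trace one spatial dimension through the layers listed in the spec."""
    def conv(n, k, s):          # VALID convolution output size
        return (n - k) // s + 1
    n = conv(n + 8, 9, 1)       # Pad_0 (+4/+4)    -> Conv_1  (9x9, stride 1)
    n = conv(n + 2, 3, 2)       # Pad_4 (+1/+1)    -> Conv_5  (3x3, stride 2)
    n = conv(n + 2, 3, 2)       # Pad_8 (+1/+1)    -> Conv_9  (3x3, stride 2)
    # the five residual blocks (Pad +1/+1 -> 3x3 conv, stride 1) preserve n
    n = conv(n * 4 + 2, 3, 2)   # Upsample_52 (x4) -> Pad_53 -> Conv_54
    n = conv(n * 4 + 2, 3, 2)   # Upsample_57 (x4) -> Pad_58 -> Conv_59
    n = conv(n + 8, 9, 1)       # Pad_62 (+4/+4)   -> Conv_63 (9x9, stride 1)
    return n

print(spatial_size(1024))  # 1024: size preserved
print(spatial_size(814))   # 816: output no longer matches the input
```

By this arithmetic the output size only matches the input when the input side is a multiple of 4 (from the two stride-2/upsample-4 stages), which may be related to the `Invalid X-dimension 1/814` error shown above.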