swift - Merging images using the "VNImageHomographicAlignmentObservation" class

Question

I am trying to merge two images using VNImageHomographicAlignmentObservation. I currently get a 3x3 matrix that looks like this:

simd_float3x3([[0.99229, -0.00451023, -4.32607e-07],
               [0.00431724, 0.993118, 2.38839e-07],
               [-72.2425, -67.9966, 0.999288]])

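But I don't know how to use these values to merge the two images into one. There doesn't seem to be any documentation on what these values mean. I found some information on transformation matrices here: Working with matrices.

But so far nothing else has helped me... Any suggestions?

My code: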

func setup() {

    let floatingImage = UIImage(named:"DJI_0333")!
    let referenceImage = UIImage(named: "DJI_0327")!

    let request = VNHomographicImageRegistrationRequest(targetedCGImage: floatingImage.cgImage!, options: [:])

    let handler = VNSequenceRequestHandler()
    try! handler.perform([request], on: referenceImage.cgImage!)

    if let results = request.results as? [VNImageHomographicAlignmentObservation] {
        print("Perspective warp found: \(results.count)")
        results.forEach { observation in
            // A matrix with 3 rows and 3 columns.
            let matrix = observation.warpTransform
            print(matrix)
        }
    }
}

score 5 · Accepted Answer

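This homography matrix H describes how to project one of your images onto the image plane of the other. To transform each pixel to its projected location, compute x' = H * x using homogeneous coordinates (basically, take your 2D image coordinate, add 1.0 as a third component, apply the matrix H, and go back to 2D by dividing by the third component of the result).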
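To make the meaning of the nine values more concrete, here is a minimal sketch that applies the matrix from the question to a single point by hand. It is written without simd so the arithmetic is visible; the `project` function and the `columns` array are illustrative names, not part of Vision or simd. It assumes the printed matrix is read column by column, which is how simd prints a `simd_float3x3`.

```swift
// Columns of the matrix printed in the question (simd matrices are column-major).
let columns: [[Double]] = [
    [0.99229, -0.00451023, -4.32607e-07],
    [0.00431724, 0.993118, 2.38839e-07],
    [-72.2425, -67.9966, 0.999288],
]

/// Hypothetical helper: projects a 2D point with a 3x3 homography given as three columns.
func project(x: Double, y: Double, columns c: [[Double]]) -> (x: Double, y: Double) {
    // H * (x, y, 1): each output component is a weighted sum of the columns.
    let hx = c[0][0] * x + c[1][0] * y + c[2][0]
    let hy = c[0][1] * x + c[1][1] * y + c[2][1]
    let hw = c[0][2] * x + c[1][2] * y + c[2][2]
    // Divide by the third component to return to ordinary 2D coordinates.
    return (hx / hw, hy / hw)
}

let origin = project(x: 0, y: 0, columns: columns)
print(origin)
```

For this particular matrix the first two columns are close to the identity, so the origin lands at roughly (-72.3, -68.0): the third column acts mostly as a translation in pixels, while the tiny third entries of the first two columns are the perspective terms.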

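The most efficient way to do this for every pixel is to write this matrix multiplication in homogeneous space using CoreImage. CoreImage offers several shader kernel types: CIColorKernel, CIWarpKernel and CIKernel. For this task we only want to transform the position of each pixel, so a CIWarpKernel is what you need. Using the Core Image Shading Language, that looks as follows: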

import CoreImage
let warpKernel = CIWarpKernel(source:
    """
    kernel vec2 warp(mat3 homography)
    {
        vec3 homogen_in = vec3(destCoord().x, destCoord().y, 1.0); // create homogeneous coord
        vec3 homogen_out = homography * homogen_in; // transform by homography
        return homogen_out.xy / homogen_out.z; // back to normal 2D coordinate
    }
    """
)! // CIWarpKernel(source:) is failable; force-unwrapped here for brevity

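Note that the shader expects a mat3 called homography, which is the shading-language equivalent of the simd_float3x3 matrix H. When calling the shader, the matrix is expected to be stored in a CIVector. To convert it, use: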

let (col0, col1, col2) = yourHomography.columns
let homographyCIVector = CIVector(values:[CGFloat(col0.x), CGFloat(col0.y), CGFloat(col0.z),
                                             CGFloat(col1.x), CGFloat(col1.y), CGFloat(col1.z),
                                             CGFloat(col2.x), CGFloat(col2.y), CGFloat(col2.z)], count: 9)

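When you apply the CIWarpKernel to an image, you have to tell CoreImage how big the output should be. To merge the warped image and the reference image, the output should be big enough to cover both the whole projected image and the original one. We can compute the size of the projected image by applying the homography to each corner of the image rectangle (this time in Swift; CoreImage calls this rectangle the extent):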

/**
 * Convert a 2D point to a homogeneous coordinate, transform by the provided homography,
 * and convert back to a non-homogeneous 2D point.
 */
func transform(_ point:CGPoint, by homography:matrix_float3x3) -> CGPoint
{
  let inputPoint = SIMD3<Float>(Float(point.x), Float(point.y), 1.0)
  var outputPoint = homography * inputPoint
  outputPoint /= outputPoint.z
  return CGPoint(x:CGFloat(outputPoint.x), y:CGFloat(outputPoint.y))
}

func computeExtentAfterTransforming(_ extent:CGRect, with homography:matrix_float3x3) -> CGRect
{
  let points = [transform(extent.origin, by: homography),
                transform(CGPoint(x: extent.origin.x + extent.width, y:extent.origin.y), by: homography),
                transform(CGPoint(x: extent.origin.x + extent.width, y:extent.origin.y + extent.height), by: homography),
                transform(CGPoint(x: extent.origin.x, y:extent.origin.y + extent.height), by: homography)]

  var (xmin, xmax, ymin, ymax) = (points[0].x, points[0].x, points[0].y, points[0].y)
  points.forEach { p in
    xmin = min(xmin, p.x)
    xmax = max(xmax, p.x)
    ymin = min(ymin, p.y)
    ymax = max(ymax, p.y)
  }
  let result = CGRect(x: xmin, y:ymin, width: xmax-xmin, height: ymax-ymin)
  return result
}

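// `homography` is the simd_float3x3 from observation.warpTransform, and `ciFloatingImage`
// is the CIImage of the floating image (created in the next snippet).
// The kernel maps each output coordinate to an input coordinate with the homography,
// so the warped image covers the input extent transformed by the inverse.
let warpedExtent = computeExtentAfterTransforming(ciFloatingImage.extent, with: homography.inverse)
let outputExtent = warpedExtent.union(ciFloatingImage.extent)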

Now you can create a warped version of your floating image:

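// `homography` is the simd_float3x3 from observation.warpTransform.
let ciFloatingImage = CIImage(image: floatingImage)! // CIImage(image:) is failable
let ciWarpedImage = warpKernel.apply(extent: outputExtent, roiCallback:
    {
        (index, rect) in
        // The kernel reads the input at homography * destCoord(), so the input region
        // needed for an output rect is that rect transformed by the homography itself.
        return computeExtentAfterTransforming(rect, with: homography)
    },
    image: ciFloatingImage,
    arguments: [homographyCIVector])!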

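The roiCallback tells CoreImage which part of the input image is needed to compute a given part of the output. CoreImage uses this to apply the shader to the image block by block, so that it can process huge images. (See Creating Custom Filters in Apple's documentation.) A quick hack is to always return CGRect.infinite here, but then CoreImage can't do any block-wise magic.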

Lastly, create a composite image of the reference image and the warped image:

let ciReferenceImage = CIImage(image: referenceImage)! // CIImage(image:) is failable
let ciResultImage = ciWarpedImage.composited(over: ciReferenceImage)
let resultImage = UIImage(ciImage: ciResultImage)
