I'm trying to use Vision and CoreML to perform style transfer on a tracked object, as close to real time as possible. I'm using AVKit to capture the video, and AVCaptureVideoDataOutputSampleBufferDelegate to get each frame.
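For reference, the capture side is set up roughly like this (a minimal sketch, not my exact code; the queue label, session preset, and pixel format are assumptions):

import AVFoundation

// Dedicated serial queue for sample buffer delivery, so frame handling
// stays off the main thread.
let captureQueue = DispatchQueue(label: "CaptureOutputQueue")

func setUpCaptureSession(delegate: AVCaptureVideoDataOutputSampleBufferDelegate) -> AVCaptureSession {
    let session = AVCaptureSession()
    session.sessionPreset = .high

    // Front camera input.
    if let camera = AVCaptureDevice.default(.builtInWideAngleCamera, for: .video, position: .front),
       let input = try? AVCaptureDeviceInput(device: camera),
       session.canAddInput(input) {
        session.addInput(input)
    }

    // Video data output delivering BGRA frames to the delegate on captureQueue.
    let output = AVCaptureVideoDataOutput()
    output.videoSettings = [kCVPixelBufferPixelFormatTypeKey as String: kCVPixelFormatType_32BGRA]
    output.alwaysDiscardsLateVideoFrames = true // drop late frames rather than queueing them up
    output.setSampleBufferDelegate(delegate, queue: captureQueue)
    if session.canAddOutput(output) {
        session.addOutput(output)
    }

    return session
}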
At a high level, my pipeline is:
1) Detect faces
2) Update the preview layer to draw the bounding boxes at the correct screen positions
3) Crop the original image to the detected face
4) Run the face image through the CoreML model and get a new image back as output
5) Fill the preview layer bounding boxes with the new images (wherever they are)
I was hoping to place the bounding boxes as soon as they are computed (on the main thread) and then fill them in once inference finishes. However, I'm finding that with the CoreML inference added to the pipeline (on AVCaptureOutputQueue or CoreMLQueue), the bounding boxes don't update their positions until the inference completes. Maybe I'm missing something about how the queues are being handled in the closures. The (hopefully) relevant parts of the code are below.
I'm modifying the code from https://developer.apple.com/documentation/vision/tracking_the_user_s_face_in_real_time.
public func captureOutput(_ output: AVCaptureOutput, didOutput sampleBuffer: CMSampleBuffer,
                          from connection: AVCaptureConnection) {

    // omitting stuff that gets pixelBuffers etc formatted for use with Vision
    // and sets up tracking requests

    // Perform landmark detection on tracked faces
    for trackingRequest in newTrackingRequests {

        let faceLandmarksRequest = VNDetectFaceLandmarksRequest(completionHandler: { (request, error) in
            guard let landmarksRequest = request as? VNDetectFaceLandmarksRequest,
                  let results = landmarksRequest.results as? [VNFaceObservation] else {
                return
            }

            // Perform all UI updates (drawing) on the main queue,
            // not the background queue on which this handler is being called.
            DispatchQueue.main.async {
                self.drawFaceObservations(results) // <<- places bounding box on the preview layer
            }

            CoreMLQueue.async { // Queue for CoreML uses
                // get region of picture to crop for CoreML
                let boundingBox = results[0].boundingBox

                // crop the input frame to the detected object
                let image: CVPixelBuffer = self.cropFrame(pixelBuffer: pixelBuffer, region: boundingBox)

                // infer on region
                let styleImage: CGImage = self.performCoreMLInference(on: image)

                // on the main thread, place styleImage into the bounding box (CAShapeLayer)
                DispatchQueue.main.async {
                    self.boundingBoxOverlayLayer?.contents = styleImage
                }
            }
        })

        do {
            try requestHandler.perform(faceLandmarksRequest)
        } catch let error as NSError {
            NSLog("Failed Request: %@", error)
        }
    }
}
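To make the intended ordering explicit, this is the behavior I'm after, stripped down (the isStyleTransferInFlight flag and the handleLandmarkResults helper are illustrative only, not in my current code): boxes get drawn every frame, and a frame simply skips the style transfer if the previous inference is still running.

// Illustrative sketch only: gate the style transfer so a slow inference just
// skips styling a frame, while landmark results and boxes keep updating.
private var isStyleTransferInFlight = false // read/written only on CoreMLQueue

func handleLandmarkResults(_ results: [VNFaceObservation], pixelBuffer: CVPixelBuffer) {
    // Bounding boxes are drawn immediately, independent of inference.
    DispatchQueue.main.async {
        self.drawFaceObservations(results)
    }

    CoreMLQueue.async {
        // Skip this frame if the previous inference hasn't finished yet.
        guard !self.isStyleTransferInFlight, let face = results.first else { return }
        self.isStyleTransferInFlight = true

        let cropped = self.cropFrame(pixelBuffer: pixelBuffer, region: face.boundingBox)
        let styled = self.performCoreMLInference(on: cropped)

        DispatchQueue.main.async {
            self.boundingBoxOverlayLayer?.contents = styled
        }
        self.isStyleTransferInFlight = false
    }
}

(Assuming CoreMLQueue is a serial queue, the flag doesn't need any extra locking.)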
Aside from the queue/synchronization issue, I think one source of the slowdown may be cropping the pixel buffer down to the region of interest. I'm out of ideas here; any help would be greatly appreciated.
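On the cropping point specifically: if the model is wrapped in a VNCoreMLModel, one option would be to let Vision do the crop and scale via regionOfInterest rather than slicing the CVPixelBuffer by hand. A rough sketch of what I mean (runStyleTransfer and the reused ciContext are placeholders, not my current code):

import Vision
import CoreImage

// Reused context; creating a CIContext per frame is expensive.
let ciContext = CIContext()

// Run the style-transfer model on just the face region, letting Vision
// crop and scale instead of manually cropping the pixel buffer.
func runStyleTransfer(on pixelBuffer: CVPixelBuffer,
                      faceBoundingBox: CGRect, // normalized, lower-left origin (VNFaceObservation.boundingBox)
                      model: VNCoreMLModel,
                      completion: @escaping (CGImage?) -> Void) {
    let request = VNCoreMLRequest(model: model) { request, _ in
        // For models whose output is an image, Vision returns VNPixelBufferObservation.
        guard let observation = request.results?.first as? VNPixelBufferObservation else {
            completion(nil)
            return
        }
        let ciImage = CIImage(cvPixelBuffer: observation.pixelBuffer)
        completion(ciContext.createCGImage(ciImage, from: ciImage.extent))
    }
    // Vision crops to this normalized rect before scaling to the model's input size.
    request.regionOfInterest = faceBoundingBox
    request.imageCropAndScaleOption = .scaleFill

    let handler = VNImageRequestHandler(cvPixelBuffer: pixelBuffer, options: [:])
    try? handler.perform([request])
}

Whether .scaleFill or .centerCrop is the right option depends on the aspect ratio the model expects; either way the crop stays inside Vision's pipeline instead of in my own code.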