How can I use the new Vision framework in iOS 11 to track eyes in a video while the head or the camera is moving? (I'm using the front camera.)
I've found that VNDetectFaceLandmarksRequest is very slow on my iPad - a landmarks request completes only about once every 1-2 seconds. I feel like I'm doing something wrong, but there isn't much documentation on Apple's site.
I've already watched the WWDC 2017 session on Vision:
https://developer.apple.com/videos/play/wwdc2017/506/
and read this guide:
https://github.com/jeffreybergier/Blog-Getting-Started-with-Vision
My code currently looks roughly like this (sorry, it's Objective-C):
// Capture session setup
- (BOOL)setUpCaptureSession {
    AVCaptureDevice *captureDevice =
        [AVCaptureDevice defaultDeviceWithDeviceType:AVCaptureDeviceTypeBuiltInWideAngleCamera
                                           mediaType:AVMediaTypeVideo
                                            position:AVCaptureDevicePositionFront];
    NSError *error;
    AVCaptureDeviceInput *captureInput =
        [AVCaptureDeviceInput deviceInputWithDevice:captureDevice error:&error];
    if (captureInput == nil) { // per Cocoa convention, check the return value, not the error
        NSLog(@"Failed to initialize video input: %@", error);
        return NO;
    }

    // Deliver sample buffers on a dedicated serial queue and drop frames
    // that arrive while the delegate is still busy.
    self.captureOutputQueue = dispatch_queue_create("CaptureOutputQueue",
                                                    DISPATCH_QUEUE_SERIAL);
    AVCaptureVideoDataOutput *captureOutput = [[AVCaptureVideoDataOutput alloc] init];
    captureOutput.alwaysDiscardsLateVideoFrames = YES;
    [captureOutput setSampleBufferDelegate:self queue:self.captureOutputQueue];

    self.captureSession = [[AVCaptureSession alloc] init];
    self.captureSession.sessionPreset = AVCaptureSessionPreset1280x720;
    [self.captureSession addInput:captureInput];
    [self.captureSession addOutput:captureOutput];
    return YES;
}
// Capture output delegate:
- (void)captureOutput:(AVCaptureOutput *)output
didOutputSampleBuffer:(CMSampleBufferRef)sampleBuffer
       fromConnection:(AVCaptureConnection *)connection {
    if (!self.detectionStarted) {
        return;
    }

    CVPixelBufferRef pixelBuffer = CMSampleBufferGetImageBuffer(sampleBuffer);
    if (pixelBuffer == NULL) {
        return;
    }

    // Forward the camera intrinsics to Vision when the attachment is present.
    NSMutableDictionary<VNImageOption, id> *requestOptions = [NSMutableDictionary dictionary];
    CFTypeRef cameraIntrinsicData = CMGetAttachment(sampleBuffer,
                                                    kCMSampleBufferAttachmentKey_CameraIntrinsicMatrix,
                                                    nil);
    if (cameraIntrinsicData != NULL) {
        requestOptions[VNImageOptionCameraIntrinsics] = (__bridge id)cameraIntrinsicData;
    }

    // TODO: Derive this from the actual device orientation instead of hard-coding it.
    static const CGImagePropertyOrientation orientation = kCGImagePropertyOrientationRight;

    VNDetectFaceLandmarksRequest *landmarksRequest =
        [[VNDetectFaceLandmarksRequest alloc] initWithCompletionHandler:^(VNRequest *request, NSError *error) {
            if (error != nil) {
                NSLog(@"Error while detecting face landmarks: %@", error);
            } else {
                dispatch_async(dispatch_get_main_queue(), ^{
                    // Draw eyes in two corresponding CAShapeLayers
                });
            }
        }];

    VNImageRequestHandler *requestHandler =
        [[VNImageRequestHandler alloc] initWithCVPixelBuffer:pixelBuffer
                                                 orientation:orientation
                                                     options:requestOptions];
    NSError *error;
    if (![requestHandler performRequests:@[landmarksRequest] error:&error]) {
        NSLog(@"Error performing landmarks request: %@", error);
    }
}
Is it correct to call -performRequests:error: on the same queue as the video output? From my experiments, this method seems to call the request's completion handler synchronously. Should I avoid calling it on every frame?
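For instance, I considered simply dropping frames while a request is still in flight, roughly like the sketch below. This is just my own idea, not anything from the Vision docs: `visionQueue` and `visionSemaphore` are my own properties (the semaphore created once with dispatch_semaphore_create(1)), and `performLandmarksRequestOnPixelBuffer:` stands in for the Vision code above.

- (void)captureOutput:(AVCaptureOutput *)output
didOutputSampleBuffer:(CMSampleBufferRef)sampleBuffer
       fromConnection:(AVCaptureConnection *)connection {
    CVPixelBufferRef pixelBuffer = CMSampleBufferGetImageBuffer(sampleBuffer);
    if (pixelBuffer == NULL) {
        return;
    }
    // The semaphore acts as a "busy" flag: if the previous request has not
    // finished yet, the zero-timeout wait fails and this frame is skipped.
    if (dispatch_semaphore_wait(self.visionSemaphore, DISPATCH_TIME_NOW) != 0) {
        return;
    }
    CFRetain(pixelBuffer); // keep the buffer alive while Vision works on it
    dispatch_async(self.visionQueue, ^{
        [self performLandmarksRequestOnPixelBuffer:pixelBuffer]; // my own helper
        CFRelease(pixelBuffer);
        dispatch_semaphore_signal(self.visionSemaphore);
    });
}

Would that be the right way to throttle Vision, or is there a built-in mechanism I'm missing?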
To speed things up, I also tried VNTrackObjectRequest to track each eye separately after the landmarks had been detected on a frame (by building a bounding box from the landmark region's points), but that didn't work well (I'm still trying to figure out why).
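My tracking attempt looked roughly like the sketch below (left eye only, for brevity). Note that `sequenceHandler` and `leftEyeTrackingRequest` are my own properties, and `leftEyeBoundingBox` comes from my own code that converts the leftEye landmark region's points into a normalized rect:

// Seed the tracker once, after a successful landmarks request:
- (void)startTrackingLeftEyeInRect:(CGRect)leftEyeBoundingBox {
    VNDetectedObjectObservation *seed =
        [VNDetectedObjectObservation observationWithBoundingBox:leftEyeBoundingBox];
    self.leftEyeTrackingRequest =
        [[VNTrackObjectRequest alloc] initWithDetectedObjectObservation:seed];
    self.leftEyeTrackingRequest.trackingLevel = VNRequestTrackingLevelAccurate;
    self.sequenceHandler = [[VNSequenceRequestHandler alloc] init]; // reused across frames
}

// Then, on each subsequent frame:
- (void)trackLeftEyeInPixelBuffer:(CVPixelBufferRef)pixelBuffer {
    NSError *error;
    if (![self.sequenceHandler performRequests:@[self.leftEyeTrackingRequest]
                               onCVPixelBuffer:pixelBuffer
                                         error:&error]) {
        NSLog(@"Eye tracking failed: %@", error);
        return;
    }
    // Feed the newest observation back into the request so tracking
    // continues from the updated position on the next frame.
    VNDetectedObjectObservation *latest =
        (VNDetectedObjectObservation *)self.leftEyeTrackingRequest.results.firstObject;
    if (latest != nil) {
        self.leftEyeTrackingRequest.inputObservation = latest;
    }
}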
What is the best strategy for tracking eyes in video? Should I track a face rectangle first and then run the landmarks request only inside its region (would that be faster)?
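In case it clarifies the question, the region-constrained approach I have in mind would look roughly like this (a sketch only; it relies on VNDetectFaceLandmarksRequest accepting inputFaceObservations, and the method name is my own):

- (void)detectLandmarksInPixelBuffer:(CVPixelBufferRef)pixelBuffer {
    // First find the face rectangle for the current frame.
    VNDetectFaceRectanglesRequest *faceRequest = [[VNDetectFaceRectanglesRequest alloc] init];
    VNImageRequestHandler *handler =
        [[VNImageRequestHandler alloc] initWithCVPixelBuffer:pixelBuffer
                                                 orientation:kCGImagePropertyOrientationRight
                                                     options:@{}];
    NSError *error;
    if (![handler performRequests:@[faceRequest] error:&error]) {
        NSLog(@"Face detection failed: %@", error);
        return;
    }
    VNFaceObservation *face = (VNFaceObservation *)faceRequest.results.firstObject;
    if (face == nil) {
        return;
    }
    // Constrain the landmarks request to the detected face rectangle so
    // Vision does not have to search the whole frame again.
    VNDetectFaceLandmarksRequest *landmarksRequest = [[VNDetectFaceLandmarksRequest alloc] init];
    landmarksRequest.inputFaceObservations = @[face];
    if ([handler performRequests:@[landmarksRequest] error:&error]) {
        VNFaceObservation *result = (VNFaceObservation *)landmarksRequest.results.firstObject;
        // result.landmarks.leftEye / result.landmarks.rightEye hold the eye region points.
    }
}

Is this the intended usage, or is it better to keep a VNTrackObjectRequest on the face rectangle between detections?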