How can I use the new Vision framework in iOS 11 to track eyes in a video while the head or the camera is moving? (I'm using the front camera.)
I've found that VNDetectFaceLandmarksRequest is very slow on my iPad - a landmarks request completes only about once every 1-2 seconds. I feel like I'm doing something wrong, but there isn't much documentation on Apple's site.
I've already watched the WWDC 2017 session on Vision:
https://developer.apple.com/videos/play/wwdc2017/506/
and read this guide:
https://github.com/jeffreybergier/Blog-Getting-Started-with-Vision
My code currently looks roughly like this (sorry, it's Objective-C):
// Capture session setup
- (BOOL)setUpCaptureSession {
    AVCaptureDevice *captureDevice =
        [AVCaptureDevice defaultDeviceWithDeviceType:AVCaptureDeviceTypeBuiltInWideAngleCamera
                                           mediaType:AVMediaTypeVideo
                                            position:AVCaptureDevicePositionFront];
    NSError *error;
    AVCaptureDeviceInput *captureInput =
        [AVCaptureDeviceInput deviceInputWithDevice:captureDevice error:&error];
    if (captureInput == nil) { // per Cocoa convention, check the return value, not the error
        NSLog(@"Failed to initialize video input: %@", error);
        return NO;
    }

    // Deliver sample buffers on a dedicated serial queue and drop frames
    // that arrive while the delegate is still busy.
    self.captureOutputQueue = dispatch_queue_create("CaptureOutputQueue",
                                                    DISPATCH_QUEUE_SERIAL);
    AVCaptureVideoDataOutput *captureOutput = [[AVCaptureVideoDataOutput alloc] init];
    captureOutput.alwaysDiscardsLateVideoFrames = YES;
    [captureOutput setSampleBufferDelegate:self queue:self.captureOutputQueue];

    self.captureSession = [[AVCaptureSession alloc] init];
    self.captureSession.sessionPreset = AVCaptureSessionPreset1280x720;
    [self.captureSession addInput:captureInput];
    [self.captureSession addOutput:captureOutput];
    return YES;
}
// Capture output delegate:
- (void)captureOutput:(AVCaptureOutput *)output
didOutputSampleBuffer:(CMSampleBufferRef)sampleBuffer
       fromConnection:(AVCaptureConnection *)connection {
    if (!self.detectionStarted) {
        return;
    }

    CVPixelBufferRef pixelBuffer = CMSampleBufferGetImageBuffer(sampleBuffer);
    if (pixelBuffer == NULL) {
        return;
    }

    // Forward the camera intrinsics to Vision when the attachment is present.
    NSMutableDictionary<VNImageOption, id> *requestOptions = [NSMutableDictionary dictionary];
    CFTypeRef cameraIntrinsicData = CMGetAttachment(sampleBuffer,
                                                    kCMSampleBufferAttachmentKey_CameraIntrinsicMatrix,
                                                    nil);
    if (cameraIntrinsicData != NULL) {
        requestOptions[VNImageOptionCameraIntrinsics] = (__bridge id)cameraIntrinsicData;
    }

    // TODO: Derive this from the actual device orientation instead of hard-coding it.
    static const CGImagePropertyOrientation orientation = kCGImagePropertyOrientationRight;

    VNDetectFaceLandmarksRequest *landmarksRequest =
        [[VNDetectFaceLandmarksRequest alloc] initWithCompletionHandler:^(VNRequest *request, NSError *error) {
            if (error != nil) {
                NSLog(@"Error while detecting face landmarks: %@", error);
            } else {
                dispatch_async(dispatch_get_main_queue(), ^{
                    // Draw eyes in two corresponding CAShapeLayers
                });
            }
        }];

    VNImageRequestHandler *requestHandler =
        [[VNImageRequestHandler alloc] initWithCVPixelBuffer:pixelBuffer
                                                 orientation:orientation
                                                     options:requestOptions];
    NSError *error;
    if (![requestHandler performRequests:@[landmarksRequest] error:&error]) {
        NSLog(@"Error performing landmarks request: %@", error);
    }
}
Is it correct to call -performRequests:error: on the same queue as the video output? From my experiments, this method seems to call the request's completion handler synchronously. Should I avoid calling it on every frame?
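For instance, I considered simply dropping frames while a request is still in flight, roughly like the sketch below. This is just my own idea, not anything from the Vision docs: `visionQueue` and `visionSemaphore` are my own properties (the semaphore created once with dispatch_semaphore_create(1)), and `performLandmarksRequestOnPixelBuffer:` stands in for the Vision code above.

- (void)captureOutput:(AVCaptureOutput *)output
didOutputSampleBuffer:(CMSampleBufferRef)sampleBuffer
       fromConnection:(AVCaptureConnection *)connection {
    CVPixelBufferRef pixelBuffer = CMSampleBufferGetImageBuffer(sampleBuffer);
    if (pixelBuffer == NULL) {
        return;
    }
    // The semaphore acts as a "busy" flag: if the previous request has not
    // finished yet, the zero-timeout wait fails and this frame is skipped.
    if (dispatch_semaphore_wait(self.visionSemaphore, DISPATCH_TIME_NOW) != 0) {
        return;
    }
    CFRetain(pixelBuffer); // keep the buffer alive while Vision works on it
    dispatch_async(self.visionQueue, ^{
        [self performLandmarksRequestOnPixelBuffer:pixelBuffer]; // my own helper
        CFRelease(pixelBuffer);
        dispatch_semaphore_signal(self.visionSemaphore);
    });
}

Would that be the right way to throttle Vision, or is there a built-in mechanism I'm missing?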
To speed things up, I also tried VNTrackObjectRequest to track each eye separately after the landmarks had been detected on a frame (by building a bounding box from the landmark region's points), but that didn't work well (I'm still trying to figure out why).
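My tracking attempt looked roughly like the sketch below (left eye only, for brevity). Note that `sequenceHandler` and `leftEyeTrackingRequest` are my own properties, and `leftEyeBoundingBox` comes from my own code that converts the leftEye landmark region's points into a normalized rect:

// Seed the tracker once, after a successful landmarks request:
- (void)startTrackingLeftEyeInRect:(CGRect)leftEyeBoundingBox {
    VNDetectedObjectObservation *seed =
        [VNDetectedObjectObservation observationWithBoundingBox:leftEyeBoundingBox];
    self.leftEyeTrackingRequest =
        [[VNTrackObjectRequest alloc] initWithDetectedObjectObservation:seed];
    self.leftEyeTrackingRequest.trackingLevel = VNRequestTrackingLevelAccurate;
    self.sequenceHandler = [[VNSequenceRequestHandler alloc] init]; // reused across frames
}

// Then, on each subsequent frame:
- (void)trackLeftEyeInPixelBuffer:(CVPixelBufferRef)pixelBuffer {
    NSError *error;
    if (![self.sequenceHandler performRequests:@[self.leftEyeTrackingRequest]
                               onCVPixelBuffer:pixelBuffer
                                         error:&error]) {
        NSLog(@"Eye tracking failed: %@", error);
        return;
    }
    // Feed the newest observation back into the request so tracking
    // continues from the updated position on the next frame.
    VNDetectedObjectObservation *latest =
        (VNDetectedObjectObservation *)self.leftEyeTrackingRequest.results.firstObject;
    if (latest != nil) {
        self.leftEyeTrackingRequest.inputObservation = latest;
    }
}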
What is the best strategy for tracking eyes in video? Should I track a face rectangle first and then run the landmarks request only inside its region (would that be faster)?
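In case it clarifies the question, the region-constrained approach I have in mind would look roughly like this (a sketch only; it relies on VNDetectFaceLandmarksRequest accepting inputFaceObservations, and the method name is my own):

- (void)detectLandmarksInPixelBuffer:(CVPixelBufferRef)pixelBuffer {
    // First find the face rectangle for the current frame.
    VNDetectFaceRectanglesRequest *faceRequest = [[VNDetectFaceRectanglesRequest alloc] init];
    VNImageRequestHandler *handler =
        [[VNImageRequestHandler alloc] initWithCVPixelBuffer:pixelBuffer
                                                 orientation:kCGImagePropertyOrientationRight
                                                     options:@{}];
    NSError *error;
    if (![handler performRequests:@[faceRequest] error:&error]) {
        NSLog(@"Face detection failed: %@", error);
        return;
    }
    VNFaceObservation *face = (VNFaceObservation *)faceRequest.results.firstObject;
    if (face == nil) {
        return;
    }
    // Constrain the landmarks request to the detected face rectangle so
    // Vision does not have to search the whole frame again.
    VNDetectFaceLandmarksRequest *landmarksRequest = [[VNDetectFaceLandmarksRequest alloc] init];
    landmarksRequest.inputFaceObservations = @[face];
    if ([handler performRequests:@[landmarksRequest] error:&error]) {
        VNFaceObservation *result = (VNFaceObservation *)landmarksRequest.results.firstObject;
        // result.landmarks.leftEye / result.landmarks.rightEye hold the eye region points.
    }
}

Is this the intended usage, or is it better to keep a VNTrackObjectRequest on the face rectangle between detections?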