
I am creating my request with the following code:

let textRequest = VNDetectTextRectanglesRequest(completionHandler: self.detectTextHandler)
textRequest.reportCharacterBoxes = true
self.requests = [textRequest]

And inside my AVCaptureVideoDataOutputSampleBufferDelegate I am creating a VNImageRequestHandler and performing it:

let imageRequestHandler = VNImageRequestHandler(cvPixelBuffer: pixelBuffer, orientation: CGImagePropertyOrientation(rawValue: 6)!, options: requestOptions) // rawValue 6 == .right

do {
    try imageRequestHandler.perform(self.requests)
} catch {
    print(error)
}

This gives me the results of the detection inside the handler that has the following signature:

func detectTextHandler(request: VNRequest, error: Error?)

My question is, how can I get the "cvPixelBuffer" that this request used for further processing? Am I supposed to store a temporary copy of it?


3 Answers


I cannot find any methods or properties to retrieve CVPixelBuffer from a VNRequest.

So a simple way is to capture it inside the completionHandler closure:

In the method of AVCaptureVideoDataOutputSampleBufferDelegate:

    let pixelBuffer = ...
    let requestOptions: [VNImageOption: Any] = ...

    let textRequest = VNDetectTextRectanglesRequest { request, error in
        //### Capture `pixelBuffer` inside this closure.
        self.detectText(from: pixelBuffer, request: request, error: error)
    }
    textRequest.reportCharacterBoxes = true

    let imageRequestHandler = VNImageRequestHandler(cvPixelBuffer: pixelBuffer, orientation: .right, options: requestOptions) // .right == CGImagePropertyOrientation(rawValue: 6)

    do {
        try imageRequestHandler.perform([textRequest])
    } catch {
        print(error)
    }

And use it as:

func detectText(from buffer: CVPixelBuffer, request: VNRequest, error: Error?) {
    //### Use `buffer` passed from the closure.
    //...
}
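As an illustration of "further processing" (this helper is my own sketch, not part of the code above): Vision reports observation bounding boxes in a normalized 0...1 coordinate space with a bottom-left origin, while pixel-buffer coordinates use a top-left origin, so converting a `VNTextObservation.boundingBox` into a pixel rect for cropping requires scaling and a y-flip. The name `pixelRect(fromNormalized:imageWidth:imageHeight:)` is hypothetical:

```swift
import Foundation

// Convert a Vision-style normalized rect (0...1, bottom-left origin)
// into a pixel rect (top-left origin) for the given image dimensions.
// Hypothetical helper, not a Vision API.
func pixelRect(fromNormalized rect: CGRect, imageWidth: Int, imageHeight: Int) -> CGRect {
    let w = CGFloat(imageWidth)
    let h = CGFloat(imageHeight)
    return CGRect(
        x: rect.origin.x * w,
        y: (1 - rect.origin.y - rect.size.height) * h, // flip the y-axis
        width: rect.size.width * w,
        height: rect.size.height * h
    )
}
```

You would call this inside `detectText(from:request:error:)` with the width and height obtained from `CVPixelBufferGetWidth`/`CVPixelBufferGetHeight` on the captured buffer.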
answered 2017-07-26T02:19:10.430

This is a good question.

I've encountered a similar issue in my app (https://github.com/snakajima/MobileNet-iOS), which needs to keep a reference to the CMSampleBuffer object until the completion handler is called (so that the associated pixelBuffer won't be reused by the video capture session).

I worked around it by storing the buffer as a property of the view controller (self.sampleBuffer). As a result, it can process only one pixelBuffer at a time, which is fine for my app but not optimal.

If you need double buffering (or more), you need to introduce a queue of pixelBuffers, assuming completions arrive in the same order as the requests were performed, which is a reasonable assumption given the underlying architecture.
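The queue itself can be as simple as a locked FIFO: enqueue the buffer before performing the request, dequeue it when the completion handler fires. This is a minimal sketch of that idea; the `BufferQueue` type and its method names are my own, not from any framework:

```swift
import Foundation

// A minimal FIFO that keeps buffers alive until their Vision completion
// handlers fire. Relies on the assumption that completions arrive in the
// same order as the requests were performed.
final class BufferQueue<Buffer> {
    private var buffers: [Buffer] = []
    private let lock = NSLock()

    // Call from the capture delegate, just before performing the request.
    func enqueue(_ buffer: Buffer) {
        lock.lock(); defer { lock.unlock() }
        buffers.append(buffer)
    }

    // Call from the completion handler; returns the buffer that the
    // just-completed request was created with (nil if the queue is empty).
    func dequeue() -> Buffer? {
        lock.lock(); defer { lock.unlock() }
        return buffers.isEmpty ? nil : buffers.removeFirst()
    }
}
```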

answered 2017-07-26T23:53:19.447

https://github.com/maxvol/RxVision makes it easy without recreating the request for every image (as in the accepted answer).

let mlRequest: RxVNCoreMLRequest<CGImage> = VNCoreMLRequest.rx.request(model: model, imageCropAndScaleOption: .scaleFit)

mlRequest
    .observable
    .subscribe { [unowned self] (event) in
        switch event {
        case .next(let completion):
            let cgImage = completion.value // NB you can easily pass the value along to the completion handler
            if let result = completion.request.results?[0] as? VNClassificationObservation {
                os_log("results: %@", type: .debug, result.identifier)
            }
        default:
            break
        }
    }
    .disposed(by: disposeBag)

let imageRequestHandler = VNImageRequestHandler(cgImage: cgImage, orientation: .up, options: requestOptions)
do {
    try imageRequestHandler.rx.perform([mlRequest], with: cgImage) // NB you can easily pass the value along to the completion handler
} catch {
    print(error)
}
answered 2018-10-02T04:15:22.963