macos - H264 decoding using Apple Video Toolbox

Question

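I am trying to get an H264 streaming app running on various platforms using a combination of Apple Video Toolbox and OpenH264. There is one use case that doesn't work, and I can't find any solution: the source uses Video Toolbox on a 2011 iMac running macOS High Sierra, and the receiver is a MacBook Pro running Big Sur.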

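On the receiver, the decoded image is about 3/4 green. If I scale the image down to about 1/8 of the original before encoding, it works fine. If I capture the frames on the MacBook and then run exactly the same decoding software in a test program on the iMac, it decodes fine. Doing the same on the MacBook (same images, same test program) again gives 3/4 green. I have a similar problem when receiving from an OpenH264 encoder on a slower Windows machine. I suspect this is related to timing, but I don't understand H264 well enough to work it out. One thing I did notice is that about 70% of the time the decode call returns without an error code but with a NULL pixel buffer.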

The "guts" of the decoding part look like this (adapted from a demo on GitHub):

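// Decoder output callback: hands the decoded frame back to decode() through sourceFrameRefCon.
void didDecompress(void *decompressionOutputRefCon, void *sourceFrameRefCon, OSStatus status, VTDecodeInfoFlags infoFlags, CVImageBufferRef pixelBuffer, CMTime presentationTimeStamp, CMTime presentationDuration )
{
    CVPixelBufferRef *outputPixelBuffer = (CVPixelBufferRef *)sourceFrameRefCon;
    // pixelBuffer is NULL when no image was produced for this sample; CVPixelBufferRetain is NULL-safe.
    *outputPixelBuffer = CVPixelBufferRetain(pixelBuffer);
}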

 void initVideoDecodeToolBox ()
    {
        if (!decodeSession)
        {
            const uint8_t* parameterSetPointers[2] = { mSPS, mPPS };
            const size_t parameterSetSizes[2] = { mSPSSize, mPPSSize };
            OSStatus status = CMVideoFormatDescriptionCreateFromH264ParameterSets(kCFAllocatorDefault,2, //param count
                                                                                  parameterSetPointers,
                                                                                  parameterSetSizes,
                                                                                  4, // size in bytes of the NAL length field
                                                                                  &formatDescription);
            if(status == noErr)
            {
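                // Ask for BGRA output buffers.
                uint32_t v = kCVPixelFormatType_32BGRA;
                CFNumberRef pixelFormat = CFNumberCreate(NULL, kCFNumberSInt32Type, &v);
                const void *keys[] = { kCVPixelBufferPixelFormatTypeKey };
                const void *values[] = { pixelFormat };
                CFDictionaryRef attrs = CFDictionaryCreate(NULL, keys, values, 1, &kCFTypeDictionaryKeyCallBacks, &kCFTypeDictionaryValueCallBacks);
                CFRelease(pixelFormat); // the dictionary now holds its own reference
                VTDecompressionOutputCallbackRecord callBackRecord;
                callBackRecord.decompressionOutputCallback = didDecompress;
                callBackRecord.decompressionOutputRefCon = NULL;
                status = VTDecompressionSessionCreate(kCFAllocatorDefault, formatDescription, NULL, attrs, &callBackRecord, &decodeSession);
                CFRelease(attrs);
                // RealTime is a session property, not a pixel buffer attribute, so set it on the session.
                if (status == noErr)
                    VTSessionSetProperty(decodeSession, kVTDecompressionPropertyKey_RealTime, kCFBooleanTrue);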
            }
            else
            {
                NSLog(@"IOS8VT: reset decoder session failed status=%d", status);
            }
        }
    }

CVPixelBufferRef decode ( const char *NALBuffer, size_t NALSize )
    {
        CVPixelBufferRef outputPixelBuffer = NULL;
        if (decodeSession && formatDescription )
        {
            // The NAL buffer has been stripped of the NAL length data, so this has to be put back in
            MemoryBlock buf ( NALSize + 4);
            memcpy ( (char*)buf.getData()+4, NALBuffer, NALSize );
            *((uint32*)buf.getData()) = CFSwapInt32HostToBig ((uint32)NALSize);
            
            CMBlockBufferRef blockBuffer = NULL;
            OSStatus status  = CMBlockBufferCreateWithMemoryBlock(kCFAllocatorDefault, buf.getData(), NALSize+4,kCFAllocatorNull,NULL, 0, NALSize+4, 0, &blockBuffer);
            
            if(status == kCMBlockBufferNoErr)
            {
                CMSampleBufferRef sampleBuffer = NULL;
                const size_t sampleSizeArray[] = {NALSize + 4};
                status = CMSampleBufferCreateReady(kCFAllocatorDefault,blockBuffer,formatDescription,1, 0, NULL, 1, sampleSizeArray,&sampleBuffer);
                
                if (status == noErr && sampleBuffer)
                {
                    VTDecodeFrameFlags flags = 0;VTDecodeInfoFlags flagOut = 0;
                    
                    // With flags == 0 the decode is synchronous:
                    // didDecompress is called before VTDecompressionSessionDecodeFrame returns.
                    OSStatus decodeStatus = VTDecompressionSessionDecodeFrame ( decodeSession, sampleBuffer, flags, &outputPixelBuffer, &flagOut );

                    if(decodeStatus != noErr)
                    {
                        DBG ( "decode failed status=" + String ( decodeStatus) );
                    }
                    CFRelease(sampleBuffer);
                }
                CFRelease(blockBuffer);
            }
        }
        return outputPixelBuffer;
    }

Note: the NAL blocks do not have the 00 00 00 01 separator, because they are streamed in blocks with an explicit length field.
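For illustration, the length-prefixed layout described in this note can be sketched in plain C++ as below. The helper name `appendLengthPrefixedNAL` is hypothetical and not part of the original code; the 4-byte big-endian prefix is chosen to match the length size of 4 passed to `CMVideoFormatDescriptionCreateFromH264ParameterSets` above.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: append one NAL unit to `out`, preceded by its size
// written as a 4-byte big-endian integer (no 00 00 00 01 start code).
void appendLengthPrefixedNAL(std::vector<uint8_t>& out, const uint8_t* nal, size_t nalSize)
{
    const uint32_t n = static_cast<uint32_t>(nalSize);
    out.push_back(static_cast<uint8_t>(n >> 24));
    out.push_back(static_cast<uint8_t>(n >> 16));
    out.push_back(static_cast<uint8_t>(n >> 8));
    out.push_back(static_cast<uint8_t>(n));
    out.insert(out.end(), nal, nal + nalSize);
}
```

Calling the helper repeatedly on the same vector places several NAL units back to back, each with its own length field.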

Decoding works fine on all platforms, and the encoded stream decodes correctly with OpenH264.

score 1 · Accepted Answer

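Well, I finally found the answer, so I'll leave it here for posterity. It turns out that the Video Toolbox decode function expects all the NAL blocks that belong to the same frame to be copied into a single SampleBuffer. The older Mac hands the app single keyframes that are split into separate NAL blocks, and the app then sends these blocks individually over the network. Unfortunately, this means that only the first NAL block gets processed, covering perhaps less than a quarter of the picture, and the rest are discarded. What you need to do is work out which NALs are part of the same frame and bundle them together. Unfortunately, this requires you to partially parse the PPS and the frames themselves, which is not trivial. Many thanks to the post on the Apple site that put me on the right track.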
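A minimal sketch of the "work out which NALs are part of the same frame" step, in plain C++. It is not the code from the answer. It uses a common simplification: a coded slice (NAL type 1 or 5) whose `first_mb_in_slice` field is 0 is treated as the start of a new picture. That assumes no arbitrary slice ordering or redundant slices, and it ignores emulation-prevention bytes, which should not appear this early in the slice header at ordinary resolutions. The full rules in the H.264 specification compare more fields. All function names here are hypothetical.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

using NAL = std::vector<uint8_t>; // one NAL unit, without start code or length prefix

// Read one unsigned Exp-Golomb value, ue(v), starting at bit `bitPos`.
// Returns false if the data runs out.
static bool readUE(const uint8_t* data, size_t size, size_t& bitPos, uint32_t& value)
{
    auto readBit = [&](uint32_t& bit) {
        if (bitPos >= size * 8) return false;
        bit = (data[bitPos / 8] >> (7 - bitPos % 8)) & 1u;
        ++bitPos;
        return true;
    };
    int leadingZeros = 0;
    uint32_t bit = 0;
    while (true) {
        if (!readBit(bit)) return false;
        if (bit) break;
        if (++leadingZeros > 31) return false;
    }
    uint32_t suffix = 0;
    for (int i = 0; i < leadingZeros; ++i) {
        if (!readBit(bit)) return false;
        suffix = (suffix << 1) | bit;
    }
    value = ((1u << leadingZeros) - 1u) + suffix;
    return true;
}

// True for coded-slice NAL units (type 1 = non-IDR slice, type 5 = IDR slice).
static bool isSlice(const NAL& nal)
{
    if (nal.empty()) return false;
    const int type = nal[0] & 0x1F;
    return type == 1 || type == 5;
}

// Simplified test: first_mb_in_slice is the first ue(v) after the 1-byte NAL header.
static bool startsNewPicture(const NAL& nal)
{
    if (!isSlice(nal) || nal.size() < 2) return false;
    size_t bitPos = 0;
    uint32_t firstMb = 0;
    return readUE(nal.data() + 1, nal.size() - 1, bitPos, firstMb) && firstMb == 0;
}

// Group NAL units so that each group holds all the slices of one picture.
std::vector<std::vector<NAL>> groupIntoFrames(const std::vector<NAL>& nals)
{
    std::vector<std::vector<NAL>> frames;
    bool currentHasSlice = false;
    for (const NAL& nal : nals) {
        const bool slice = isSlice(nal);
        // Once the current group holds slice data, a non-slice NAL or the
        // first slice of a picture opens the next group.
        const bool boundary = frames.empty() || (currentHasSlice && (!slice || startsNewPicture(nal)));
        if (boundary) { frames.emplace_back(); currentHasSlice = false; }
        frames.back().push_back(nal);
        currentHasSlice = currentHasSlice || slice;
    }
    return frames;
}
```

Each group returned by `groupIntoFrames` would then be written into one buffer, every NAL preceded by its own 4-byte length, and passed to the decoder as a single SampleBuffer rather than one SampleBuffer per NAL. In the question's setup the SPS and PPS go into the format description, so they would be filtered out before grouping.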
