c++ - Converting RGB8 to NV12 with libav/ffmpeg

Question

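I am trying to convert an input RGB8 image to NV12 using libav, but sws_scale raises a read access violation. I must have the planes or the strides wrong, but I can't see why.

At this point I believe I would benefit from a fresh pair of eyes. What am I missing?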


void convertRGB2NV12(unsigned char *rgb_in, width, height) {
 struct SwsContext* sws_context = nullptr;
 const int in_linesize[1] = {3 * width}; // RGB stride
 int out_linesize[2] = {width, width}; // NV12 stride

 // NV12 data is separated in two
 // planes, one for the intensity (Y) and another one for
 // the colours(UV) interleaved, both with
 // the same width as the frame but the UV plane with
 // half of its height.
 uint8_t* out_planes[2];
 out_planes[0] = new uint8_t[width * height];
 out_planes[1] = new uint8_t[width * height/2];

 sws_context = sws_getCachedContext(sws_context, width, height,
                                    AV_PIX_FMT_RGB8, width, height,
                                    AV_PIX_FMT_NV12, 0, 0, 0, 0);
 sws_scale(sws_context, (const uint8_t* const*)rgb_in, in_linesize,
           0, height, out_planes, out_linesize);
// (.....)
}

score 1 · Accepted Answer

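There are two main issues:

Replace AV_PIX_FMT_RGB8 with AV_PIX_FMT_RGB24: the input buffer holds three bytes per pixel, which is what the 3 * width stride already assumes.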
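To make the difference concrete: FFmpeg's pixel format list describes AV_PIX_FMT_RGB8 as packed RGB 3:3:2 with 8 bits per pixel, while AV_PIX_FMT_RGB24 stores one byte per channel. The sketch below only illustrates that difference; `packRgb332`, `rgb8FrameBytes` and `rgb24FrameBytes` are hypothetical helper names, not libav functions, and tightly packed rows are assumed.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: squeeze an 8-bit-per-channel pixel into the 3:3:2
// layout that AV_PIX_FMT_RGB8 describes: (msb) 3 bits R, 3 bits G, 2 bits B (lsb).
inline uint8_t packRgb332(uint8_t r, uint8_t g, uint8_t b)
{
    return static_cast<uint8_t>((r & 0xE0) | ((g & 0xE0) >> 3) | (b >> 6));
}

// Frame size in bytes for a 3:3:2 image (assumes no row padding).
inline std::size_t rgb8FrameBytes(int width, int height)
{
    assert(width > 0 && height > 0);
    return static_cast<std::size_t>(width) * height;      // 1 byte per pixel
}

// Frame size in bytes for an RGB24 image (assumes no row padding).
inline std::size_t rgb24FrameBytes(int width, int height)
{
    assert(width > 0 && height > 0);
    return static_cast<std::size_t>(width) * height * 3;  // 3 bytes per pixel
}
```

A buffer of width * height * 3 bytes with a stride of 3 * width therefore matches RGB24, not RGB8.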

rgb_in should be "wrapped" in an array of pointers:

 const uint8_t* in_planes[1] = {rgb_in};

 sws_scale(sws_context, in_planes, ...)
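The cast in the question compiles, but it makes sws_scale interpret the first pixel bytes as if they were a plane pointer, which would explain the read access violation. sws_scale expects an array of plane pointers together with a parallel array of strides. A minimal sketch of that wrapping follows; `PackedRgb24Input` and `wrapRgb24` are hypothetical names introduced for illustration, not libav types.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical bundle of the two arrays sws_scale takes for its source:
// one pointer per plane and one stride (bytes per row) per plane.
struct PackedRgb24Input {
    const uint8_t* planes[1];
    int linesize[1];
};

// Wrap a tightly packed RGB24 buffer (a single plane, 3 bytes per pixel).
inline PackedRgb24Input wrapRgb24(const uint8_t* rgb, int width)
{
    assert(rgb != nullptr && width > 0);
    PackedRgb24Input in = {{rgb}, {3 * width}};
    return in;
}
```

With this helper, `in.planes` and `in.linesize` would be passed as the second and third arguments of sws_scale.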

Testing:

Use the FFmpeg command-line tool to create a raw binary input image in RGB24 pixel format:

ffmpeg -y -f lavfi -i testsrc=size=192x108:rate=1 -vcodec rawvideo -pix_fmt rgb24 -frames 1 -f rawvideo rgb_image.bin

Read the input image in C code:

const int width = 192;
const int height = 108;
unsigned char* rgb_in = new uint8_t[width * height * 3];

FILE* f = fopen("rgb_image.bin", "rb");
fread(rgb_in, 1, width * height * 3, f);
fclose(f);
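The snippet above assumes the file exists and contains a full frame. A slightly more defensive variant might look like the sketch below; `readExact` is a hypothetical helper name, and returning an empty vector on failure is just one possible convention.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <vector>

// Hypothetical helper: read exactly `size` bytes from `path`.
// Returns an empty vector if the file cannot be opened or is too short.
inline std::vector<uint8_t> readExact(const char* path, std::size_t size)
{
    assert(path != nullptr);
    std::vector<uint8_t> data(size);
    FILE* f = std::fopen(path, "rb");
    if (f == nullptr)
        return {};
    const std::size_t got = std::fread(data.data(), 1, size, f);
    std::fclose(f);
    if (got != size)
        return {};
    return data;
}
```

For the test image this would be called as `readExact("rgb_image.bin", width * height * 3)`, and the conversion would only run if the result is not empty.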

Call convertRGB2NV12(rgb_in, width, height);

Before the end of the function, add temporary code that writes the output to a binary file:

FILE* f = fopen("nv12_image.bin", "wb");
fwrite(out_planes[0], 1, width * height, f);
fwrite(out_planes[1], 1, width * height/2, f);
fclose(f);

Convert nv12_image.bin to a PNG image file, treating it as grayscale input (for viewing the result):

ffmpeg -y -f rawvideo -s 192x162 -pix_fmt gray -i nv12_image.bin -pix_fmt rgb24 nv12_image.png
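The 192x162 size in that command comes from the NV12 layout: a full-resolution Y plane followed by an interleaved UV plane of half the height, so the file looks like a grayscale image that is 1.5 times as tall. The arithmetic can be sketched as follows; `Nv12Layout` and `nv12Layout` are hypothetical names, and even dimensions with no row padding are assumed.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical description of a tightly packed NV12 frame.
struct Nv12Layout {
    std::size_t yBytes;   // size of the Y plane
    std::size_t uvBytes;  // size of the interleaved UV plane
    int grayViewHeight;   // height when the raw file is viewed as 8-bit gray
};

inline Nv12Layout nv12Layout(int width, int height)
{
    assert(width > 0 && height > 0);
    assert(width % 2 == 0 && height % 2 == 0);  // assumption: even dimensions
    Nv12Layout l;
    l.yBytes = static_cast<std::size_t>(width) * height;
    l.uvBytes = l.yBytes / 2;                   // width bytes per row, height/2 rows
    l.grayViewHeight = height + height / 2;
    return l;
}
```

For the 192x108 test image this gives a gray view height of 162, matching the `-s 192x162` argument.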

Complete code sample:

#include <stdio.h>
#include <string.h>
#include <stdint.h>

extern "C"
{
#include <libswscale/swscale.h>
}


void convertRGB2NV12(const unsigned char *rgb_in, int width, int height)
{
    struct SwsContext* sws_context = nullptr;
    const int in_linesize[1] = {3 * width}; // RGB stride
    const int out_linesize[2] = {width, width}; // NV12 stride

    // NV12 data is separated in two
    // planes, one for the intensity (Y) and another one for
    // the colours(UV) interleaved, both with
    // the same width as the frame but the UV plane with
    // half of its height.
    uint8_t* out_planes[2];
    out_planes[0] = new uint8_t[width * height];
    out_planes[1] = new uint8_t[width * height/2];

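    sws_context = sws_getCachedContext(sws_context, width, height,
                                       AV_PIX_FMT_RGB24, width, height,
                                       AV_PIX_FMT_NV12, SWS_BILINEAR, nullptr, nullptr, nullptr);

    if (sws_context == nullptr)
    {
        printf("Error: sws_getCachedContext failed\n");
        delete[] out_planes[0];
        delete[] out_planes[1];
        return;
    }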

    const uint8_t* in_planes[1] = {rgb_in};

    int response = sws_scale(sws_context, in_planes, in_linesize,
                             0, height, out_planes, out_linesize);

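    if (response < 0)
    {
        printf("Error: sws_scale response = %d\n", response);
        // Release the buffers and the context before the early return
        delete[] out_planes[0];
        delete[] out_planes[1];
        sws_freeContext(sws_context);
        return;
    }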

// (.....)

    //Write NV12 output image to binary file (for testing)
    ////////////////////////////////////////////////////////////////////////////
    FILE* f = fopen("nv12_image.bin", "wb");
    fwrite(out_planes[0], 1, width * height, f);
    fwrite(out_planes[1], 1, width * height/2, f);
    fclose(f);
    ////////////////////////////////////////////////////////////////////////////


    delete[] out_planes[0];
    delete[] out_planes[1];

    sws_freeContext(sws_context);
}



int main()
{
    //Use ffmpeg for building raw RGB image (used as input).
    //ffmpeg -y -f lavfi -i testsrc=size=192x108:rate=1 -vcodec rawvideo -pix_fmt rgb24 -frames 1 -f rawvideo rgb_image.bin
    
    const int width = 192;
    const int height = 108;
    unsigned char* rgb_in = new uint8_t[width * height * 3];

    //Read input image from binary file (for testing)
    ////////////////////////////////////////////////////////////////////////////
    FILE* f = fopen("rgb_image.bin", "rb");
    if (f == nullptr)
    {
        printf("Error: cannot open rgb_image.bin\n");
        delete[] rgb_in;
        return 1;
    }
    fread(rgb_in, 1, width * height * 3, f);
    fclose(f);
    ////////////////////////////////////////////////////////////////////////////


    convertRGB2NV12(rgb_in, width, height);

    delete[] rgb_in;

    return 0;
}

Input (RGB):

Output (NV12 displayed as grayscale):

Convert NV12 back to RGB:

ffmpeg -y -f rawvideo -s 192x108 -pix_fmt nv12 -i nv12_image.bin -pix_fmt rgb24 rgb_output_image.png
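When decoding NV12 by hand rather than through ffmpeg, each 2x2 block of pixels shares one U and one V sample, stored as adjacent bytes (U first) in the second plane. A minimal sketch of the index arithmetic, with `Nv12ChromaOffsets` and `nv12ChromaOffsets` as hypothetical names:

```cpp
#include <cassert>
#include <cstddef>

// Byte offsets of the U and V samples inside the interleaved UV plane.
struct Nv12ChromaOffsets {
    std::size_t u;
    std::size_t v;
};

// Locate the chroma samples that belong to pixel (x, y).
// uvStride is the number of bytes per row of the UV plane
// (equal to the frame width when rows are not padded).
inline Nv12ChromaOffsets nv12ChromaOffsets(int x, int y, int uvStride)
{
    assert(x >= 0 && y >= 0 && uvStride > 0);
    const std::size_t u = static_cast<std::size_t>(y / 2) * uvStride
                        + static_cast<std::size_t>(x / 2) * 2;
    Nv12ChromaOffsets offsets = {u, u + 1};
    return offsets;
}
```

With the buffers from the code sample, the samples for pixel (x, y) would be read as `out_planes[1][offsets.u]` and `out_planes[1][offsets.v]`.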

Result:

score 0 · Accepted Answer

The input planes must be passed as an array of pointers, as Rotem pointed out in the first comment on the original post.
