c - Understanding FFMPEG video encoding

Question

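I got this from the encoding example in ffmpeg. I can somewhat follow the author's audio encoding example, but I find myself confused when looking at the C code (I added block-number comments to help me reference what I'm talking about)...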

static void video_encode_example(const char *filename)
{
AVCodec *codec;
AVCodecContext *c= NULL;
int i, out_size, size, x, y, outbuf_size;
FILE *f;
AVFrame *picture;
uint8_t *outbuf, *picture_buf;              //BLOCK ONE
printf("Video encoding\n");

/* find the mpeg1 video encoder */
codec = avcodec_find_encoder(CODEC_ID_MPEG1VIDEO);
if (!codec) {
    fprintf(stderr, "codec not found\n");
    exit(1);                                //BLOCK TWO
}

c= avcodec_alloc_context();
picture= avcodec_alloc_frame();
/* put sample parameters */
c->bit_rate = 400000;
/* resolution must be a multiple of two */
c->width = 352;
c->height = 288;
/* frames per second */
c->time_base= (AVRational){1,25};
c->gop_size = 10; /* emit one intra frame every ten frames */
c->max_b_frames=1;
c->pix_fmt = PIX_FMT_YUV420P;                   //BLOCK THREE

/* open it */
if (avcodec_open(c, codec) < 0) {
    fprintf(stderr, "could not open codec\n");
    exit(1);
}
f = fopen(filename, "wb");
if (!f) {
    fprintf(stderr, "could not open %s\n", filename);
    exit(1);
}                                               //BLOCK FOUR

/* alloc image and output buffer */
outbuf_size = 100000;
outbuf = malloc(outbuf_size);
size = c->width * c->height;
picture_buf = malloc((size * 3) / 2); /* size for YUV 420 */
picture->data[0] = picture_buf;
picture->data[1] = picture->data[0] + size;
picture->data[2] = picture->data[1] + size / 4;
picture->linesize[0] = c->width;
picture->linesize[1] = c->width / 2;
picture->linesize[2] = c->width / 2;              //BLOCK FIVE

/* encode 1 second of video */
for(i=0;i<25;i++) {
    fflush(stdout);
    /* prepare a dummy image */
    /* Y */
    for(y=0;y<c->height;y++) {
        for(x=0;x<c->width;x++) {
            picture->data[0][y * picture->linesize[0] + x] = x + y + i * 3;
        }
    }                                            //BLOCK SIX

    /* Cb and Cr */
    for(y=0;y<c->height/2;y++) {
        for(x=0;x<c->width/2;x++) {
            picture->data[1][y * picture->linesize[1] + x] = 128 + y + i * 2;
            picture->data[2][y * picture->linesize[2] + x] = 64 + x + i * 5;
        }
    }                                           //BLOCK SEVEN

    /* encode the image */
    out_size = avcodec_encode_video(c, outbuf, outbuf_size, picture);
    printf("encoding frame %3d (size=%5d)\n", i, out_size);
    fwrite(outbuf, 1, out_size, f);
}                                              //BLOCK EIGHT

/* get the delayed frames */
for(; out_size; i++) {
    fflush(stdout);
    out_size = avcodec_encode_video(c, outbuf, outbuf_size, NULL);
    printf("write frame %3d (size=%5d)\n", i, out_size);
    fwrite(outbuf, 1, out_size, f);
}                                             //BLOCK NINE

/* add sequence end code to have a real mpeg file */
outbuf[0] = 0x00;
outbuf[1] = 0x00;
outbuf[2] = 0x01;
outbuf[3] = 0xb7;
fwrite(outbuf, 1, 4, f);
fclose(f);
free(picture_buf);
free(outbuf);
avcodec_close(c);
av_free(c);
av_free(picture);
}                                            //BLOCK TEN

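Here's what I can gather from the author's code, block by block...

Block One: Initializes variables and pointers. I haven't been able to find the AVFrame struct in the ffmpeg source code yet, so I don't know what it refers to.

Block Two: Uses a codec from the file; if it isn't found, exit.

Block Three: Sets sample video parameters. The only thing I don't really get is the gop size. I read about intra frames, but I still don't understand what they are.

Block Four: Opens the file for writing...

Block Five: Here's where they really start losing me. Part of it is probably because I don't know exactly what AVFrame is, but why do they use only 3/2 of the image size?

Blocks Six and Seven: I don't understand what they are trying to accomplish with this math.

Block Eight: It looks like the avcodec function does all the work here, so I'm not concerned with it for the time being.

Block Nine: Since it's outside the 25-frame loop, I assume it gets the leftover frames?

Block Ten: Close, free memory, etc...

I know this is a large block of code to be confused by; any input would be helpful. I got put in over my head at work. Thanks in advance, SO.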

score 4 · Accepted Answer

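As HonkyTonk has already replied, the comment spells it out: prepare a dummy image. I'm guessing you may be confused about exactly how the dummy image is generated, especially if you are unfamiliar with the YUV/YCbCr color space. Read the Wikipedia treatment for the basics.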

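Many video codecs operate in the YUV color space. This often confuses programmers who are only used to dealing with RGB. The executive summary is that, for this variation (YUV 4:2:0 planar), every pixel in the image gets its own Y sample (note that the Y loop iterates over every (x,y) pair), while each 2x2 quad of pixels shares one U/Cb sample and one V/Cr sample (note that the iteration in block 7 runs over width/2 and height/2).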

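It looks like the pattern being generated is some kind of gradient. If you want to produce a known change, set Y/Cb/Cr to 0 and the dummy image will be all green. Set Cb and Cr to 128 and Y to 255 to get a white frame; slide Y down to 0 to see black; set Y to any value in between while holding Cb and Cr at 128 to see shades of gray.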

score 4 · Accepted Answer

Sharing my understanding [quite a late reply!]

YUV420p：

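YUV 420P, or YCbCr, is an alternative to the RGB representation. It contains 3 planes: the Y (luma) component, the U (Cb, blue-difference) component and the V (Cr, red-difference) component. [Green does not need to be stored separately, because it can be computed from Y, Cb and Cr.] Where RGB888 needs 3 bytes per pixel, YUV420 needs 1.5 bytes (12 bits) per pixel: 8 bits of Y for every pixel, plus 8 bits of U and 8 bits of V shared by each 2x2 block of pixels. The P here stands for planar, meaning V follows U, U follows Y, and a YUV frame is just a byte array, simple! The alternative is an interleaved layout, meaning the U and V data is interleaved with the Y data in a specific manner [@Find(What way)]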
