
I'm grabbing video frames from a camera via v4l, and I need to transcode them into mpeg4 format in order to stream them continuously over RTP.

Everything actually "works", but there is something I'm not getting right in the re-encoding: the input stream produces 15 fps while the output is at 25 fps, and every input frame is converted into one single video object sequence (I verified this with a simple check on the output bitstream). I guess the receiver is parsing the mpeg4 bitstream correctly, but the RTP packetization is somehow wrong. How am I supposed to split the encoded bitstream into one or more AVPacket? Maybe I'm missing something obvious and I just need to look for B/P frame markers, but I don't think I'm using the encoding API correctly.

Here is an excerpt of my code, which is based on the available ffmpeg samples:

// input frame
AVFrame *picture;
// input frame color-space converted
AVFrame *planar;
// input format context, video4linux2
AVFormatContext *iFmtCtx;
// output codec context, mpeg4
AVCodecContext *oCtx;
// [ init everything ]
// ...
oCtx->time_base.num = 1;
oCtx->time_base.den = 25;
oCtx->gop_size = 10;
oCtx->max_b_frames = 1;
oCtx->bit_rate = 384000;
oCtx->pix_fmt = PIX_FMT_YUV420P;

for(;;)
{
  // read frame
  rdRes = av_read_frame( iFmtCtx, &pkt );
  if ( rdRes >= 0 && pkt.size > 0 )
  {
    // decode it
    iCdcCtx->reordered_opaque = pkt.pts;
    int decodeRes = avcodec_decode_video2( iCdcCtx, picture, &gotPicture, &pkt );
    if ( decodeRes >= 0 && gotPicture )
    {
      // scale / convert color space
      avpicture_fill((AVPicture *)planar, planarBuf.get(), oCtx->pix_fmt, oCtx->width, oCtx->height);
      sws_scale(sws, picture->data, picture->linesize, 0, iCdcCtx->height, planar->data, planar->linesize);
      // encode
      ByteArray encBuf( 65536 );
      int encSize = avcodec_encode_video( oCtx, encBuf.get(), encBuf.size(), planar );
      // this happens every GOP end
      while( encSize == 0 )
        encSize = avcodec_encode_video( oCtx, encBuf.get(), encBuf.size(), 0 );
      // send the transcoded bitstream with the result PTS
      if ( encSize > 0 )
        enqueueFrame( oCtx->coded_frame->pts, encBuf.get(), encSize );
    }
  }
}
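
For context, the usual pattern with this era of the API is one AVPacket per avcodec_encode_video() output, handed to an RTP muxer that does the payload splitting itself. A minimal, hypothetical sketch of that step follows; oFmtCtx, oStream and sendEncodedFrame are illustrative names assumed to be set up elsewhere, not part of the code above:

extern "C" {
#include <libavformat/avformat.h>
}

// Hedged sketch, not the code above: wrap one encoder output buffer
// into an AVPacket and hand it to the RTP output context.
static void sendEncodedFrame(AVFormatContext *oFmtCtx, AVStream *oStream,
                             AVCodecContext *oCtx, uint8_t *buf, int size)
{
    AVPacket opkt;
    av_init_packet(&opkt);
    opkt.data = buf;                  // one avcodec_encode_video() output = one packet
    opkt.size = size;
    opkt.stream_index = oStream->index;
    // rescale the encoder PTS from the codec time base to the stream time base
    if (oCtx->coded_frame && oCtx->coded_frame->pts != AV_NOPTS_VALUE)
        opkt.pts = av_rescale_q(oCtx->coded_frame->pts,
                                oCtx->time_base, oStream->time_base);
    if (oCtx->coded_frame && oCtx->coded_frame->key_frame)
        opkt.flags |= AV_PKT_FLAG_KEY;
    // the RTP muxer splits this packet across RTP payloads itself
    av_interleaved_write_frame(oFmtCtx, &opkt);
}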

1 Answer


The simplest solution is to use two threads. The first thread would do everything outlined in your question (decoding, scaling/color-space conversion, encoding). Partially transcoded frames would be written to an intermediate queue shared with the second thread. In this particular case (converting from a lower to a higher frame rate) the maximum length of that queue would be one frame. The second thread would read frames from the input queue in a loop, like this:

void FpsConverter::ThreadProc()
{
    timeBeginPeriod(1);               // raise the multimedia timer resolution to 1 ms
    DWORD start_time = timeGetTime();
    int frame_counter = 0;
    while (!shouldFinish()) {
        Frame *frame = NULL;
        ReadInputFrame(frame);        // blocks until the transcoding thread delivers a frame
        WriteToOutputQueue(frame);
        // schedule output frames on a fixed grid; frame_time is the output
        // frame period in milliseconds (40 ms for 25 fps)
        DWORD next_frame_time = start_time + ++frame_counter * frame_time;
        // compute the difference as a signed value: an unsigned DWORD would
        // wrap around when we are behind schedule and never skip the Sleep
        int time_to_sleep = (int)(next_frame_time - timeGetTime());
        if (time_to_sleep > 0) {
            Sleep(time_to_sleep);
        }
    }
    timeEndPeriod(1);
}
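
For illustration, a minimal sketch of the one-frame shared queue described above, using C++11 primitives (an assumption; the class and names are illustrative, and frame ownership/recycling is omitted). The transcoding thread overwrites the single slot; the pacing thread always reads the most recent frame and may see the same frame twice, which is exactly how 15 fps becomes 25 fps:

#include <mutex>
#include <condition_variable>

struct Frame;   // whatever the transcoding pipeline produces

class OneSlotQueue {
public:
    // Producer side: a newer frame simply replaces an unconsumed older
    // one, so the queue never grows beyond one frame.
    void Put(Frame *f) {
        std::lock_guard<std::mutex> lock(m_);
        slot_ = f;
        cv_.notify_one();
    }
    // Consumer side: blocks only until the very first frame arrives;
    // afterwards the latest frame is returned immediately, possibly
    // more than once.
    Frame *GetLatest() {
        std::unique_lock<std::mutex> lock(m_);
        cv_.wait(lock, [this] { return slot_ != nullptr; });
        return slot_;
    }
private:
    std::mutex m_;
    std::condition_variable cv_;
    Frame *slot_ = nullptr;
};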

When CPU power is sufficient and higher fidelity and smoothness are required, you could compute the output frame not just from one input frame but from several, using some sort of interpolation (similar to techniques used in mpeg codecs). The closer an output frame's time stamp is to an input frame's time stamp, the more weight you should assign to that particular input frame.
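
To make the weighting concrete, a small sketch (illustrative names; a single 8-bit plane only) that blends the two input frames bracketing the output timestamp:

#include <cstdint>
#include <cstddef>

// Blend two input planes into an output plane. t_prev and t_next are the
// timestamps of the bracketing input frames, t_out the timestamp of the
// output frame, all in the same unit (e.g. milliseconds).
void BlendPlane(const uint8_t *prev, const uint8_t *next, uint8_t *out,
                size_t n, double t_out, double t_prev, double t_next)
{
    // the weight of 'next' grows as the output timestamp approaches it
    double w = (t_out - t_prev) / (t_next - t_prev);   // in [0, 1]
    for (size_t i = 0; i < n; ++i)
        out[i] = (uint8_t)((1.0 - w) * prev[i] + w * next[i] + 0.5);
}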

answered 2011-07-28T20:06:54.870