h.264 - How do I encode input images from a camera into an H.264 stream?

Question

I am trying to encode input images from a MacBook Pro's built-in FaceTime HD camera into an H.264 video stream in real time, using libx264 on Mac OS X 10.9.5.

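These are the steps I take:

1. Get 1280x720 32BGRA images from the camera at 15 fps using the AVFoundation API (the AVCaptureDevice class and so on).
2. Convert each image to 320x180 YUV420P format using libswscale.
3. Encode the image into an H.264 video stream (baseline profile) using libx264.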
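To make the sizes involved in the conversion step concrete, here is a small sketch of the buffer arithmetic for the two formats. The helper names (`bgra_size`, `yuv420p_sizes`, `Yuv420Sizes`) are hypothetical and not part of libswscale or libx264, and the sketch assumes tightly packed rows with no stride padding, which a real allocator such as x264_picture_alloc() may not guarantee.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper type: byte sizes of the three YUV420P planes. */
typedef struct {
    size_t y;      /* full-resolution luma plane */
    size_t u;      /* chroma plane, subsampled 2x horizontally and vertically */
    size_t v;      /* second chroma plane, same size as u */
    size_t total;  /* y + u + v */
} Yuv420Sizes;

/* Size of a packed 32-bit BGRA image: 4 bytes per pixel. */
static size_t bgra_size(int width, int height) {
    assert(width > 0 && height > 0);
    return (size_t)width * (size_t)height * 4;
}

/* Sizes of 8-bit YUV420P planes, assuming no row padding.
 * Chroma dimensions are rounded up for odd widths and heights. */
static Yuv420Sizes yuv420p_sizes(int width, int height) {
    assert(width > 0 && height > 0);
    Yuv420Sizes s;
    size_t cw = ((size_t)width + 1) / 2;
    size_t ch = ((size_t)height + 1) / 2;
    s.y = (size_t)width * (size_t)height;
    s.u = cw * ch;
    s.v = cw * ch;
    s.total = s.y + s.u + s.v;
    return s;
}
```

Under these assumptions, a 1280x720 BGRA frame occupies 3,686,400 bytes, while the 320x180 YUV420P frame handed to the encoder occupies 57,600 + 14,400 + 14,400 = 86,400 bytes.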

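I apply the steps above every time an image arrives from the camera, trusting the encoder to keep track of the encoding state and to produce NAL units when they become available.

Because I want to receive encoded frames while I am still feeding input images to the encoder, I decided to flush the encoder (calling x264_encoder_delayed_frames()) every 30 frames (2 seconds).

However, when I resume encoding, the encoder stalls after a while (x264_encoder_encode() never returns). I tried changing the number of frames between flushes, but that made no difference.

The relevant code is below. (I have omitted the image capture code because it does not seem to be the problem.)

Can you point out what I might be doing wrong?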

x264_t *encoder;
x264_param_t param;

// Will be called only once, at startup.
int initEncoder() {
  int ret;

  if ((ret = x264_param_default_preset(&param, "medium", NULL)) < 0) {
    return ret;
  }

  param.i_csp = X264_CSP_I420;
  param.i_width  = 320;
  param.i_height = 180;
  param.b_vfr_input = 0;
  param.b_repeat_headers = 1;
  param.b_annexb = 1;

  if ((ret = x264_param_apply_profile(&param, "baseline")) < 0) {
    return ret;
  }

  encoder = x264_encoder_open(&param);
  if (!encoder) {
    return AVERROR_UNKNOWN;
  }

  return 0;
}

// Will be called from encodeFrame() defined below.
int convertImage(const enum AVPixelFormat srcFmt, const int srcW, const int srcH, const uint8_t *srcData, const enum AVPixelFormat dstFmt, const int dstW, const int dstH, x264_image_t *dstData) {
  struct SwsContext *sws_ctx;
  int ret;
  int src_linesize[4];
  uint8_t *src_data[4];

  sws_ctx = sws_getContext(srcW, srcH, srcFmt,
                       dstW, dstH, dstFmt,
                       SWS_BILINEAR, NULL, NULL, NULL);

  if (!sws_ctx) {
    return AVERROR_UNKNOWN;
  }

  if ((ret = av_image_fill_linesizes(src_linesize, srcFmt, srcW)) < 0) {
    sws_freeContext(sws_ctx);
    return ret;
  }

  if ((ret = av_image_fill_pointers(src_data, srcFmt, srcH, (uint8_t *) srcData, src_linesize)) < 0) {
    sws_freeContext(sws_ctx);
    return ret;
  }

  sws_scale(sws_ctx, (const uint8_t * const*)src_data, src_linesize, 0, srcH, dstData->plane, dstData->i_stride);
  sws_freeContext(sws_ctx);
  return 0;
}

// Will be called for each frame.
int encodeFrame(const uint8_t *data, const int width, const int height) {
  int ret;
  x264_picture_t pic;
  x264_picture_t pic_out;
  x264_nal_t *nal;
  int i_nal;

  if ((ret = x264_picture_alloc(&pic, param.i_csp, param.i_width, param.i_height)) < 0) {
    return ret;
  }

  if ((ret = convertImage(AV_PIX_FMT_RGB32, width, height, data, AV_PIX_FMT_YUV420P, 320, 180, &pic.img)) < 0) {
    x264_picture_clean(&pic);
    return ret;
  }

  if ((ret = x264_encoder_encode(encoder, &nal, &i_nal, &pic, &pic_out)) < 0) {
    x264_picture_clean(&pic);
    return ret;
  }

  if(ret) {
    for (int i = 0; i < i_nal; i++) {
      printNAL(nal + i);
    }
  }

  x264_picture_clean(&pic);
  return 0;
}

// Will be called every 30 frames.
int flushEncoder() {
  int ret;
  x264_nal_t *nal;
  int i_nal;
  x264_picture_t pic_out;

  /* Flush delayed frames */
  while (x264_encoder_delayed_frames(encoder)) {
    if ((ret = x264_encoder_encode(encoder, &nal, &i_nal, NULL, &pic_out)) < 0) {
      return ret;
    }

    if (ret) {
      for (int j = 0; j < i_nal; j++) {
        printNAL(nal + j);
      }
    }
  }

  return 0;
}

score 1 · Accepted Answer

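You should not flush the delayed frames after every frame. Flush only once, when there are no more input frames, that is, at the end of encoding.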
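The calling pattern this implies can be illustrated with a toy model. The sketch below is not the libx264 API: `ToyEncoder`, `toy_encode`, `toy_delayed_frames`, `run_stream`, and the delay of 3 frames are all hypothetical stand-ins, chosen only to show an encoder that holds frames internally, emits output with a delay while input keeps arriving, and is drained exactly once after the last input frame.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for an encoder with internal frame delay. NOT libx264. */
#define TOY_DELAY 3  /* hypothetical number of frames held back */

typedef struct {
    int queue[TOY_DELAY];  /* ids of frames still inside the encoder */
    int count;             /* number of frames currently buffered */
} ToyEncoder;

/* Remove and return the oldest buffered frame id. */
static int toy_pop(ToyEncoder *e) {
    int out = e->queue[0];
    for (int i = 1; i < e->count; i++) {
        e->queue[i - 1] = e->queue[i];
    }
    e->count--;
    return out;
}

/* Feed one frame (id >= 0), or pass -1 to drain one buffered frame.
 * Returns the id of the emitted frame, or -1 if nothing was emitted. */
static int toy_encode(ToyEncoder *e, int frame_id) {
    assert(e != NULL);
    if (frame_id < 0) {
        return e->count > 0 ? toy_pop(e) : -1;
    }
    int out = -1;
    if (e->count == TOY_DELAY) {
        out = toy_pop(e);  /* buffer full: the oldest frame comes out */
    }
    e->queue[e->count++] = frame_id;
    return out;
}

/* Number of frames still held inside the toy encoder. */
static int toy_delayed_frames(const ToyEncoder *e) {
    assert(e != NULL);
    return e->count;
}

/* Encode n frames, then flush once. Writes the emitted ids to
 * emitted[] (capacity must be at least n) and returns their count. */
static int run_stream(int n, int *emitted) {
    ToyEncoder enc = { {0}, 0 };
    int n_out = 0;

    /* Normal operation: keep feeding; output simply arrives late. */
    for (int i = 0; i < n; i++) {
        int out = toy_encode(&enc, i);
        if (out >= 0) {
            emitted[n_out++] = out;
        }
    }

    /* End of stream: drain whatever is still buffered, one time only. */
    while (toy_delayed_frames(&enc) > 0) {
        emitted[n_out++] = toy_encode(&enc, -1);
    }
    return n_out;
}
```

In this model the first few calls return nothing, later calls return one frame per input, and the final loop recovers the remainder, so every frame is emitted in order without any flush in the middle of the stream.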
