
I want to generate an mp4 file by multiplexing audio from the microphone (overriding didGetAudioData) and video from the camera (overriding onPreviewFrame). However, I am running into an audio/video synchronization problem: the video plays back faster than the audio. I suspect the issue is related to an incompatible configuration or to the presentationTimeUs values, and I would appreciate guidance on how to solve it. My code is below.

Video configuration

formatVideo = MediaFormat.createVideoFormat(MIME_TYPE_VIDEO, 640, 360);
formatVideo.setInteger(MediaFormat.KEY_COLOR_FORMAT, MediaCodecInfo.CodecCapabilities.COLOR_FormatYUV420SemiPlanar);
formatVideo.setInteger(MediaFormat.KEY_BIT_RATE, 2000000);
formatVideo.setInteger(MediaFormat.KEY_FRAME_RATE, 30);
formatVideo.setInteger(MediaFormat.KEY_I_FRAME_INTERVAL, 5);

The video presentation PTS is generated as follows:

if(generateIndex == 0) {
    videoAbsolutePtsUs = 132;
    StartVideoAbsolutePtsUs = System.nanoTime() / 1000L;
} else {
    CurrentVideoAbsolutePtsUs = System.nanoTime() / 1000L;
    videoAbsolutePtsUs = 132 + CurrentVideoAbsolutePtsUs - StartVideoAbsolutePtsUs;
}
generateIndex++;

Audio configuration

format = MediaFormat.createAudioFormat(MIME_TYPE, 48000 /*sample rate*/, AudioFormat.CHANNEL_IN_MONO /*channel config*/);
format.setInteger(MediaFormat.KEY_AAC_PROFILE, MediaCodecInfo.CodecProfileLevel.AACObjectLC);
format.setInteger(MediaFormat.KEY_SAMPLE_RATE, 48000);
format.setInteger(MediaFormat.KEY_CHANNEL_COUNT, 1);
format.setInteger(MediaFormat.KEY_BIT_RATE, 64000);

The audio presentation PTS is generated as follows:

if(generateIndex == 0) {
    audioAbsolutePtsUs = 132;
    StartAudioAbsolutePtsUs = System.nanoTime() / 1000L;
} else {
    CurrentAudioAbsolutePtsUs = System.nanoTime() / 1000L;
    audioAbsolutePtsUs = CurrentAudioAbsolutePtsUs - StartAudioAbsolutePtsUs;
}

generateIndex++;
audioAbsolutePtsUs = getJitterFreePTS(audioAbsolutePtsUs, audioInputLength / 2);

long startPTS = 0;
long totalSamplesNum = 0;
private long getJitterFreePTS(long bufferPts, long bufferSamplesNum) {
    long correctedPts = 0;
    long bufferDuration = (1000000 * bufferSamplesNum) / 48000;
    bufferPts -= bufferDuration; // accounts for the delay of acquiring the audio buffer
    if (totalSamplesNum == 0) {
        // reset
        startPTS = bufferPts;
        totalSamplesNum = 0;
    }
    correctedPts = startPTS +  (1000000 * totalSamplesNum) / 48000;
    if(bufferPts - correctedPts >= 2*bufferDuration) {
        // reset
        startPTS = bufferPts;
        totalSamplesNum = 0;
        correctedPts = startPTS;
    }
    totalSamplesNum += bufferSamplesNum;
    return correctedPts;
}

Is my problem caused by applying the jitter-free function to audio only? If so, how can I apply it to the video as well? I also tried to find the correct audio and video presentation PTS in https://android.googlesource.com/platform/cts/+/jb-mr2-release/tests/tests/media/src/android/media/cts/EncodeDecodeTest.java, but EncodeDecodeTest only derives a video PTS. That is why my implementation uses the system nano time for both audio and video. If I want to use the video presentation PTS from EncodeDecodeTest, how do I build a compatible audio presentation PTS? Thanks for the help!
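For reference, my understanding is that EncodeDecodeTest derives its video PTS purely from the frame index, roughly 132 + frameIndex * 1000000 / FRAME_RATE. If I went that route, I assume the matching audio PTS would have to be derived from how many PCM samples have already been queued, something like the sketch below (FRAME_RATE, SAMPLE_RATE, totalSamplesQueued and both method names are my own, not from the test). I am not sure whether this would stay in sync with the real capture timing, which is why I fell back to System.nanoTime() for both streams.

private static final int FRAME_RATE = 30;     // must match KEY_FRAME_RATE
private static final int SAMPLE_RATE = 48000; // must match KEY_SAMPLE_RATE
private long totalSamplesQueued = 0;          // mono PCM samples already sent to the audio encoder

// Video PTS derived from the frame index, the way EncodeDecodeTest appears to do it (approximation).
private long computeVideoPtsUs(int frameIndex) {
    return 132 + frameIndex * 1000000L / FRAME_RATE;
}

// Audio PTS derived from the sample count, so both streams advance on the same ideal timeline.
private long computeAudioPtsUs(int samplesInThisBuffer) {
    long ptsUs = 132 + totalSamplesQueued * 1000000L / SAMPLE_RATE;
    totalSamplesQueued += samplesInThisBuffer;
    return ptsUs;
}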

For reference, this is how I queue YUV frames into the video MediaCodec. The audio path is identical except for how the presentation PTS is computed.

int videoInputBufferIndex;
int videoInputLength;
long videoAbsolutePtsUs;
long StartVideoAbsolutePtsUs, CurrentVideoAbsolutePtsUs;

int put_v =0;
int get_v =0;
int generateIndex = 0;

public void setByteBufferVideo(byte[] buffer, boolean isUsingFrontCamera, boolean Input_endOfStream){
    if(Build.VERSION.SDK_INT >=18){
        try{

            endOfStream = Input_endOfStream;
            if(!Input_endOfStream){
                ByteBuffer[] inputBuffers = mVideoCodec.getInputBuffers();
                videoInputBufferIndex = mVideoCodec.dequeueInputBuffer(-1);

                if (VERBOSE) {
                    Log.w(TAG,"[put_v]:"+(put_v)+"; videoInputBufferIndex = "+videoInputBufferIndex+"; endOfStream = "+endOfStream);
                }

                if(videoInputBufferIndex>=0) {
                    ByteBuffer inputBuffer = inputBuffers[videoInputBufferIndex];
                    inputBuffer.clear();

                    inputBuffer.put(mNV21Convertor.convert(buffer));
                    videoInputLength = buffer.length;

                    if(generateIndex == 0) {
                        videoAbsolutePtsUs = 132;
                        StartVideoAbsolutePtsUs = System.nanoTime() / 1000L;
                    }else {
                        CurrentVideoAbsolutePtsUs = System.nanoTime() / 1000L;
                        videoAbsolutePtsUs =132+ CurrentVideoAbsolutePtsUs - StartVideoAbsolutePtsUs;
                    }

                    generateIndex++;

                    if (VERBOSE) {
                        Log.w(TAG, "[put_v]:"+(put_v)+"; videoAbsolutePtsUs = " + videoAbsolutePtsUs + "; CurrentVideoAbsolutePtsUs = "+CurrentVideoAbsolutePtsUs);
                    }

                    if (videoInputLength == AudioRecord.ERROR_INVALID_OPERATION) {
                        Log.w(TAG, "[put_v]ERROR_INVALID_OPERATION");
                    } else if (videoInputLength == AudioRecord.ERROR_BAD_VALUE) {
                        Log.w(TAG, "[put_v]ERROR_ERROR_BAD_VALUE");
                    }
                    if (endOfStream) {
                        Log.w(TAG, "[put_v]:"+(put_v++)+"; [get] receive endOfStream");
                        mVideoCodec.queueInputBuffer(videoInputBufferIndex, 0, videoInputLength, videoAbsolutePtsUs, MediaCodec.BUFFER_FLAG_END_OF_STREAM);
                    } else {
                        Log.w(TAG, "[put_v]:"+(put_v++)+"; receive videoInputLength :" + videoInputLength);
                        mVideoCodec.queueInputBuffer(videoInputBufferIndex, 0, videoInputLength, videoAbsolutePtsUs, 0);
                    }
                }
            }
        }catch (Exception x) {
            x.printStackTrace();
        }
    }
}

1 Answer


How I resolved this in my application was by setting the PTS of all video and audio frames against a shared "sync clock" (note that sync also means it is thread-safe) that starts when the first video frame (which has a PTS of 0 on its own) becomes available. So if audio recording starts sooner than video, the audio data is dismissed (it does not go into the encoder) until video starts, and if it starts later, then the first audio PTS is simply relative to the start of the entire video.
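A minimal sketch of such a shared clock (the class and method names are mine, not from any library):

// Shared, thread-safe clock: PTS 0 is defined by the arrival of the first video frame.
class SyncClock {
    private long baseNanos = -1;

    // Call once, from the video path, when the first frame arrives.
    synchronized void startWithFirstVideoFrame() {
        if (baseNanos < 0) baseNanos = System.nanoTime();
    }

    synchronized boolean isStarted() {
        return baseNanos >= 0;
    }

    // Current PTS in microseconds, relative to the first video frame.
    synchronized long nowPtsUs() {
        return (System.nanoTime() - baseNanos) / 1000L;
    }
}

Both paths stamp their buffers with nowPtsUs(); audio buffers that arrive while isStarted() is still false are simply dropped.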

Of course you are free to let the audio start first, but players will usually skip or wait for the first video frame anyway. Also be careful that encoded audio frames will arrive "out of order", and sooner or later MediaMuxer will fail with an error. My solution was to queue them all: sort them by PTS whenever a new one comes in, then write everything older than 500 ms (relative to the newest frame) to MediaMuxer, but only samples whose PTS is higher than the latest written one. Ideally this means data is written smoothly to MediaMuxer with a 500 ms delay. In the worst case, you lose a few audio frames.
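A rough sketch of that reordering queue (all names and the 500 ms window are mine; note that the ByteBuffer passed in must be a private copy, since MediaCodec reuses its output buffers):

import android.media.MediaCodec;
import android.media.MediaMuxer;
import java.nio.ByteBuffer;
import java.util.PriorityQueue;

class MuxerReorderQueue {
    private static final long WINDOW_US = 500_000; // hold frames for ~500 ms

    private static class Frame {
        final int track;
        final ByteBuffer data;            // private copy of the encoded sample
        final MediaCodec.BufferInfo info;
        Frame(int track, ByteBuffer data, MediaCodec.BufferInfo info) {
            this.track = track; this.data = data; this.info = info;
        }
    }

    private final PriorityQueue<Frame> queue = new PriorityQueue<>(64,
            (a, b) -> Long.compare(a.info.presentationTimeUs, b.info.presentationTimeUs));
    private long newestPtsUs = Long.MIN_VALUE;
    private long lastWrittenPtsUs = Long.MIN_VALUE;

    synchronized void push(MediaMuxer muxer, int track, ByteBuffer copy, MediaCodec.BufferInfo info) {
        queue.add(new Frame(track, copy, info));
        newestPtsUs = Math.max(newestPtsUs, info.presentationTimeUs);

        // Flush, in PTS order, everything that is at least 500 ms older than the newest frame,
        // but never write a sample whose PTS is not above the last written one.
        while (!queue.isEmpty()
                && newestPtsUs - queue.peek().info.presentationTimeUs >= WINDOW_US) {
            Frame f = queue.poll();
            if (f.info.presentationTimeUs > lastWrittenPtsUs) {
                muxer.writeSampleData(f.track, f.data, f.info);
                lastWrittenPtsUs = f.info.presentationTimeUs;
            } // else: drop it -- the "lost audio frame" worst case mentioned above
        }
    }
}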

answered 2016-03-22T00:02:05.453