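I'm generating a video from a Unity app on iOS. I'm using iVidCap, which uses AVFoundation to do this. That side is all working fine. Essentially, the video is rendered by using a texture render target and passing the frames to an Obj-C plugin.
Now I need to add audio to the video. The audio will be sound effects that occur at specific times, and possibly some background sound. The files being used are actually assets internal to the Unity app. I could write these to phone storage and then generate an AVComposition, but my plan is to avoid that and composite the audio in a floating-point buffer (audio obtained from audio clips is already in float format). I may do some on-the-fly audio effects later on.
After several hours I managed to get audio recorded and playing back with the video... but it stutters.
Currently I'm just generating a square wave for the duration of each video frame and writing it to an AVAssetWriterInput. Later, I'll generate the audio I actually want.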
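For illustration only, here is a minimal sketch in plain C (no AVFoundation calls) of the kind of per-frame generator described above. The `SquareWave` struct and `square_wave_fill` function are hypothetical names, not part of iVidCap. The point is that the generator keeps a running sample position, so consecutive chunks continue the waveform instead of restarting it; that rules out the generator itself as a source of discontinuities.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical generator state: absolute index of the next sample frame. */
typedef struct {
    int64_t position;     /* sample frames produced so far */
    int     half_period;  /* sample frames per half cycle of the square wave */
    float   amplitude;
} SquareWave;

/* Fill `frames` interleaved sample frames and advance the running position,
 * so the next call continues exactly where this one stopped. */
void square_wave_fill(SquareWave *sw, float *out, size_t frames, int channels)
{
    assert(sw->half_period > 0 && channels > 0);
    for (size_t i = 0; i < frames; ++i) {
        int64_t half = (sw->position + (int64_t)i) / sw->half_period;
        float v = (half % 2 == 0) ? sw->amplitude : -sw->amplitude;
        for (int c = 0; c < channels; ++c)
            out[i * (size_t)channels + (size_t)c] = v;
    }
    sw->position += (int64_t)frames;
}
```

With this shape, writing three chunks of 1470 frames should produce the same samples as writing one buffer of 4410 frames, so any glitch that only appears in the chunked case has to come from somewhere after generation.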
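If I generate one massive block of samples, I don't get the stuttering. If I write it in blocks (which I'd much prefer over allocating a massive array), then the blocks of audio seem to clip each other.
I can't figure out why this is. I'm pretty sure I'm getting the timestamps for the audio buffers right, but maybe I'm doing this whole part incorrectly. Or do I need some flag to get the video to sync to the audio? I doubt that is the problem, since I can see the glitches in a wave editor after extracting the audio data to a wav.
Relevant code for writing the audio: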
- (id)init {
    self = [super init];
    if (self) {
        // [snip]
        rateDenominator = 44100;
        rateMultiplier = rateDenominator / frameRate;
        sample_position_ = 0;
        audio_fmt_desc_ = nil;
        int nchannels = 2;
        AudioStreamBasicDescription audioFormat;
        bzero(&audioFormat, sizeof(audioFormat));
        audioFormat.mSampleRate = 44100;
        audioFormat.mFormatID = kAudioFormatLinearPCM;
        audioFormat.mFramesPerPacket = 1;
        audioFormat.mChannelsPerFrame = nchannels;
        int bytes_per_sample = sizeof(float);
        audioFormat.mFormatFlags = kAudioFormatFlagIsFloat | kAudioFormatFlagIsAlignedHigh;
        audioFormat.mBitsPerChannel = bytes_per_sample * 8;
        audioFormat.mBytesPerPacket = bytes_per_sample * nchannels;
        audioFormat.mBytesPerFrame = bytes_per_sample * nchannels;
        CMAudioFormatDescriptionCreate(kCFAllocatorDefault,
                                       &audioFormat,
                                       0,
                                       NULL,
                                       0,
                                       NULL,
                                       NULL,
                                       &audio_fmt_desc_
                                       );
    }
    return self;
}
- (BOOL)beginRecordingSession {
    NSError* error = nil;
    isAborted = false;
    abortCode = No_Abort;
    // Allocate the video writer object.
    videoWriter = [[AVAssetWriter alloc] initWithURL:[self getVideoFileURLAndRemoveExisting:
                                                      recordingPath] fileType:AVFileTypeMPEG4 error:&error];
    if (error) {
        NSLog(@"Start recording error: %@", error);
    }
    // Configure video compression settings.
    NSDictionary* videoCompressionProps = [NSDictionary dictionaryWithObjectsAndKeys:
                                           [NSNumber numberWithDouble:1024.0 * 1024.0], AVVideoAverageBitRateKey,
                                           [NSNumber numberWithInt:10], AVVideoMaxKeyFrameIntervalKey,
                                           nil];
    // Configure video settings.
    NSDictionary* videoSettings = [NSDictionary dictionaryWithObjectsAndKeys:
                                   AVVideoCodecH264, AVVideoCodecKey,
                                   [NSNumber numberWithInt:frameSize.width], AVVideoWidthKey,
                                   [NSNumber numberWithInt:frameSize.height], AVVideoHeightKey,
                                   videoCompressionProps, AVVideoCompressionPropertiesKey,
                                   nil];
    // Create the video writer that is used to append video frames to the output video
    // stream being written by videoWriter.
    videoWriterInput = [[AVAssetWriterInput assetWriterInputWithMediaType:AVMediaTypeVideo outputSettings:videoSettings] retain];
    //NSParameterAssert(videoWriterInput);
    videoWriterInput.expectsMediaDataInRealTime = YES;
    // Configure settings for the pixel buffer adaptor.
    NSDictionary* bufferAttributes = [NSDictionary dictionaryWithObjectsAndKeys:
                                      [NSNumber numberWithInt:kCVPixelFormatType_32ARGB], kCVPixelBufferPixelFormatTypeKey, nil];
    // Create the pixel buffer adaptor, used to convert the incoming video frames and
    // append them to videoWriterInput.
    avAdaptor = [[AVAssetWriterInputPixelBufferAdaptor assetWriterInputPixelBufferAdaptorWithAssetWriterInput:videoWriterInput sourcePixelBufferAttributes:bufferAttributes] retain];
    [videoWriter addInput:videoWriterInput];
    // <pb> Added audio input.
    sample_position_ = 0;
    AudioChannelLayout acl;
    bzero( &acl, sizeof(acl));
    acl.mChannelLayoutTag = kAudioChannelLayoutTag_Stereo;
    NSDictionary* audioOutputSettings = nil;
    audioOutputSettings = [NSDictionary dictionaryWithObjectsAndKeys:
                           [ NSNumber numberWithInt: kAudioFormatMPEG4AAC ], AVFormatIDKey,
                           [ NSNumber numberWithInt: 2 ], AVNumberOfChannelsKey,
                           [ NSNumber numberWithFloat: 44100.0 ], AVSampleRateKey,
                           [ NSNumber numberWithInt: 64000 ], AVEncoderBitRateKey,
                           [ NSData dataWithBytes: &acl length: sizeof( acl ) ], AVChannelLayoutKey,
                           nil];
    audioWriterInput = [[AVAssetWriterInput
                         assetWriterInputWithMediaType: AVMediaTypeAudio
                         outputSettings: audioOutputSettings ] retain];
    //audioWriterInput.expectsMediaDataInRealTime = YES;
    audioWriterInput.expectsMediaDataInRealTime = NO; // seems to work slightly better
    [videoWriter addInput:audioWriterInput];
    rateDenominator = 44100;
    rateMultiplier = rateDenominator / frameRate;
    // Add our video input stream source to the video writer and start it.
    [videoWriter startWriting];
    [videoWriter startSessionAtSourceTime:CMTimeMake(0, rateDenominator)];
    isRecording = true;
    return YES;
}
- (int) writeAudioBuffer:(float *)samples sampleCount:(size_t)n channelCount:(size_t)nchans {
    if (![self waitForAudioWriterReadiness]) {
        NSLog(@"WARNING: writeAudioBuffer dropped frame after wait limit reached.");
        return 0;
    }
    //NSLog(@"writeAudioBuffer");
    OSStatus status;
    CMBlockBufferRef bbuf = NULL;
    CMSampleBufferRef sbuf = NULL;
    size_t buflen = n * nchans * sizeof(float);
    // Create sample buffer for adding to the audio input.
    status = CMBlockBufferCreateWithMemoryBlock(
        kCFAllocatorDefault,
        samples,
        buflen,
        kCFAllocatorNull,
        NULL,
        0,
        buflen,
        0,
        &bbuf);
    if (status != noErr) {
        NSLog(@"CMBlockBufferCreateWithMemoryBlock error");
        return -1;
    }
    CMTime timestamp = CMTimeMake(sample_position_, 44100);
    sample_position_ += n;
    status = CMAudioSampleBufferCreateWithPacketDescriptions(kCFAllocatorDefault, bbuf, TRUE, 0, NULL, audio_fmt_desc_, 1, timestamp, NULL, &sbuf);
    if (status != noErr) {
        NSLog(@"CMSampleBufferCreate error");
        return -1;
    }
    BOOL r = [audioWriterInput appendSampleBuffer:sbuf];
    if (!r) {
        NSLog(@"appendSampleBuffer error");
    }
    CFRelease(bbuf);
    CFRelease(sbuf);
    return 0;
}
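Any ideas about what's going on?
Should I be creating/appending the samples in a different way?
Is it something to do with the AAC compression? It doesn't work if I try to use uncompressed audio (it throws).
As far as I can tell, I'm calculating the PTS correctly. Why is this even required for the audio channel? Shouldn't the video be synced to the audio clock?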
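To make the PTS arithmetic concrete, here is a small sketch in plain C of how the audio start position and sample count for each video frame could be derived. The function names are hypothetical, and it assumes an integer frame rate. The start position is what would go into a timestamp with a timescale of 44100. Deriving each count from the difference of two start positions keeps the running total exact even when the sample rate is not a multiple of the frame rate (44100 / 24 = 1837.5, for example), where a fixed integer `rateMultiplier` would slowly drift.

```c
#include <assert.h>
#include <stdint.h>

/* Index of the first audio sample frame belonging to video frame
 * `frame_index`; usable directly as a PTS value in sample-rate units. */
int64_t audio_start_for_frame(int64_t frame_index, int32_t sample_rate, int32_t fps)
{
    assert(frame_index >= 0 && sample_rate > 0 && fps > 0);
    return (frame_index * (int64_t)sample_rate) / fps;
}

/* Number of audio sample frames to write for video frame `frame_index`.
 * Taking the difference of two start positions means the counts always
 * add up to the exact total, with no accumulated rounding error. */
int64_t audio_count_for_frame(int64_t frame_index, int32_t sample_rate, int32_t fps)
{
    return audio_start_for_frame(frame_index + 1, sample_rate, fps)
         - audio_start_for_frame(frame_index, sample_rate, fps);
}
```

At 30 fps every frame gets 1470 sample frames; at 24 fps the counts alternate between 1837 and 1838 and still sum to 44100 per second.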
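Update
I've tried providing the audio in fixed blocks of 1024 samples, since this is the size of the DCT used by the AAC compressor. It doesn't make any difference.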
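For reference, re-blocking arbitrary-sized chunks into fixed 1024-frame blocks can be sketched in plain C roughly as follows. `Rechunker`, `rechunker_push`, and `count_frames` are hypothetical names used only for this illustration, and the sketch assumes at most two interleaved channels. Note that the block handed to the sink is reused on the next call, so a real sink would have to consume or copy the data before returning.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define BLOCK_FRAMES 1024
#define MAX_CHANNELS 2

/* Hypothetical accumulator that turns arbitrary chunk sizes into
 * fixed blocks of BLOCK_FRAMES sample frames. */
typedef struct {
    float  pending[BLOCK_FRAMES * MAX_CHANNELS];
    size_t filled;    /* sample frames currently held in `pending` */
    int    channels;
} Rechunker;

/* Callback invoked once per complete block. */
typedef void (*BlockSink)(const float *block, size_t frames, void *ctx);

/* Example sink: adds the number of frames received to a size_t counter. */
void count_frames(const float *block, size_t frames, void *ctx)
{
    (void)block;
    *(size_t *)ctx += frames;
}

/* Append `frames` interleaved sample frames; emit every block that fills up.
 * Leftover frames stay in `pending` until a later push completes the block. */
void rechunker_push(Rechunker *rc, const float *in, size_t frames,
                    BlockSink sink, void *ctx)
{
    assert(rc->channels > 0 && rc->channels <= MAX_CHANNELS);
    size_t ch = (size_t)rc->channels;
    while (frames > 0) {
        size_t room = BLOCK_FRAMES - rc->filled;
        size_t take = frames < room ? frames : room;
        memcpy(rc->pending + rc->filled * ch, in, take * ch * sizeof(float));
        rc->filled += take;
        in += take * ch;
        frames -= take;
        if (rc->filled == BLOCK_FRAMES) {
            sink(rc->pending, BLOCK_FRAMES, ctx);
            rc->filled = 0;
        }
    }
}
```

Pushing three chunks of 1470 frames emits four full blocks (4096 frames) and leaves 314 frames pending for the next push.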
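I've tried pushing all the blocks in one go before writing any video. Doesn't work.
I've tried using CMSampleBufferCreate for the remaining blocks and CMAudioSampleBufferCreateWithPacketDescriptions for the first block only. No change.
And I've tried combinations of these. Still not right.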
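Solution
It appears that:
audioWriterInput.expectsMediaDataInRealTime = YES;
is essential, otherwise the writer gets confused. Perhaps this is because the video input was set up with this flag. Additionally, CMBlockBufferCreateWithMemoryBlock does NOT copy the sample data, even if you pass it the flag kCMBlockBufferAlwaysCopyDataFlag.
So you can create a buffer with that call and then copy it using CMBlockBufferCreateContiguous, to ensure that you get a block buffer holding its own copy of the audio data. Otherwise it will reference the memory you originally passed in, and things will get messed up.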
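The ownership problem can be shown without CoreMedia at all. The sketch below is plain C with hypothetical names (`OwnedBlock`, `owned_block_init`, `owned_block_free`); it is not the CoreMedia API, just the same idea in miniature. A queued block that only stores the caller's pointer sees whatever the caller writes next, while a block that copies the samples on creation is unaffected when the scratch buffer is reused for the following chunk.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a queued audio block that owns its samples. */
typedef struct {
    float  *data;
    size_t  count;   /* number of floats held */
} OwnedBlock;

/* Copy `count` floats into freshly allocated storage so the caller may
 * reuse `samples` immediately. Returns 0 on success, -1 on allocation failure. */
int owned_block_init(OwnedBlock *blk, const float *samples, size_t count)
{
    assert(blk != NULL && samples != NULL && count > 0);
    blk->data = malloc(count * sizeof(float));
    if (blk->data == NULL) {
        blk->count = 0;
        return -1;
    }
    memcpy(blk->data, samples, count * sizeof(float));
    blk->count = count;
    return 0;
}

/* Release the storage owned by the block. */
void owned_block_free(OwnedBlock *blk)
{
    free(blk->data);
    blk->data = NULL;
    blk->count = 0;
}
```

If the writer processes appended buffers after `writeAudioBuffer` returns, which would be consistent with the behaviour described above, then the copy made via CMBlockBufferCreateContiguous plays the role of `owned_block_init` here.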