ios - How do I specify the format for AVAudioEngine mic input?

Question

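I'd like to record some audio using AVAudioEngine and the user's microphone. I already have a working sample, but I can't figure out how to specify the output format I want...

My requirement is that I receive AVAudioPCMBuffers while I speak, which the sample currently does...

Do I need to add a separate node that does some transcoding? I can't find much documentation or many samples on this problem...

I'm also a noob when it comes to audio. I know that I want NSData containing 16-bit PCM with a maximum sample rate of 16000 (8000 would be better).

Here's my working sample: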

private var audioEngine = AVAudioEngine()
private let bus = 0 // index of the input bus used below

func startRecording() {

  let format = audioEngine.inputNode!.inputFormatForBus(bus)

  audioEngine.inputNode!.installTapOnBus(bus, bufferSize: 1024, format: format) { (buffer: AVAudioPCMBuffer, time:AVAudioTime) -> Void in

     let audioFormat = buffer.format
     print("\(audioFormat)")
  }

  audioEngine.prepare()
  do {
     try audioEngine.start()
  } catch { /* Imagine some super awesome error handling here */ }
}

If I change the format to, let's say,

let format = AVAudioFormat(commonFormat: AVAudioCommonFormat.PCMFormatInt16, sampleRate: 8000.0, channels: 1, interleaved: false)

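then it produces an error saying that the sample rate needs to be the same as the hwInput...

Any help is very much appreciated!!!

EDIT: I just found AVAudioConverter, but I need to be compatible with iOS 8 as well...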

score 24 · Accepted Answer

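You cannot change the audio format directly on the input or output nodes. For the microphone, the format is dictated by the hardware (typically 44 kHz, 1 channel, 32-bit). To get a different format, you need to insert a mixer in between. Then, when you connect inputNode > changeformatMixer > mainEngineMixer, you can specify the details of the format you want.

Something like: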

var inputNode = audioEngine.inputNode
var downMixer = AVAudioMixerNode()

//I think the engine's I/O nodes are already attached to it by default, so we attach only the downMixer here:
audioEngine.attachNode(downMixer)

//You can tap the downMixer to intercept the audio and do something with it:
downMixer.installTapOnBus(0, bufferSize: 2048, format: downMixer.outputFormatForBus(0), block:  //originally 1024
            { (buffer: AVAudioPCMBuffer!, time: AVAudioTime!) -> Void in
                print("downMixer Tap")
                print("Downmixer Tap Format: " + self.downMixer.outputFormatForBus(0).description)
        })

//let's get the input audio format right as it is
let format = inputNode.inputFormatForBus(0)
//I initialize the 16KHz format I need:
let format16KHzMono = AVAudioFormat.init(commonFormat: AVAudioCommonFormat.PCMFormatInt16, sampleRate: 16000.0, channels: 1, interleaved: true)

//connect the nodes inside the engine:
//INPUT NODE --format-> downMixer --16Kformat--> mainMixer
//as you can see I m downsampling the default 44khz we get in the input to the 16Khz I want 
audioEngine.connect(inputNode, to: downMixer, format: format)//use default input format
audioEngine.connect(downMixer, to: audioEngine.mainMixerNode, format: format16KHzMono)//use new audio format
//run the engine
audioEngine.prepare()
try! audioEngine.start()

That said, I would recommend using an open framework such as EZAudio.

score 3 · Accepted Answer

The only thing I found that worked to change the sampling rate was

try AVAudioSession.sharedInstance().setPreferredSampleRate(...)

You can then tap engine.inputNode and use the input node's output format:

engine.inputNode.installTap(onBus: 0, bufferSize: 2048,
                            format: engine.inputNode.outputFormat(forBus: 0)) { buffer, time in
    // process the captured buffer here
}

Unfortunately, there is no guarantee that you will get the sample rate you ask for, although 8000, 12000, 16000, 22050 and 44100 all seemed to work.
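Because the preferred rate is only a request, it may help to read back what the session actually granted before building any formats. The following is a minimal sketch for iOS; the function name `requestSampleRate` and the choice of the `.record` category are assumptions made for illustration, not part of the answer above.

```swift
import AVFoundation

/// Asks the shared audio session for `preferred` Hz and returns the rate
/// that was actually granted. The helper name and the `.record` category
/// are illustrative assumptions.
func requestSampleRate(_ preferred: Double) throws -> Double {
    let session = AVAudioSession.sharedInstance()
    try session.setCategory(.record, mode: .default)
    try session.setPreferredSampleRate(preferred)
    try session.setActive(true)
    // The preferred rate is only a hint, so read back the effective value.
    return session.sampleRate
}
```

If the returned value differs from the requested one, the tap will deliver buffers at the returned rate, and any further conversion has to happen in your own code.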

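The following did NOT work:

Setting my custom format in the tap on engine.inputNode. (Exception)
Adding a mixer with my custom format and tapping that. (Exception)
Adding a mixer, connecting it with the inputNode's format, connecting the mixer to the main mixer with my custom format, then removing the input of the outputNode so the audio is not sent to the speaker and does not cause instant feedback. (Worked, but the samples were all zeros)
Not using my custom format in the AVAudioEngine at all, and using AVAudioConverter to convert from the hardware rate in my tap. (The length of the buffer was not set, so there was no way to tell whether the results were correct)

This was on iOS 12.3.1.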

score 1 · Accepted Answer

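To change the sample rate of the input node, you first have to connect the input node to a mixer node and pass the new format as the format parameter.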

let input = avAudioEngine.inputNode
let mainMixer = avAudioEngine.mainMixerNode
let newAudioFormat = AVAudioFormat(commonFormat: .pcmFormatFloat32, sampleRate: 44100, channels: 1, interleaved: true)
avAudioEngine.connect(input, to: mainMixer, format: newAudioFormat)

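Now you can call installTap on the input node with newAudioFormat.

One more thing I'd like to point out: since the launch of the iPhone 12, the default sample rate of the input node is no longer 44100. It has been raised to 48000.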
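Since the default rate differs between devices, it is probably safer to ask the input node for its format at runtime than to hard-code 44100 or 48000. A small sketch, where `hardwareInputFormat` is a hypothetical helper name:

```swift
import AVFoundation

/// Returns the format the input node currently delivers, so the caller
/// does not have to assume a particular hardware sample rate.
/// `hardwareInputFormat` is a hypothetical helper name.
func hardwareInputFormat(of engine: AVAudioEngine) -> AVAudioFormat {
    return engine.inputNode.outputFormat(forBus: 0)
}

let engine = AVAudioEngine()
let hardwareFormat = hardwareInputFormat(of: engine)
print("input: \(hardwareFormat.sampleRate) Hz, \(hardwareFormat.channelCount) channel(s)")
```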

score 0 · Accepted Answer

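You cannot change the configuration of the input node. Try creating a mixer node with the format you want, attach it to the engine, connect it to the input node, and then connect the mainMixer to the node you just created. Now you can install a tap on this node to get the PCM data.

Note that, for some strange reason, you don't have much choice of sample rate! At least not on iOS 9.1: use the standard 11025, 22050 or 44100. Any other sample rate will fail!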
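If the tap ends up delivering 32-bit float samples and you need 16-bit PCM, the sample-format part of the conversion is plain arithmetic and does not depend on AVAudioConverter. A minimal sketch, assuming the float samples are in the usual -1.0...1.0 range; `int16Samples(from:)` is a hypothetical helper, and it does not change the sample rate:

```swift
/// Converts normalized Float samples (-1.0...1.0) to signed 16-bit PCM.
/// Out-of-range input is clamped instead of overflowing.
/// `int16Samples(from:)` is a hypothetical helper name.
func int16Samples(from floats: [Float]) -> [Int16] {
    return floats.map { sample in
        let clamped = max(-1.0, min(1.0, sample))
        return Int16((clamped * Float(Int16.max)).rounded())
    }
}

let converted = int16Samples(from: [0.0, 1.0, -1.0, 2.0])
print(converted) // [0, 32767, -32767, 32767]
```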

score 0 · Accepted Answer

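If you only need to change the sample rate and channel count, I recommend using the low-level API. You don't need a mixer or a converter. Here you can find the Apple documentation on low-level recording. If you want, you can wrap it in an Objective-C class and add a protocol.

Audio Queue Services Programming Guide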
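With Audio Queue Services, the recording format is described by an AudioStreamBasicDescription that you fill in yourself. As a sketch of what 16-bit mono linear PCM might look like (the function name is hypothetical, and the rate is simply the one from the question):

```swift
import AudioToolbox

/// Builds a description for packed, signed 16-bit mono linear PCM.
/// `makeInt16MonoDescription` is a hypothetical helper name.
func makeInt16MonoDescription(sampleRate: Double) -> AudioStreamBasicDescription {
    let bytesPerSample = UInt32(MemoryLayout<Int16>.size)
    return AudioStreamBasicDescription(
        mSampleRate: sampleRate,
        mFormatID: kAudioFormatLinearPCM,
        mFormatFlags: kLinearPCMFormatFlagIsSignedInteger | kLinearPCMFormatFlagIsPacked,
        mBytesPerPacket: bytesPerSample,   // one frame per packet for PCM
        mFramesPerPacket: 1,
        mBytesPerFrame: bytesPerSample,    // mono: one sample per frame
        mChannelsPerFrame: 1,
        mBitsPerChannel: 16,
        mReserved: 0)
}

let description = makeInt16MonoDescription(sampleRate: 16000)
```

A description like this would then be passed when creating the input queue; the queue setup itself is covered in the guide.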

score 0 · Accepted Answer

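If your goal is simply to end up with AVAudioPCMBuffers that contain audio in your desired format, you can convert the buffers returned in the tap block using AVAudioConverter. This way, you don't actually need to know or care what the format of the inputNode is.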

class MyBufferRecorder {
    
    private let audioEngine:AVAudioEngine = AVAudioEngine()
    private var inputNode:AVAudioInputNode!
    private let audioQueue:DispatchQueue = DispatchQueue(label: "Audio Queue 5000")
    private var isRecording:Bool = false
    
    func startRecording() {
        
        if (isRecording) {
            return
        }
        isRecording = true
        
        // must convert (unknown until runtime) input format to our desired output format
        inputNode = audioEngine.inputNode
        let inputFormat:AVAudioFormat! = inputNode.outputFormat(forBus: 0)
    
        // 9600 is somewhat arbitrary... min seems to be 4800, max 19200... it doesn't matter what we set
        // because we don't re-use this value -- we query the buffer returned in the tap block for its true length.
        // Using [weak self] in the tap block is probably a better idea, but it results in weird warnings for now
        inputNode.installTap(onBus: 0, bufferSize: AVAudioFrameCount(9600), format: inputFormat) { (buffer, time) in
            
            // not sure if this is necessary
            if (!self.isRecording) {
                print("\nDEBUG - rejecting callback, not recording")
                return }
            
            // not really sure if/why this needs to be async
            self.audioQueue.async {

                // Convert recorded buffer to our preferred format
                
                let convertedPCMBuffer = AudioUtils.convertPCMBuffer(bufferToConvert: buffer, fromFormat: inputFormat, toFormat: AudioUtils.desiredFormat)
            
                // do something with converted buffer
            }
        }
        do {
            // important not to start engine before installing tap
            try audioEngine.start()
        } catch {
            print("\nDEBUG - couldn't start engine!")
            return
        }
        
    }
    
    func stopRecording() {
        print("\nDEBUG - recording stopped")
        isRecording = false
        inputNode.removeTap(onBus: 0)
        audioEngine.stop()
    }
    
}

Separate class:

import Foundation
import AVFoundation

// assumes we want 16bit, mono, 44100hz
// change to what you want
class AudioUtils {
    
    static let desiredFormat:AVAudioFormat! = AVAudioFormat(commonFormat: .pcmFormatInt16, sampleRate: Double(44100), channels: 1, interleaved: false)
    
    // PCM <--> PCM
    static func convertPCMBuffer(bufferToConvert: AVAudioPCMBuffer, fromFormat: AVAudioFormat, toFormat: AVAudioFormat) -> AVAudioPCMBuffer {
        
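        // Scale the capacity by the sample-rate ratio so the output still fits when resampling
        let ratio = toFormat.sampleRate / fromFormat.sampleRate
        let capacity = AVAudioFrameCount((Double(bufferToConvert.frameLength) * ratio).rounded(.up))
        let convertedPCMBuffer = AVAudioPCMBuffer(pcmFormat: toFormat, frameCapacity: capacity)
        var error: NSError? = nil
        
        // Hand the input buffer to the converter only once; later requests report that no more data is available for now
        var suppliedInput = false
        let inputBlock:AVAudioConverterInputBlock = {inNumPackets, outStatus in
            if suppliedInput {
                outStatus.pointee = AVAudioConverterInputStatus.noDataNow
                return nil
            }
            suppliedInput = true
            outStatus.pointee = AVAudioConverterInputStatus.haveData
            return bufferToConvert
        }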
        let formatConverter:AVAudioConverter = AVAudioConverter(from:fromFormat, to: toFormat)!
        formatConverter.convert(to: convertedPCMBuffer!, error: &error, withInputFrom: inputBlock)
        
        if error != nil {
            print("\nDEBUG - " + error!.localizedDescription)
        }
        
        return convertedPCMBuffer!
        
    }
}

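This is by no means production-ready code; I'm also learning iOS audio... so please let me know about any errors, best practices, or dangerous things going on in that code, and I'll keep this answer updated.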
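Since the question asks for NSData in the end, here is a sketch of how a converted 16-bit mono buffer could be copied into a Data value. `pcmData(from:)` is a hypothetical helper; it assumes a mono, non-interleaved Int16 buffer such as the one produced with desiredFormat above and returns nil for other sample formats.

```swift
import AVFoundation

/// Copies the samples of a mono, non-interleaved Int16 buffer into Data.
/// Returns nil if the buffer does not hold Int16 samples.
/// `pcmData(from:)` is a hypothetical helper name.
func pcmData(from buffer: AVAudioPCMBuffer) -> Data? {
    guard let channels = buffer.int16ChannelData else { return nil }
    // frameLength is the number of valid frames, which may be less than the capacity.
    let byteCount = Int(buffer.frameLength) * MemoryLayout<Int16>.size
    return Data(bytes: channels[0], count: byteCount)
}
```

Data bridges to NSData, so the result can be handed to APIs that still expect NSData.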
