
I'm receiving a stream of 16 bit / 48 kHz stereo PCM samples as Int16s, and I'm trying to play them using AVAudioEngine, but I don't hear anything at all. I think it's either related to the way I set up the player or to the way I'm pushing the data into the buffer.

I have read a lot about alternative solutions using Audio Queue Services, but all the sample code I could find is either in Objective-C or iOS-only.

If I had some kind of frameSize mismatch or any other problem, shouldn't I at least still hear garbage coming out of the speakers?

Here is my code:


import Foundation
import AVFoundation

class VoicePlayer {
    
    var engine: AVAudioEngine
    
    let format = AVAudioFormat(commonFormat: AVAudioCommonFormat.pcmFormatInt16, sampleRate: 48000.0, channels: 2, interleaved: true)!
    let playerNode: AVAudioPlayerNode!
    var audioSession: AVCaptureSession = AVCaptureSession()
    
    init() {
        
        self.audioSession = AVCaptureSession()
        
        self.engine = AVAudioEngine()
        self.playerNode = AVAudioPlayerNode()
        
        self.engine.attach(self.playerNode)
        //engine.connect(self.playerNode, to: engine.mainMixerNode, format:AVAudioFormat.init(standardFormatWithSampleRate: 48000, channels: 2))
        /* If I set my custom format here, AVFoundation complains about the format not being available */
        engine.connect(self.playerNode, to: engine.outputNode, format:AVAudioFormat.init(standardFormatWithSampleRate: 48000, channels: 2))
        engine.prepare()
        try! engine.start()
        self.playerNode.play()
        
    }
    
    
    
    
    func play(buffer: [Int16]) {
        let interleavedChannelCount = 2
        let frameLength = buffer.count / interleavedChannelCount
        let audioBuffer = AVAudioPCMBuffer(pcmFormat: format, frameCapacity: AVAudioFrameCount(frameLength))!
        print("audio buffer size in frames is \(AVAudioFrameCount(frameLength))")
        // buffer contains 2 channel interleaved data
        // audioBuffer contains 2 channel interleaved data
        var buf = buffer
        let size = MemoryLayout<Int16>.stride * interleavedChannelCount * frameLength
        
        
        memcpy(audioBuffer.mutableAudioBufferList.pointee.mBuffers.mData, &buf, size)
        audioBuffer.frameLength = AVAudioFrameCount(frameLength)
        
        /* Implemented an AVAudioConverter for testing
         Input: 16 bit PCM 48kHz stereo interleaved
         Output: whatever the standard format for the system is
         
         Maybe this is somehow needed as my audio interface doesn't directly support 16 bit audio and can only run at 24 bit?
         */
         let normalBuffer = AVAudioPCMBuffer(pcmFormat: AVAudioFormat.init(standardFormatWithSampleRate: 48000, channels: 2)!, frameCapacity: AVAudioFrameCount(frameLength))
         normalBuffer?.frameLength = AVAudioFrameCount(frameLength)
         let converter = AVAudioConverter(from: format, to: AVAudioFormat.init(standardFormatWithSampleRate: 48000, channels: 2)!)
         var gotData = false
         
         let inputBlock: AVAudioConverterInputBlock = { inNumPackets, outStatus in
         
         if gotData {
         outStatus.pointee = .noDataNow
         return nil
         }
         gotData = true
         outStatus.pointee = .haveData
         return audioBuffer
         }
         
         var error: NSError? = nil
         let status: AVAudioConverterOutputStatus = converter!.convert(to: normalBuffer!, error: &error, withInputFrom: inputBlock);
         
        // Play the output buffer, in this case the audioBuffer, otherwise the normalBuffer
        // Playing the raw audio buffer causes an EXEC_BAD_ACCESS on playback, playing back the buffer from the converter doesn't, but it still doesn't sound anything like a human voice
        self.playerNode.scheduleBuffer(audioBuffer) {
        print("Played")
        }
        
        
    }
    
    
}

Any help would be greatly appreciated.


2 Answers


I had a very similar problem and managed to solve it using this post as a starting point. The differences: I use Float instead of Int16 data, and I have some C code that receives audio data from the network and writes it to the head of a circular buffer (I use TPCircularBuffer). My Swift code then reads data from the tail of the same circular buffer, copies it into an AVAudioPCMBuffer, and passes that to playerNode.scheduleBuffer(). I use a function that fetches the next chunk of data this way, calls scheduleBuffer, and passes itself as the scheduleBuffer completion handler, so data keeps getting scheduled. I got the best results with two such scheduling threads.
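The self-rescheduling completion-handler pattern described above can be sketched without any AVFoundation dependency (the chunk values and names here are stand-ins; in real code the recursive call happens asynchronously when the player finishes a buffer):

```swift
// Self-rescheduling sketch: each completed chunk triggers scheduling of the
// next one. AVFoundation is stubbed out so this runs anywhere; in real code
// the recursion happens via playerNode.scheduleBuffer's completionHandler.
var pendingChunks = [1, 2, 3, 4]   // stand-ins for buffers read from the ring buffer
var scheduledChunks: [Int] = []

func scheduleNextData() {
    guard !pendingChunks.isEmpty else { return }
    let chunk = pendingChunks.removeFirst()
    scheduledChunks.append(chunk)  // real code: playerNode.scheduleBuffer(buffer, completionHandler: scheduleNextData)
    scheduleNextData()             // here the "completion" fires immediately
}

scheduleNextData()
```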

Otherwise the situation is the same. My input data is interleaved, just like in your case. As far as I can tell, interleaved audio is not supported by AVAudioEngine. If you specify interleaved stereo as the input format in engine.connect(), you get a crash pointing to setFormat(). If instead you specify a non-interleaved format, as you did above, you are essentially misinforming the framework about the data it is going to handle, and you get the EXC_BAD_ACCESS error.
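A quick way to see why a format mismatch produces noise rather than speech: when the declared format disagrees with the bytes actually in the buffer, the engine effectively reinterprets the raw memory. A minimal sketch of that reinterpretation, with no AVFoundation involved:

```swift
// Interleaved Int16 samples: 4 samples = 8 bytes of audio data.
let samples: [Int16] = [1000, -1000, 2000, -2000]

// Misread those bytes as Float32 — effectively what happens when the buffer's
// declared format disagrees with its contents: 4 Int16 become 2 Floats with
// meaningless magnitudes instead of 4 properly scaled samples.
let misread: [Float] = samples.withUnsafeBytes { raw in
    Array(raw.bindMemory(to: Float.self))
}
```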

To solve this, I first tried copying the data from the circular buffer into an interleaved AVAudioPCMBuffer. I then used an AVAudioConverter to produce a new, deinterleaved AVAudioPCMBuffer, which I scheduled for playback. At that point I finally had audio. At first it was just a series of randomly distorted clicks, probably similar to what you hear after your conversion. In my case I traced the problem down to a simple miscalculation: when computing byte counts I had been dividing the sample count by numChannels and MemoryLayout<Float>.stride instead of multiplying by them, so I was only hearing a 64th of each packet. After fixing that I heard actual speech, but the code was too slow to keep up with the real-time arrival of the data. I ended up copying and converting the data in a single pass, throwing out the AVAudioConverter and doing it manually. See my code below.
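For the asker's Int16 input, the same one-pass copy-and-convert step would look roughly like this (a minimal sketch with no AVFoundation dependency; `deinterleaveToFloat` is a hypothetical helper name, and 32768 is the usual Int16-to-Float normalization factor):

```swift
// Convert interleaved 16-bit PCM into per-channel Float planes, normalized
// to the [-1.0, 1.0) range that AVAudioEngine's standard format expects.
func deinterleaveToFloat(_ interleaved: [Int16], channelCount: Int) -> [[Float]] {
    let frameCount = interleaved.count / channelCount
    var planes = [[Float]](repeating: [Float](repeating: 0, count: frameCount),
                           count: channelCount)
    for frame in 0..<frameCount {
        for channel in 0..<channelCount {
            let sample = interleaved[frame * channelCount + channel]
            planes[channel][frame] = Float(sample) / 32768.0
        }
    }
    return planes
}
```

In real code the inner loop would write straight into `buffer.floatChannelData![channel]`, avoiding the intermediate arrays.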

Two more points about your code:

  1. I find it more convenient to access an AVAudioPCMBuffer's data through buffer.floatChannelData (or buffer.int16ChannelData in your case) than through buffer.mutableAudioBufferList.pointee.mBuffers.mData.
  2. I'm not sure why you are using an AVCaptureSession as your audio session. Maybe you have good reasons, but I personally use AVAudioSession.sharedInstance() and audioSession.setCategory(.playback, mode: .spokenAudio, policy: .longForm) for speech playback, and that works well.

Here are the relevant parts of the working code:

    init() {
        do {
            if #available(iOS 11.0, *) {
                try audioSession.setCategory(.playback, mode: .spokenAudio, policy: .longForm)
            } else {
                try audioSession.setCategory(.playback, mode: .spokenAudio)
            }
        } catch {
            print("Failed to set audio session category. Error: \(error)")
        }
        engine = AVAudioEngine()
        playerNode = AVAudioPlayerNode()
        outputFormat = AVAudioFormat(standardFormatWithSampleRate: sampleRate, channels: numChannels)!
        circularBuffer = TPCircularBuffer()
        
        engine.attach(playerNode)
        engine.connect(playerNode, to: engine.outputNode, format: outputFormat)
        engine.prepare()
    }

    func start() {
        isPlayRequested = true
        _TPCircularBufferInit(&circularBuffer, bufferLength, MemoryLayout<TPCircularBuffer>.stride)
        networkStream_start(&circularBuffer) //this starts a loop in C code
        do {
            try audioSession.setActive(true)
        } catch {
            print("Failed to start audio session. Error: \(error)")
        }
        do {
            try engine.start()
        } catch {
            print("Failed to start audio engine. Error: \(error)")
        }

        for _ in 1...numSchedulers {
            scheduleNextData()
        }
        playerNode.play()
    }
    
    func getAndDeinterleaveNextData () -> AVAudioPCMBuffer {
        let inputBufferTail = TPCircularBufferTail(&circularBuffer, &availableBytes)
        let outputBuffer = AVAudioPCMBuffer(pcmFormat: outputFormat, frameCapacity: bufferLength)!
        if inputBufferTail != nil {
            let sampleCount = Int(availableBytes / numSchedulers / floatSize)
            let tailFloatPointer = inputBufferTail!.bindMemory(to: Float.self, capacity: sampleCount)
            for channel in 0..<Int(numChannels) {
                for sampleIndex in 0..<sampleCount {
                    outputBuffer.floatChannelData![channel][sampleIndex] = tailFloatPointer[sampleIndex * Int(numChannels) + channel]
                }
            }
            outputBuffer.frameLength = AVAudioFrameCount(sampleCount / Int(numChannels))
            TPCircularBufferConsume(&circularBuffer, outputBuffer.frameLength * numChannels * floatSize)
        }
        return outputBuffer
    }
    
    func scheduleNextData() {
        if isPlayRequested {
            let outputBuffer = getAndDeinterleaveNextData()
            playerNode.scheduleBuffer(outputBuffer, completionHandler: scheduleNextData)
        }
    }
answered Jan 12, 2022 at 9:41

After copying the data into an AVAudioPCMBuffer, you need to set its frameLength property to indicate how much valid audio it contains.

func play(buffer: [Int16]) {
    let interleavedChannelCount = 2
    let frameLength = buffer.count / interleavedChannelCount
    let audioBuffer = AVAudioPCMBuffer(pcmFormat: format, frameCapacity: AVAudioFrameCount(frameLength))!

    // buffer contains 2 channel interleaved data
    // audioBuffer contains 2 channel interleaved data

    var buf = buffer
    memcpy(audioBuffer.mutableAudioBufferList.pointee.mBuffers.mData, &buf, MemoryLayout<Int16>.stride * interleavedChannelCount * frameLength)

    audioBuffer.frameLength = AVAudioFrameCount(frameLength)

    self.playerNode.scheduleBuffer(audioBuffer) {
        print("Played")
    }
}

Edit: updated for the changes to the question. The old, now irrelevant part:

Part of the problem is that your formats are inconsistent. format is declared as non-interleaved, but buffer is a single array of Int16 and therefore presumably represents interleaved data. Directly copying one into the other is probably incorrect.

answered Dec 25, 2020 at 18:23