
I know almost nothing about signal processing. I'm currently trying to implement a function in Swift that triggers an event when the sound pressure level increases (for example, when a human screams).

I'm tapping into the input node of an AVAudioEngine with a callback like this:

let recordingFormat = inputNode.outputFormat(forBus: 0)
inputNode.installTap(onBus: 0, bufferSize: 1024, format: recordingFormat) {
    (buffer: AVAudioPCMBuffer, when: AVAudioTime) in
    let arraySize = Int(buffer.frameLength)
    let samples = Array(UnsafeBufferPointer(start: buffer.floatChannelData![0], count: arraySize))

    // do something with samples
    let volume = 20 * log10(samples.reduce(0) { $0 + $1 } / Float(arraySize))
    if !volume.isNaN {
        print("this is the current volume: \(volume)")
    }
}

After converting the buffer into a float array, I tried to get a rough estimate of the sound pressure level by computing the mean of the samples.
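A quick numeric illustration (my own, not from the question) of why a plain mean behaves badly: audio samples are roughly zero-mean, so positive and negative values cancel, and the logarithm of the tiny residual jumps around (or is NaN when the mean is negative):

```swift
import Foundation

// Tiny deterministic LCG standing in for quiet-room noise in [-0.01, 0.01).
var seed: UInt64 = 1
func nextSample() -> Float {
    seed = seed &* 6364136223846793005 &+ 1442695040888963407
    return Float(seed >> 40) / Float(1 << 24) * 0.02 - 0.01
}

let samples = (0..<1024).map { _ in nextSample() }
// Positive and negative samples nearly cancel: the mean is a tiny
// residual, not a loudness measure.
let mean = samples.reduce(0, +) / Float(samples.count)
let volume = 20 * log10(mean)
print("mean:", mean, "volume:", volume)
```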

But this gives me values that fluctuate a lot, even with the iPad just sitting in a quiet room:

this is the current volume: -123.971
this is the current volume: -119.698
this is the current volume: -147.053
this is the current volume: -119.749
this is the current volume: -118.815
this is the current volume: -123.26
this is the current volume: -118.953
this is the current volume: -117.273
this is the current volume: -116.869
this is the current volume: -110.633
this is the current volume: -130.988
this is the current volume: -119.475
this is the current volume: -116.422
this is the current volume: -158.268
this is the current volume: -118.933

If I clap my hands near the microphone, this value does increase significantly.

So I could do something like first computing the mean of these volumes during a preparation phase, and then checking whether the difference increases significantly during the event-triggering phase:

if !volume.isNaN {
    if isInThePreparingPhase {
        print("this is the current volume: \(volume)")
        volumeSum += volume
        volumeCount += 1
    } else if isInTheEventTriggeringPhase {
        if volume > meanVolume {
            // triggers an event
        }
    }
}

where meanVolume is computed during the transition from the preparation phase to the event-triggering phase: meanVolume = volumeSum / Float(volumeCount)

……

However, if I play loud music near the microphone, there doesn't seem to be a significant increase. And in rare cases, volume is greater than meanVolume even when there was no significant increase in the ambient sound (as far as the human ear can tell).

So what is the correct way to extract the sound pressure level from an AVAudioPCMBuffer?

Wikipedia gives this formula:

Lp = 20 * log10(p / p0) dB

where p is the root-mean-square sound pressure and p0 is the reference sound pressure.

But I don't know what the float values in AVAudioPCMBuffer.floatChannelData represent. The Apple page only says

The buffer's audio samples as floating point values.

How should I work with them?
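For reference, here is a sketch of my own (not from the question): if one assumes the floats are normalized PCM samples in [-1, 1] and uses full scale (1.0) as the reference p0, the Wikipedia formula suggests computing the root mean square first and only then taking the logarithm:

```swift
import Foundation

// Sketch assuming samples are normalized to [-1, 1]; with p0 = 1.0
// (full scale) the result is in dBFS, not calibrated dB SPL.
func rmsDecibels(_ samples: [Float]) -> Float {
    guard !samples.isEmpty else { return -Float.infinity }
    let meanSquare = samples.reduce(0) { $0 + $1 * $1 } / Float(samples.count)
    return 20 * log10(sqrt(meanSquare))
}

// A full-scale sine wave has RMS 1/sqrt(2), i.e. about -3 dBFS.
let sine = (0..<1024).map { Float(sin(2 * Double.pi * Double($0) / 64)) }
print(rmsDecibels(sine)) // ≈ -3.01
```

Squaring before averaging is what keeps the positive and negative half-waves from cancelling each other out.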


2 Answers


I think the first step is to get the envelope of the sound. You could use simple averaging to calculate the envelope, but you need to add a rectification step (usually meaning using abs() or square() to make all samples positive).

More commonly, a simple IIR filter is used instead of averaging, with different attack and decay constants; here is a lab. Note that these constants depend on the sample frequency. You can use this formula to calculate the constants:

1 - exp(-timePerSample*2/smoothingTime)
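For example (the smoothing times below are my own illustrative picks, not values from the answer), at a 44.1 kHz sample rate the formula gives roughly:

```swift
import Foundation

// The answer's formula: coefficient = 1 - exp(-timePerSample * 2 / smoothingTime)
let sampleRate = 44_100.0
let timePerSample = 1.0 / sampleRate

func coefficient(smoothingTime: Double) -> Float {
    Float(1 - exp(-timePerSample * 2 / smoothingTime))
}

let envConstantAtk = coefficient(smoothingTime: 0.001) // fast attack, ~1 ms
let envConstantDec = coefficient(smoothingTime: 0.100) // slow decay, ~100 ms
print(envConstantAtk, envConstantDec)
```

A fast attack lets the envelope jump up quickly on a scream or clap, while the slow decay keeps it from collapsing between waveform peaks.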

Step 2

When you have the envelope, you can smooth it with an additional filter, and then compare the two envelopes to find a sound that is louder than the base level. Here's a more complete lab.
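A rough sketch of that comparison step (the coefficients, the 2x factor, and the small floor are illustrative guesses of mine, not values from the linked lab):

```swift
import Foundation

// One-pole smoother; `coeff` comes from the formula in step 1.
func smoothed(_ input: [Float], coeff: Float) -> [Float] {
    var state: Float = 0
    return input.map { x in
        state += coeff * (x - state)
        return state
    }
}

// Trigger where the (fast) envelope exceeds the slowly-tracked base
// level by some factor, plus a small floor to ignore silence.
func eventFlags(envelope: [Float]) -> [Bool] {
    let base = smoothed(envelope, coeff: 0.001)
    return zip(envelope, base).map { $0 > 2 * $1 + 0.01 }
}

// Quiet signal with a loud burst in the middle:
var env = [Float](repeating: 0.005, count: 2000)
for i in 1000..<1100 { env[i] = 0.5 }
let flags = eventFlags(envelope: env)
print(flags[1050], flags[10]) // prints "true false"
```

Because the base level adapts slowly, a sustained loud environment eventually stops triggering, which is usually what you want for "event" detection.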

Note that detecting audio "events" can be quite tricky and hard to predict; make sure you have a lot of debugging aids!

answered 2016-10-14T05:34:09.833

Thanks to the response from @teadrinker I finally found a solution to this problem. Here is my Swift code that outputs the volume of an AVAudioPCMBuffer input:

private func getVolume(from buffer: AVAudioPCMBuffer, bufferSize: Int) -> Float {
    guard let channelData = buffer.floatChannelData?[0] else {
        return 0
    }

    let channelDataArray = Array(UnsafeBufferPointer(start: channelData, count: bufferSize))

    var outEnvelope = [Float]()
    var envelopeState: Float = 0
    let envConstantAtk: Float = 0.16
    let envConstantDec: Float = 0.003

    for sample in channelDataArray {
        let rectified = abs(sample)

        if envelopeState < rectified {
            envelopeState += envConstantAtk * (rectified - envelopeState)
        } else {
            envelopeState += envConstantDec * (rectified - envelopeState)
        }
        outEnvelope.append(envelopeState)
    }

    // 0.015 is a noise-gate threshold that filters out the noise
    // floor coming in from the microphone
    if let maxVolume = outEnvelope.max(),
        maxVolume > Float(0.015) {
        return maxVolume
    } else {
        return 0.0
    }
}
answered 2018-05-08T15:49:42.150