java - Java algorithm for normalizing audio

Question

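I'm trying to normalize an audio file of speech.

Specifically, where an audio file contains peaks in volume, I'm trying to level it out, so that the quiet sections are louder and the peaks are quieter.

I know very little about audio manipulation beyond what I've learned from working on this task. Also, my math is embarrassingly weak.

I've done some research, and the Xuggle site provides a sample that shows reducing the volume using the following code: (full version here)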

@Override
public void onAudioSamples(IAudioSamplesEvent event)
{
  // get the raw audio bytes and adjust their values

  ShortBuffer buffer = event.getAudioSamples().getByteBuffer().asShortBuffer();
  for (int i = 0; i < buffer.limit(); ++i)
    buffer.put(i, (short)(buffer.get(i) * mVolume));

  super.onAudioSamples(event);
}

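Here, they modify the bytes in getAudioSamples() by a constant, mVolume.

Building on this approach, I've attempted a normalization that modifies the bytes in getAudioSamples() to normalized values, taking the max/min in the file into account. (See below for details.) I have a simple filter to leave "silence" alone (i.e., anything below a threshold value).

I'm finding that the output file is very noisy (i.e., the quality is seriously degraded). I assume that the error is either in my normalization algorithm or in the way I manipulate the bytes. However, I'm unsure of where to go next.

Here's an abridged version of what I'm currently doing.

Step 1: Find peaks in file:

Reads the full audio file, and finds the highest and lowest values of buffer.get() for all AudioSamples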

    @Override
    public void onAudioSamples(IAudioSamplesEvent event) {
        IAudioSamples audioSamples = event.getAudioSamples();
        ShortBuffer buffer = 
           audioSamples.getByteBuffer().asShortBuffer();

        short min = Short.MAX_VALUE;
        short max = Short.MIN_VALUE;
        for (int i = 0; i < buffer.limit(); ++i) {
            short value = buffer.get(i);
            min = (short) Math.min(min, value);
            max = (short) Math.max(max, value);
        }
        // assignment of min/max omitted for brevity.
        super.onAudioSamples(event);

    }

Step 2: Normalize all values:

In a loop similar to step 1, replace the buffer contents with normalized values, calling:

    buffer.put(i, normalize(buffer.get(i)));

public short normalize(short value) {
    if (isBackgroundNoise(value))
        return value;

    short rawMin = // min from step1
    short rawMax = // max from step1
    short targetRangeMin = 1000;
    short targetRangeMax = 8000;

    int abs = Math.abs(value);
    double a = (abs - rawMin) * (targetRangeMax - targetRangeMin);
    double b = (rawMax - rawMin);
    double result = targetRangeMin + ( a/b );

     // Copy the sign of value to result.
    result = Math.copySign(result,value);
    return (short) result;
}

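Questions:

Is this a valid approach for attempting to normalize an audio file?
Is my math in normalize() valid?
Why would this cause the file to become noisy, where a similar approach in the demo code doesn't?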

score 9 · Accepted Answer

I don't think the concept of "minimum sample value" is very meaningful, since the sample value just represents the current "height" of the sound wave at a certain time instant. I.e. its absolute value will vary between the peak value of the audio clip and zero. Thus, having a targetRangeMin seems to be wrong and will probably cause some distortion of the waveform.

I think a better approach might be to have some sort of weight function that decreases the sample value based on its size, i.e. bigger values are decreased by a larger percentage than smaller values. This would also introduce some distortion, but it would probably not be very noticeable.

Edit: here is a sample implementation of such a method:

public short normalize(short value) {
    short rawMax = // max from step1
    short targetMax = 8000;

    //This is the maximum volume reduction
    double maxReduce = 1 - targetMax/(double)rawMax;

    int abs = Math.abs(value);
    double factor = (maxReduce * abs/(double)rawMax);

    return (short) Math.round((1 - factor) * value); 
}

For reference, this is what your algorithm did to a sine curve with an amplitude of 10000:

(Figure: original algorithm)

This explains why the audio quality becomes much worse after being normalized.

This is the result after running with my suggested normalize method:

(Figure: suggested algorithm)
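The difference can also be seen numerically. The following is only a rough sketch: it assumes step 1 found a minimum of -10000 and a maximum of 10000 (a sine wave with amplitude 10000), and it leaves out the isBackgroundNoise() check because that helper is not shown in the question. The class and method names are made up for illustration.

```java
final class NormalizeComparison {
    // Assumed step-1 results for a sine wave with amplitude 10000.
    static final short RAW_MIN = -10000;
    static final short RAW_MAX = 10000;

    // The mapping from the question, minus the isBackgroundNoise() check
    // (assumption: that helper is not shown, so it is omitted here).
    static short questionNormalize(short value) {
        short targetRangeMin = 1000;
        short targetRangeMax = 8000;
        int abs = Math.abs(value);
        double a = (abs - RAW_MIN) * (double) (targetRangeMax - targetRangeMin);
        double b = RAW_MAX - RAW_MIN;
        double result = targetRangeMin + a / b;
        return (short) Math.copySign(result, value);
    }

    // The weighted mapping suggested in this answer.
    static short weightedNormalize(short value) {
        short targetMax = 8000;
        double maxReduce = 1 - targetMax / (double) RAW_MAX;
        double factor = maxReduce * Math.abs(value) / (double) RAW_MAX;
        return (short) Math.round((1 - factor) * value);
    }

    public static void main(String[] args) {
        short[] inputs = {-10000, -5000, -1, 0, 1, 5000, 10000};
        for (short in : inputs) {
            System.out.printf("%6d -> question: %6d, weighted: %6d%n",
                    in, questionNormalize(in), weightedNormalize(in));
        }
    }
}
```

Under these assumptions the question's mapping sends a sample of 1 to 4500 and a sample of -1 to -4500, so the waveform jumps by about 9000 at every zero crossing. The weighted mapping keeps small samples almost unchanged and only pulls the peaks down to 8000.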

score 5 · Accepted Answer

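"Normalization" of audio is the process of increasing the level of the audio such that the maximum is equal to some given value, usually the maximum possible value. Just today, in another question, someone explained how to do this (see #1): Audio volume normalization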
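As a rough sketch of normalization in this sense, assuming the 16-bit samples have already been read into a short[] (the class name and the targetPeak parameter are made up for illustration):

```java
final class PeakNormalizer {
    /** Scales samples in place so the largest absolute value becomes targetPeak. */
    static void normalize(short[] samples, int targetPeak) {
        // First pass: find the peak. Widen to int so -32768 does not overflow.
        int peak = 0;
        for (short s : samples) {
            peak = Math.max(peak, Math.abs((int) s));
        }
        if (peak == 0) {
            return; // pure silence, nothing to scale
        }

        // Second pass: apply one constant gain to every sample.
        double gain = targetPeak / (double) peak;
        for (int i = 0; i < samples.length; i++) {
            long scaled = Math.round(samples[i] * gain);
            // Clamp to the 16-bit range instead of letting the cast wrap around.
            samples[i] = (short) Math.max(Short.MIN_VALUE, Math.min(Short.MAX_VALUE, scaled));
        }
    }
}
```

Because every sample is multiplied by the same gain, the shape of the waveform is preserved. With a streaming API that delivers audio in buffers, the peak would have to be collected over the whole file in one pass and the gain applied in a second pass.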

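However, you go on to say, "Specifically, where an audio file contains peaks in volume, I'm trying to level it out, so that the quiet sections are louder and the peaks are quieter." This is called "compression" or "limiting" (not to be confused with the type of compression used in encoding MP3s!). You can read more about that here: http://en.wikipedia.org/wiki/Dynamic_range_compression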

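A simple compressor is not particularly hard to implement, but you say your math "is embarrassingly weak", so you might want to find one that's already built. You might be able to find a compressor implemented in http://sox.sourceforge.net/ and convert it from C to Java. The only Java implementation of a compressor that I know of whose source is available (and it's not very good) is in this book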
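To give an idea of what "simple" means here, this is a minimal sketch of a feed-forward compressor with an envelope follower. It is not the SoX code or the implementation from the book; the class name, parameters, and the hard-knee gain curve are my own choices for illustration.

```java
final class SimpleCompressor {
    private final double threshold;    // linear amplitude, e.g. 8000 for 16-bit audio
    private final double ratio;        // e.g. 4.0 for 4:1 compression
    private final double attackCoeff;  // smoothing used while the level is rising
    private final double releaseCoeff; // smoothing used while the level is falling
    private double envelope = 0.0;     // running estimate of the signal level

    SimpleCompressor(double threshold, double ratio,
                     double attackMs, double releaseMs, double sampleRate) {
        this.threshold = threshold;
        this.ratio = ratio;
        this.attackCoeff = Math.exp(-1.0 / (attackMs * 0.001 * sampleRate));
        this.releaseCoeff = Math.exp(-1.0 / (releaseMs * 0.001 * sampleRate));
    }

    short process(short sample) {
        // Track the level with a one-pole filter: fast attack, slower release.
        double level = Math.abs((double) sample);
        double coeff = level > envelope ? attackCoeff : releaseCoeff;
        envelope = coeff * envelope + (1.0 - coeff) * level;

        // Above the threshold the output level grows 1/ratio as fast (in dB).
        double gain = 1.0;
        if (envelope > threshold) {
            double compressed = threshold * Math.pow(envelope / threshold, 1.0 / ratio);
            gain = compressed / envelope;
        }

        long out = Math.round(sample * gain);
        return (short) Math.max(Short.MIN_VALUE, Math.min(Short.MAX_VALUE, out));
    }
}
```

Because the gain follows a smoothed envelope rather than the individual sample value, it changes slowly compared with the waveform, which is what avoids the harsh distortion of a per-sample mapping. This sketch only turns the loud parts down; to make the quiet parts louder you would apply a constant make-up gain (or a normalization) afterwards.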

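As an alternative way to solve your problem, you could normalize your file in segments of 1/2 second each, and then connect the gain values you used for each segment using linear interpolation. You can read about linear interpolation for audio here: http://blog.bjornroche.com/2010/10/linear-interpolation-for-audio-in-cc.html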
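A rough sketch of that idea, assuming mono 16-bit samples in a short[]. The class name, the maxGain cap (to avoid blowing up near-silent segments), and the choice to pin each segment's gain to the segment centre are my own assumptions, not part of any library.

```java
final class ChunkedNormalizer {
    /**
     * Normalizes each chunk towards targetPeak, then applies gains that are
     * linearly interpolated between the centres of neighbouring chunks.
     */
    static void normalize(short[] samples, int chunkSize, int targetPeak, double maxGain) {
        if (samples.length == 0 || chunkSize <= 0) {
            return;
        }

        // Pass 1: one gain per chunk, based on that chunk's peak.
        int chunks = (samples.length + chunkSize - 1) / chunkSize;
        double[] gains = new double[chunks];
        for (int c = 0; c < chunks; c++) {
            int start = c * chunkSize;
            int end = Math.min(start + chunkSize, samples.length);
            int peak = 0;
            for (int i = start; i < end; i++) {
                peak = Math.max(peak, Math.abs((int) samples[i]));
            }
            gains[c] = peak == 0 ? 1.0 : Math.min(maxGain, targetPeak / (double) peak);
        }

        // Pass 2: interpolate the gain between chunk centres and apply it.
        for (int i = 0; i < samples.length; i++) {
            // Position in units of chunks, measured from the centre of chunk 0.
            double pos = (i - chunkSize / 2.0) / chunkSize;
            int left = (int) Math.floor(pos);
            double frac = pos - left;
            // Before the first centre and after the last one the gain is held constant.
            int lo = Math.max(0, Math.min(chunks - 1, left));
            int hi = Math.max(0, Math.min(chunks - 1, left + 1));
            double gain = gains[lo] + (gains[hi] - gains[lo]) * frac;

            long scaled = Math.round(samples[i] * gain);
            samples[i] = (short) Math.max(Short.MIN_VALUE, Math.min(Short.MAX_VALUE, scaled));
        }
    }
}
```

For half-second segments of mono audio, chunkSize would be the sample rate divided by 2. Note that near a boundary between a loud and a quiet segment the interpolated gain can be higher than the loud segment's own gain, so samples there may overshoot targetPeak; keeping targetPeak well below 32767 leaves headroom for that.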

I don't know whether the source code for the Levelator is available, but that's something else you can try.
