audio - Normalising FFT data (FFTW)

Question

Using FFTW I have been computing the FFT of normalized .wav file data. I am a bit confused as to how I should normalise the FFT output, however. I have been using the method which seemed obvious to me, which is simply to divide by the highest FFT magnitude. I have also seen scaling by 1/N and division by N/2 recommended (where I assume N = FFT size). How do these work as normalisation factors? There doesn't seem to me to be an intuitive relation between these factors and the actual data, so what am I missing?

Huge thanks in advance for any help on this.

score 5 · Accepted Answer

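Surprisingly, there is no single agreed definition of the FFT and IFFT, at least as far as scaling is concerned. For most implementations (including FFTW), you need to scale by 1/N in the forward direction, and no scaling is needed in the reverse direction.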
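To see where 1/N and N/2 come from, here is a minimal sketch in plain Python. It uses a naive, unnormalised DFT as a stand-in for FFTW's forward transform (FFTW's transforms are likewise unnormalised); the signal, `amplitude`, `dc` and `bin_index` are made-up test values, not anything from the question. A cosine of amplitude A that sits exactly on a bin produces a raw magnitude of A·N/2 in that bin, and a constant offset produces N times the offset in bin 0.

```python
import cmath
import math

def dft(x):
    """Naive unnormalised forward DFT (no 1/N factor applied)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]

def idft(X):
    """Naive unnormalised inverse DFT (no 1/N factor applied)."""
    n = len(X)
    return [sum(X[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n))
            for t in range(n)]

N = 64
amplitude = 0.5   # hypothetical tone amplitude
dc = 0.25         # hypothetical constant offset
bin_index = 5     # tone frequency chosen to fall exactly on bin 5

x = [dc + amplitude * math.cos(2 * math.pi * bin_index * t / N) for t in range(N)]
X = dft(x)

raw_peak = abs(X[bin_index])        # equals amplitude * N / 2
amp_estimate = raw_peak / (N / 2)   # dividing by N/2 recovers the amplitude
dc_estimate = abs(X[0]) / N         # dividing by N recovers the offset

# Forward followed by inverse returns N times the input, hence the 1/N.
round_trip = [v.real / N for v in idft(X)]

print(raw_peak, amp_estimate, dc_estimate)
```

The N/2 factor applies to real input because the tone's energy is split between bin k and its mirror bin N-k. Bin 0 has no mirror, so it takes the plain 1/N.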

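Usually (for performance reasons) you will want to lump this scaling factor in with any other corrections, such as your A/D gain, window gain correction factor, etc., so that you have just one combined scale factor to apply to your FFT output bins. Alternatively, if you are just generating a power spectrum in dB, you can make the correction a single dB value that you subtract from your power spectrum bins.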
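As a rough illustration of folding the corrections together, the sketch below combines the N/2 factor, a Hann window's coherent gain, and a placeholder `adc_gain` into one number, then expresses the same correction as a single dB offset. The naive DFT again stands in for FFTW, and all names and values are assumptions chosen for the example.

```python
import cmath
import math

def dft(x):
    """Naive unnormalised forward DFT (no 1/N factor applied)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]

N = 64
amplitude = 0.5   # hypothetical tone amplitude
bin_index = 8     # tone frequency chosen to fall exactly on bin 8
adc_gain = 1.0    # placeholder for a real A/D gain factor

# Periodic Hann window; its coherent gain (mean value) is 0.5.
window = [0.5 - 0.5 * math.cos(2 * math.pi * t / N) for t in range(N)]
coherent_gain = sum(window) / N

x = [amplitude * math.cos(2 * math.pi * bin_index * t / N) * window[t]
     for t in range(N)]
X = dft(x)
peak_raw = abs(X[bin_index])

# One combined divisor instead of several separate corrections.
combined = (N / 2) * coherent_gain * adc_gain
peak_linear = peak_raw / combined

# The same correction as a single dB value subtracted from each bin.
correction_db = 20 * math.log10(combined)
peak_db = 20 * math.log10(peak_raw) - correction_db

print(peak_linear, peak_db)
```

With these values the corrected peak comes out at the original amplitude of 0.5, or about -6.02 dB, whichever form of the correction is used.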

score 4 · Accepted Answer

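With FFTs it is often useful to refer to Parseval's theorem and to make other comparisons that require meaningful magnitudes. Also, the height of any single peak is not very useful on its own, and it depends on, for example, the window used when calculating the FFT, since windowing can shorten and broaden the peaks. For these reasons, I would recommend against normalising by the largest peak: you lose any simple connection to meaningful magnitudes, as well as easy comparison between data sets.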
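A small sketch of both points, using a naive unnormalised DFT in place of FFTW and an invented two-tone test signal. With the unnormalised convention, Parseval's theorem reads sum |x[n]|^2 = (1/N) · sum |X[k]|^2. The second half shows that dividing by the largest peak makes a signal and a copy twice as loud indistinguishable.

```python
import cmath
import math

def dft(x):
    """Naive unnormalised forward DFT (no 1/N factor applied)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]

def peak_normalised(X):
    """Magnitude spectrum divided by its largest value."""
    peak = max(abs(v) for v in X)
    return [abs(v) / peak for v in X]

N = 32
# Hypothetical test signal: two tones that fall exactly on bins 3 and 7.
x = [math.sin(2 * math.pi * 3 * t / N) + 0.3 * math.cos(2 * math.pi * 7 * t / N)
     for t in range(N)]
X = dft(x)

# Parseval: energy in time equals energy in frequency divided by N.
time_energy = sum(v * v for v in x)
freq_energy = sum(abs(v) ** 2 for v in X) / N

# Peak normalisation discards absolute level: x and 2*x look identical.
X_loud = dft([2 * v for v in x])
same_after_peak_norm = all(
    abs(a - b) < 1e-9
    for a, b in zip(peak_normalised(X), peak_normalised(X_loud))
)

print(time_energy, freq_energy, same_after_peak_norm)
```

A fixed 1/N (or N/2) scaling keeps the factor-of-two difference between the two signals visible, which is what makes comparisons between data sets possible.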
