This is actually more of a theoretical question, but here it goes:

I'm developing an effect audio unit and it needs an equal power crossfade between dry and wet signals.

But I'm confused about the right way to do the mapping function from the linear fader to the scaling factor (gain) for the signal amplitudes of dry and wet streams.

Basically, I've seen it done with cos / sin functions or square roots, essentially approximating logarithmic curves. But if our perception of amplitude is logarithmic to start with, shouldn't these curves mapping the fader position to an amplitude actually be exponential?

This is what I mean:

Assumptions:

  • signal[i] means the ith sample in a signal.
  • each sample is a float ranging [-1, 1] for amplitudes between [0,1].
  • our GUI control is an NSSlider ranging from [0,1], so it is in principle linear.
  • fader is a variable with the value of the NSSlider.

First observation: we perceive amplitude in a logarithmic way. So if we have a linear fader and merely adjust a signal's amplitude by doing signal[i] * fader, what we perceive (hear, regardless of the math) is something along the lines of:

[Figure: perceived loudness versus slider position with a linear gain]

This is the so-called crappy fader effect: we go from silence to a drastic volume increase across the leftmost segment of the slider, and past the middle the volume doesn't seem to get much louder.

So to do the fader "right", we instead either express it in a dB scale and then, as far as the signal is concerned, do: signal[i] * 10^(fader/20), or, if we were to keep our fader units in [0,1], we can do: signal[i] * (.001*10^(3*fader))

Either way, our new mapping from the NSSlider to the fader variable which we'll use for multiplying in our code, looks like this now:

[Figure: exponential mapping from slider position to gain]

Which is what we actually want: since we perceive amplitude logarithmically, we are essentially mapping from linear (NSSlider range 0-1) to exponential and feeding this exponential output to our logarithmic perception. And since log(10^x) = x, we end up perceiving the amplitude change in a linear (aka correct) way.

Great.

Now, my thought is that an equal-power crossfade between two signals (in this case a dry / wet horizontal NSSlider to mix together the input to the AU and the processed output from it) is essentially the same, only with one slider acting on both hypothetical signals dry[i] and wet[i].

So if my slider ranges from 0 to 100 (dry is full-left and wet is full-right), I'd end up with code along the lines of:

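// wetSample and drySample are assumed to be properly initialized.
Float32 outputSample, wetSample, drySample;
// The slider runs 0..100 (full dry .. full wet); normalize it to 0..1.
Float32 mixLevel = 0.01f * GetParameter(kParameterTypeMixLevel);
// Exponential gains spanning 60 dB: 0.001 (-60 dB) up to 1.0 (0 dB).
Float32 wetPowerLevel = 0.001f * powf(10.0f, 3.0f * mixLevel);
// The dry gain mirrors the wet gain: 1.0 at full dry, 0.001 at full wet.
Float32 dryPowerLevel = 0.001f * powf(10.0f, 3.0f * (1.0f - mixLevel));
outputSample = (wetSample * wetPowerLevel) + (drySample * dryPowerLevel);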

The graph of which would be:

[Figure: wet and dry gain curves of the exponential crossfade]

And same as before, because we perceive amplitude logarithmically, this exponential mapping should actually make it where we hear the crossfade as linear.

However, I've seen implementations of the crossfade using approximations to log curves. Meaning, instead:

[Figure: crossfade gain curves that approximate logarithmic curves]

But wouldn't these curves actually emphasize our logarithmic perception of amplitude?

1 Answer

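The "equal power" crossfade you're thinking about has to do with keeping the total output power of your mix constant as you fade from wet to dry. Keeping total power constant serves as a reasonable approximation to keeping total perceived loudness constant (which in reality can be fairly complicated).

If you are crossfading between two uncorrelated signals of equal power, you can maintain a constant output power during the crossfade by using any two functions whose squared values sum to 1. A common example of this is the set of functions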

g1(k) = ( 0.5 + 0.5*cos(pi*k) )^.5

g2(k) = ( 0.5 - 0.5*cos(pi*k) )^.5,

where 0 <= k <= 1 (note that g1(k)^2 + g2(k)^2 = 1 is satisfied, as required). Here is a proof that this results in a constant-power crossfade for uncorrelated signals:

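Say we have two signals x1(t) and x2(t) with equal powers E[ x1(t)^2 ] = E[ x2(t)^2 ] = Px, which are also uncorrelated ( E[ x1(t)*x2(t) ] = 0 ). Note that any set of gain functions satisfying the previous condition will have g2(k) = (1 - g1(k)^2)^.5. Now, forming the sum y(t) = g1(k)*x1(t) + g2(k)*x2(t), we have: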

E[ y(t)^2 ] = E[ (g1(k) * x1(t))^2  +  2*g1(k)*(1 - g1(k)^2)^.5 * x1(t) * x2(t)  +  (1 - g1(k)^2) * x2(t)^2 ] 
= g1(k)^2 * E[ x1(t)^2 ] + 2*g1(k)*(1 - g1(k)^2)^.5 * E[ x1(t)*x2(t) ] + (1 - g1(k)^2) * E[ x2(t)^2 ]
= g1(k)^2 * Px + 0 + (1 - g1(k)^2) * Px = Px,

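where we have used that g1(k) and g2(k) are deterministic and can thus be pulled outside the expectation operator E[ ], and that E[ x1(t)*x2(t) ] = 0 by definition, because x1(t) and x2(t) are assumed to be uncorrelated. This means that no matter where we are in the crossfade (whatever k we choose), our output will still have the same power, Px, and thus hopefully the same perceived loudness.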

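Note that for completely correlated signals, you can achieve constant output power by doing a "linear" fade, using any two functions that sum to one ( g1(k) + g2(k) = 1 ). When mixing signals that are somewhat correlated, gain functions in between those two would theoretically be appropriate.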
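To see where the two cases come from, the expansion in the proof can be repeated without assuming the cross term is zero. The symbol rho below is notation introduced here for illustration (the normalized correlation between the two signals); it assumes zero-mean signals of equal power Px.

```latex
\begin{aligned}
E[y(t)^2] &= g_1(k)^2 P_x + 2\, g_1(k)\, g_2(k)\, E[x_1(t)\, x_2(t)] + g_2(k)^2 P_x \\
          &= P_x \left( g_1(k)^2 + g_2(k)^2 + 2 \rho\, g_1(k)\, g_2(k) \right),
\qquad \rho = \frac{E[x_1(t)\, x_2(t)]}{P_x}.
\end{aligned}
```

With rho = 0 the output power equals Px exactly when g1(k)^2 + g2(k)^2 = 1, and with rho = 1 the bracket becomes (g1(k) + g2(k))^2, so the condition is g1(k) + g2(k) = 1.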

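What you're thinking of when you say

"And same as before, because we perceive amplitude logarithmically, this exponential mapping should actually make it where we hear the crossfade as linear."

is that, when applying your derived crossfade, one signal should perceptually decrease in loudness as a linear function of slider position (k), while the other signal should perceptually increase in loudness as a linear function of slider position. While your derivation of that seems spot on, unfortunately it may not be the best way to blend your dry and wet signals in terms of consistency. Often, maintaining the same output loudness regardless of slider position is the better thing to aim for. In any case, it might be worth trying a few different functions to see which is most usable and consistent.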

answered 2012-07-01T06:42:42.583