A quick-and-dirty audio sample mixing technique to avoid clipping

本文探讨了在iOS设备上正确混合两个音频数字信号的方法,避免了因简单叠加导致的音量溢出问题。通过介绍一种特别的混合算法,使得混合后的音频既能保持原有音量水平又能在有限的动态范围内表现良好。

http://atastypixel.com/blog/how-to-mix-audio-samples-properly-on-ios/

两个音频数字信号的混合

In the real world, when you hear two sounds at once, what you’re hearing is the combination (in the “+” sense) of the two noises. If you put five hundred drummers in the same room and, avoiding the obvious drummer jokes for now, told them all to play, you’d get drummer 1 + drummer 2 + … + drummer 500 (also bleeding ears).

With digital audio though, the volume doesn’t go up to oh-god-please-make-them-stop – it’s limited to a small dynamic range.

Naïve mixing, with overflow

So, digital mixing actually requires a little thought in order to avoid overflowing these bounds and clipping. I recently came across this when writing some mixing routines for my upcoming app Loopy 2, and found a very useful discussion on mixing digital audio by software developer and author Viktor Toth.

The basic concept is to mix in such a way that we stay within the dynamic range of the target audio format, while representing the dynamics of the mixed signals as faithfully as possible.Note that a simple average of the samples (as in, (sample 1 + sample 2) / 2) won’t accomplish this – for example, if sample 1 is silent, whilesample 2 is happily jamming away, sample 2 will be halved in volume.

Instead, we want to meet three goals – assuming signed audio samples, the standard format for Remote IO/audio units on the iPhone/iPad, which can range from negative, through to zero (silence), up to positive values.

  1. If both samples are positive, we mix them so that the output value is somewhere between the maximum value of the two samples, and the maximum possible value
  2. If both samples are negative, we mix them so that the output value is somewhere between the minimum value of the two samples, and the minimum possible value
  3. If one sample is positive, and one is negative, we want them to cancel out somewhat

If we’re talking about signed samples, MIN…0…MAX, this does the trick:

Mixing equation

This lets the volume level for both samples remain the same, while fitting within the available range.

Improved mixing

Here’s how it’s done on iOS:

SInt16 *bufferA, SInt16 *bufferB;
NSInteger bufferLength;
SInt16 *outputBuffer;
 
for ( NSInteger i=0; i<bufferLength; i++ ) {
  if ( bufferA[i] < 0 && bufferB[i] < 0 ) {
    // If both samples are negative, mixed signal must have an amplitude between 
    // the lesser of A and B, and the minimum permissible negative amplitude
    outputBuffer[i] = (bufferA[i] + bufferB[i]) - ((bufferA[i] * bufferB[i])/INT16_MIN);
  } else if ( bufferA[i] > 0 && bufferB[i] > 0 ) {
    // If both samples are positive, mixed signal must have an amplitude between the greater of
    // A and B, and the maximum permissible positive amplitude
    outputBuffer[i] = (bufferA[i] + bufferB[i]) - ((bufferA[i] * bufferB[i])/INT16_MAX);
  } else {
    // If samples are on opposite sides of the 0-crossing, mixed signal should reflect 
    // that samples cancel each other out somewhat
    outputBuffer[i] = bufferA[i] + bufferB[i];
  }
}


Update: A reader recently demonstrated that this technique can introduce some unpleasant distortion with certain kinds of input — as the algorithm is nonlinear, some distortion is inevitable (see the sharp points on the waveform where the condition switches over). For the kind of audio I’m mixing, the results seem to be perfectly adequate, but this may not be generally true.

Update 2: Here’s an inline function I put together for neatness:

inline SInt16 TPMixSamples(SInt16 a, SInt16 b) {
    return  
            // If both samples are negative, mixed signal must have an amplitude between the lesser of A and B, and the minimum permissible negative amplitude
            a < 0 && b < 0 ?
                ((int)a + (int)b) - (((int)a * (int)b)/INT16_MIN) :
 
            // If both samples are positive, mixed signal must have an amplitude between the greater of A and B, and the maximum permissible positive amplitude
            ( a > 0 && b > 0 ?
                ((int)a + (int)b) - (((int)a * (int)b)/INT16_MAX)
 
            // If samples are on opposite sides of the 0-crossing, mixed signal should reflect that samples cancel each other out somewhat
            :
                a + b);
}

but someone say this is wrong

Sigh
Posted June 12, 2013 at 3:28 pm  |  Permalink

This is so terribly wrong. Please don’t mislead newbies into thinking that this is the correct way to mix two channels. The correct way is to simply sum/average them together, as you dismissed early in the article.

Summing/averaging is exactly what every professional analog or digital mixing console does, because it’s exactly what happens in the air and in our ears and in our brains. Yes, it can change the crest factor of the signal, but that’s ok because digital audio is designed to have lots of headroom for the peaks above the normal signal level that you listen at. You’re not generating audio at 0 dBFS are you? Surely you know better than that. :D

If you want to participate in the Loudness War and harshly reduce the dynamic range of your mix til everything is at 11 all the time, use a locally-linear limiter, not this nonlinear distortion stuff.


AudioCocoaiPhone. Bookmark the  permalink. Both comments and trackbacks are currently closed.










评论 1
成就一亿技术人!
拼手气红包6.0元
还能输入1000个字符
 
红包 添加红包
表情包 插入表情
 条评论被折叠 查看
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值