A quick-and-dirty audio sample mixing technique to avoid clipping

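This article looks at how to correctly mix two digital audio signals on iOS devices without the overflow that plain addition causes. It introduces a particular mixing algorithm that keeps the mixed audio at its original volume level while still fitting within a limited dynamic range.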

http://atastypixel.com/blog/how-to-mix-audio-samples-properly-on-ios/

Mixing two digital audio signals

In the real world, when you hear two sounds at once, what you’re hearing is the combination (in the “+” sense) of the two noises. If you put five hundred drummers in the same room and, avoiding the obvious drummer jokes for now, told them all to play, you’d get drummer 1 + drummer 2 + … + drummer 500 (also bleeding ears).

With digital audio though, the volume doesn’t go up to oh-god-please-make-them-stop – it’s limited to a small dynamic range.

Naïve mixing, with overflow

So, digital mixing actually requires a little thought in order to avoid overflowing these bounds and clipping. I recently came across this when writing some mixing routines for my upcoming app Loopy 2, and found a very useful discussion on mixing digital audio by software developer and author Viktor Toth.

The basic concept is to mix in such a way that we stay within the dynamic range of the target audio format, while representing the dynamics of the mixed signals as faithfully as possible. Note that a simple average of the samples (as in, (sample 1 + sample 2) / 2) won’t accomplish this – for example, if sample 1 is silent, while sample 2 is happily jamming away, sample 2 will be halved in volume.
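To make the two failure modes concrete, here is a rough sketch in plain C. The function names are made up for illustration, and int16_t stands in for SInt16 (an assumption, so the snippet compiles outside of iOS).

```c
#include <stdint.h>

/* Sum in a wider type so we can inspect the result before it is
   squeezed back into 16 bits. */
int32_t sum_wide(int16_t a, int16_t b) {
    return (int32_t)a + (int32_t)b;
}

/* Returns 1 if the plain sum falls outside the int16_t range,
   i.e. storing it in a 16-bit sample would overflow. */
int sum_overflows(int16_t a, int16_t b) {
    int32_t s = sum_wide(a, b);
    return s > INT16_MAX || s < INT16_MIN;
}

/* Simple average: always fits in 16 bits, but a signal mixed with
   silence comes out at half its original level. */
int16_t mix_average(int16_t a, int16_t b) {
    return (int16_t)(sum_wide(a, b) / 2);
}
```

With these, sum_overflows(20000, 20000) reports an overflow, while mix_average(0, 10000) returns 5000: the lone signal has been halved.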

Instead, we want to meet three goals – assuming signed audio samples, the standard format for Remote IO/audio units on the iPhone/iPad, which can range from negative, through to zero (silence), up to positive values.

If both samples are positive, we mix them so that the output value is somewhere between the maximum value of the two samples, and the maximum possible value
If both samples are negative, we mix them so that the output value is somewhere between the minimum value of the two samples, and the minimum possible value
If one sample is positive, and one is negative, we want them to cancel out somewhat

If we’re talking about signed samples, MIN…0…MAX, this does the trick:

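$$
\mathrm{mix}(a, b) =
\begin{cases}
a + b - \dfrac{a \cdot b}{\mathrm{MIN}} & \text{if } a < 0 \text{ and } b < 0 \\[6pt]
a + b - \dfrac{a \cdot b}{\mathrm{MAX}} & \text{if } a > 0 \text{ and } b > 0 \\[6pt]
a + b & \text{otherwise}
\end{cases}
$$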

This lets the volume level for both samples remain the same, while fitting within the available range.

Improved mixing

Here’s how it’s done on iOS:

SInt16 *bufferA, *bufferB;
NSInteger bufferLength;
SInt16 *outputBuffer;
 
for ( NSInteger i=0; i<bufferLength; i++ ) {
  if ( bufferA[i] < 0 && bufferB[i] < 0 ) {
    // If both samples are negative, mixed signal must have an amplitude between 
    // the lesser of A and B, and the minimum permissible negative amplitude
    outputBuffer[i] = (bufferA[i] + bufferB[i]) - ((bufferA[i] * bufferB[i])/INT16_MIN);
  } else if ( bufferA[i] > 0 && bufferB[i] > 0 ) {
    // If both samples are positive, mixed signal must have an amplitude between the greater of
    // A and B, and the maximum permissible positive amplitude
    outputBuffer[i] = (bufferA[i] + bufferB[i]) - ((bufferA[i] * bufferB[i])/INT16_MAX);
  } else {
    // If samples are on opposite sides of the 0-crossing, mixed signal should reflect 
    // that samples cancel each other out somewhat
    outputBuffer[i] = bufferA[i] + bufferB[i];
  }
}
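As a quick sanity check of the loop above, the same arithmetic can be wrapped in a single-sample helper and tried on a few values. This is only a sketch: mix_sample is a hypothetical name, and int16_t is used in place of SInt16 so it builds with any C compiler.

```c
#include <stdint.h>

/* Single-sample version of the loop body. Intermediates are kept in
   int32_t so the product cannot overflow before the division. */
int16_t mix_sample(int16_t a, int16_t b) {
    int32_t x = a, y = b;
    if (x < 0 && y < 0) {
        /* Both negative: result lies between min(a, b) and INT16_MIN. */
        return (int16_t)(x + y - (x * y) / INT16_MIN);
    }
    if (x > 0 && y > 0) {
        /* Both positive: result lies between max(a, b) and INT16_MAX. */
        return (int16_t)(x + y - (x * y) / INT16_MAX);
    }
    /* Opposite signs (or a zero sample): a plain sum cannot overflow. */
    return (int16_t)(x + y);
}
```

For example, mix_sample(20000, 20000) gives 27793, which sits between 20000 and INT16_MAX; mix_sample(-20000, -20000) gives -27793; mix_sample(20000, -15000) gives 5000; and mixing with silence, mix_sample(0, 12345), leaves the signal untouched at 12345.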

Update: A reader recently demonstrated that this technique can introduce some unpleasant distortion with certain kinds of input — as the algorithm is nonlinear, some distortion is inevitable (see the sharp points on the waveform where the condition switches over). For the kind of audio I’m mixing, the results seem to be perfectly adequate, but this may not be generally true.
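The sharp points can be seen numerically, too. The sketch below (hypothetical helper names, int16_t standing in for SInt16) holds one sample fixed and measures how far the output moves when the other sample rises by the same step just below and just above zero.

```c
#include <stdint.h>

/* Same piecewise formula as the article's code, with int32_t
   intermediates. */
int16_t mix_piecewise(int16_t a, int16_t b) {
    int32_t x = a, y = b;
    if (x < 0 && y < 0) return (int16_t)(x + y - (x * y) / INT16_MIN);
    if (x > 0 && y > 0) return (int16_t)(x + y - (x * y) / INT16_MAX);
    return (int16_t)(x + y);
}

/* How much the output changes when sample a rises by `step`
   while sample b is held fixed. */
int32_t output_step(int16_t a, int16_t step, int16_t b) {
    int32_t before = mix_piecewise(a, b);
    int32_t after = mix_piecewise((int16_t)(a + step), b);
    return after - before;
}
```

With b held at 16384, output_step(-1000, 1000, 16384) is 1000 but output_step(0, 1000, 16384) is only 500: the slope halves abruptly as a crosses zero, which is the kind of kink that shows up as distortion.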

Update 2: Here’s an inline function I put together for neatness:

static inline SInt16 TPMixSamples(SInt16 a, SInt16 b) {
    return  
            // If both samples are negative, mixed signal must have an amplitude between the lesser of A and B, and the minimum permissible negative amplitude
            a < 0 && b < 0 ?
                ((int)a + (int)b) - (((int)a * (int)b)/INT16_MIN) :
 
            // If both samples are positive, mixed signal must have an amplitude between the greater of A and B, and the maximum permissible positive amplitude
            ( a > 0 && b > 0 ?
                ((int)a + (int)b) - (((int)a * (int)b)/INT16_MAX)
 
            // If samples are on opposite sides of the 0-crossing, mixed signal should reflect that samples cancel each other out somewhat
            :
                a + b);
}

However, one commenter on the original post argues that this approach is wrong:

Sigh

Posted June 12, 2013 at 3:28 pm

This is so terribly wrong. Please don’t mislead newbies into thinking that this is the correct way to mix two channels. The correct way is to simply sum/average them together, as you dismissed early in the article.

Summing/averaging is exactly what every professional analog or digital mixing console does, because it’s exactly what happens in the air and in our ears and in our brains. Yes, it can change the crest factor of the signal, but that’s ok because digital audio is designed to have lots of headroom for the peaks above the normal signal level that you listen at. You’re not generating audio at 0 dBFS are you? Surely you know better than that. :D

If you want to participate in the Loudness War and harshly reduce the dynamic range of your mix til everything is at 11 all the time, use a locally-linear limiter, not this nonlinear distortion stuff.
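For comparison, a minimal sketch of the plain-sum approach the commenter describes might look like the following. The final clamp is an assumption added here as a safety net for the rare sample that still exceeds the range; it is not something the commenter prescribes, and it is a hard clip rather than the limiter they mention. The function name is hypothetical, and int16_t stands in for SInt16.

```c
#include <stdint.h>

/* Linear mix: sum in a wider type, then saturate to the 16-bit range.
   If the sources leave headroom, the clamp is never triggered and the
   mix stays perfectly linear. */
int16_t mix_sum_clamped(int16_t a, int16_t b) {
    int32_t s = (int32_t)a + (int32_t)b;
    if (s > INT16_MAX) return INT16_MAX;  /* saturate instead of wrapping */
    if (s < INT16_MIN) return INT16_MIN;
    return (int16_t)s;
}
```

Here mix_sum_clamped(1000, 2000) is simply 3000, while two near-full-scale samples such as mix_sum_clamped(30000, 30000) saturate at 32767, which is why this approach depends on the inputs being recorded or generated with headroom.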