Mathematical Principles of the SOLA Algorithm
1. Basic Mathematical Model
Consider two audio segments:
- History buffer: $h[n],\ n = 0, 1, \ldots, L_h - 1$
- Current frame: $x[n],\ n = 0, 1, \ldots, L_x - 1$
The overlap region has length $L_{overlap}$.
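2. Computing the Cross-Correlation
The best overlap position is found with the cross-correlation between the tail of the history buffer and the current frame:
$$R[k] = \sum_{n=0}^{L_{overlap}-1} h[L_h - L_{overlap} + n] \cdot x[k + n]$$
where $k$ is the search offset, $k = 0, 1, \ldots, L_{search} - 1$. The sum reads $x$ up to index $L_{search} + L_{overlap} - 2$, so the current frame must satisfy $L_x \ge L_{search} + L_{overlap} - 1$.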
3. Normalized Cross-Correlation (removes the influence of amplitude)
$$\hat{R}[k] = \frac{R[k]}{\sqrt{\sum_{n=0}^{L_{overlap}-1} h^2[L_h - L_{overlap} + n] \cdot \sum_{n=0}^{L_{overlap}-1} x^2[k + n]}}$$
4. Best Offset
$$k_{opt} = \arg\max_k \hat{R}[k]$$
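The offset search can be sketched directly from the definitions of $R[k]$, $\hat{R}[k]$ and $k_{opt}$. This is a minimal, unoptimized illustration using only the Python standard library; the function name `find_best_offset` and the `eps` guard against a zero denominator (silent segments) are assumptions added here, not part of the formulas.

```python
import math

def find_best_offset(h, x, l_overlap, l_search, eps=1e-12):
    """Return (k_opt, scores), where scores[k] is the normalized
    cross-correlation between the tail of h and x[k:k + l_overlap]."""
    if len(h) < l_overlap:
        raise ValueError("history buffer is shorter than the overlap")
    if len(x) < l_search + l_overlap - 1:
        raise ValueError("current frame is too short for the search range")

    tail = h[len(h) - l_overlap:]              # h[L_h - L_overlap + n]
    tail_energy = sum(v * v for v in tail)     # first sum in the denominator

    scores = []
    for k in range(l_search):
        seg = x[k:k + l_overlap]               # x[k + n]
        r = sum(a * b for a, b in zip(tail, seg))       # R[k]
        seg_energy = sum(v * v for v in seg)            # second sum
        denom = math.sqrt(tail_energy * seg_energy)
        # Assumption: treat an all-zero segment as "no correlation".
        scores.append(r / denom if denom > eps else 0.0)

    # argmax over k; ties resolve to the smallest offset.
    k_opt = max(range(l_search), key=scores.__getitem__)
    return k_opt, scores
```

Because the score is normalized, a segment of the current frame that is a scaled copy of the history tail still reaches a score of 1, which is the point of dividing by the two energies.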
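5. Crossfade
The crossfade windows should be amplitude-complementary, so that the total gain stays constant across the overlap:
$$w_{out}[n] + w_{in}[n] = 1, \quad n = 0, 1, \ldots, L_{overlap}-1$$
Here $w_{out}$ fades the history buffer out (from 1 to 0) and $w_{in}$ fades the current frame in (from 0 to 1). This is a constant-gain condition rather than energy conservation; a constant-power crossfade satisfies $w_{out}^2[n] + w_{in}^2[n] = 1$ instead. Both window families below require $L_{overlap} \ge 2$.
Common window functions:
5.1 Hann window
$$w_{out}[n] = 0.5 \times \left(1 + \cos\left(\frac{\pi n}{L_{overlap}-1}\right)\right)$$
$$w_{in}[n] = 0.5 \times \left(1 - \cos\left(\frac{\pi n}{L_{overlap}-1}\right)\right)$$
5.2 Sine window
$$w_{out}[n] = \sin\left(\frac{\pi}{2} \times \frac{L_{overlap}-1-n}{L_{overlap}-1}\right)$$
$$w_{in}[n] = \sin\left(\frac{\pi}{2} \times \frac{n}{L_{overlap}-1}\right)$$
The sine pair satisfies $w_{out}^2[n] + w_{in}^2[n] = 1$, not the sum-to-one condition, so it is a constant-power crossfade. Since SOLA aligns the two segments to be highly correlated before mixing, the Hann pair is usually the more natural choice; the sine pair can raise the level in the middle of the overlap.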
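The two window pairs can be generated and compared with a few lines of Python. This is a sketch under the definitions above, with the fade-out window decaying from 1 to 0; the function names `hann_crossfade` and `sine_crossfade` are hypothetical.

```python
import math

def hann_crossfade(l_overlap):
    """Raised-cosine pair: w_out[n] + w_in[n] == 1 for every n."""
    if l_overlap < 2:
        raise ValueError("l_overlap must be at least 2")
    d = l_overlap - 1
    w_out = [0.5 * (1.0 + math.cos(math.pi * n / d)) for n in range(l_overlap)]
    w_in = [0.5 * (1.0 - math.cos(math.pi * n / d)) for n in range(l_overlap)]
    return w_out, w_in

def sine_crossfade(l_overlap):
    """Sine pair: w_out[n]**2 + w_in[n]**2 == 1 for every n."""
    if l_overlap < 2:
        raise ValueError("l_overlap must be at least 2")
    d = l_overlap - 1
    w_out = [math.sin(0.5 * math.pi * (d - n) / d) for n in range(l_overlap)]
    w_in = [math.sin(0.5 * math.pi * n / d) for n in range(l_overlap)]
    return w_out, w_in
```

Summing the two windows sample by sample shows the difference: the Hann pair stays at 1 throughout, while the sine pair peaks at about 1.414 in the middle of the overlap.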
6. Output Signal Synthesis
The mixed overlap region:
$$y[n] = h[L_h - L_{overlap} + n] \times w_{out}[n] + x[k_{opt} + n] \times w_{in}[n], \quad n = 0, 1, \ldots, L_{overlap}-1$$
The final output signal is the concatenation of the untouched part of the history buffer, the mixed overlap, and the remainder of the current frame (slices are half-open, end index excluded):
$$y_{total} = \big[\, h[0 : L_h - L_{overlap}],\ y[0 : L_{overlap}],\ x[k_{opt} + L_{overlap} : L_x] \,\big]$$
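The synthesis step maps almost one-to-one onto list slicing. The sketch below assumes the windows are passed in as two equal-length lists, with the fade-out window applied to the history buffer; the name `sola_splice` is hypothetical.

```python
def sola_splice(h, x, k_opt, w_out, w_in):
    """Crossfade the tail of h into x starting at offset k_opt and
    return the concatenated output y_total as a list."""
    l_overlap = len(w_out)
    if len(w_in) != l_overlap:
        raise ValueError("w_out and w_in must have the same length")
    if len(h) < l_overlap or len(x) < k_opt + l_overlap:
        raise ValueError("inputs are too short for the requested overlap")

    start = len(h) - l_overlap                      # L_h - L_overlap
    # y[n] = h[L_h - L_overlap + n] * w_out[n] + x[k_opt + n] * w_in[n]
    y = [h[start + n] * w_out[n] + x[k_opt + n] * w_in[n]
         for n in range(l_overlap)]
    # [h[0:L_h - L_overlap], y[0:L_overlap], x[k_opt + L_overlap:L_x]]
    return list(h[:start]) + y + list(x[k_opt + l_overlap:])
```

The output length is $L_h + L_x - k_{opt} - L_{overlap}$: the first $k_{opt}$ samples of the current frame are discarded and $L_{overlap}$ samples are shared between the two segments.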
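7. Fast Algorithm (FFT Acceleration)
The cross-correlation can be computed faster with the FFT:
$$R[k] = \mathcal{F}^{-1}\left\{\overline{\mathcal{F}\{h_{overlap}\}} \cdot \mathcal{F}\{x_{search}\}\right\}$$
where:
- $\mathcal{F}$ denotes the (discrete) Fourier transform
- $\overline{\,\cdot\,}$ denotes the complex conjugate
- $h_{overlap}$ is the overlap portion of the history buffer, $h_{overlap}[n] = h[L_h - L_{overlap} + n]$
- $x_{search}$ is the search region of the current frame, i.e. its first $L_{search} + L_{overlap} - 1$ samples
For real-valued signals the conjugate belongs on the spectrum of $h_{overlap}$; conjugating the spectrum of $x_{search}$ instead yields the correlation at lag $-k$. Both sequences must be zero-padded to a common length of at least $L_{overlap} + L_{search} - 1$ so that circular wrap-around does not contaminate the lags $k = 0, 1, \ldots, L_{search}-1$.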
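The FFT identity can be checked numerically against the direct sum. The sketch below uses a small recursive radix-2 FFT written in pure Python so that it has no dependencies; this choice is an assumption made for illustration, and an optimized FFT library would normally take its place. The names `fft` and `xcorr_fft` are hypothetical.

```python
import cmath

def fft(a, inverse=False):
    """Recursive radix-2 FFT. len(a) must be a power of two.
    The inverse transform is returned without the 1/N scaling."""
    n = len(a)
    if n == 1:
        return list(a)
    even = fft(a[0::2], inverse)
    odd = fft(a[1::2], inverse)
    sign = 1.0 if inverse else -1.0
    out = [0j] * n
    for i in range(n // 2):
        t = cmath.exp(sign * 2j * cmath.pi * i / n) * odd[i]
        out[i] = even[i] + t
        out[i + n // 2] = even[i] - t
    return out

def xcorr_fft(h_overlap, x_search, l_search):
    """R[k] = sum_n h_overlap[n] * x_search[k + n], k = 0..l_search-1,
    computed as IFFT(conj(FFT(h)) * FFT(x)) for real-valued inputs."""
    l_overlap = len(h_overlap)
    needed = l_overlap + l_search - 1
    if len(x_search) < needed:
        raise ValueError("x_search is too short for the search range")
    size = 1
    while size < needed:          # pad to a power of two >= needed
        size *= 2
    hp = list(h_overlap) + [0.0] * (size - l_overlap)
    xp = list(x_search[:needed]) + [0.0] * (size - needed)
    spec_h = fft(hp)
    spec_x = fft(xp)
    prod = [a.conjugate() * b for a, b in zip(spec_h, spec_x)]
    r = fft(prod, inverse=True)
    return [r[k].real / size for k in range(l_search)]
```

Only the first `l_search` lags are returned; with the padding chosen above, none of them involve samples that wrap around the end of the buffer.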
8. Rules of Thumb for Parameter Selection
Overlap length:
$$L_{overlap} = \alpha \times L_{frame}, \quad \alpha \in [0.2, 0.5]$$
Search range:
$$L_{search} = \beta \times L_{frame}, \quad \beta \in [0.1, 0.3]$$
Frame length (depends on the sample rate $f_s$ in Hz, with $T_{frame}$ in milliseconds):
$$L_{frame} = \frac{f_s \times T_{frame}}{1000}, \quad T_{frame} \in [20, 60]\ \text{ms}$$
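As a worked example of these rules of thumb, the helper below converts a sample rate into sample counts. The default values (40 ms, $\alpha = 0.3$, $\beta = 0.2$) are simply points inside the stated ranges, chosen here for illustration, and rounding to the nearest integer is likewise an assumption; the name `sola_parameters` is hypothetical.

```python
def sola_parameters(sample_rate_hz, frame_ms=40.0, alpha=0.3, beta=0.2):
    """Return (l_frame, l_overlap, l_search) in samples.

    l_frame   = f_s * T_frame / 1000
    l_overlap = alpha * l_frame
    l_search  = beta  * l_frame
    """
    l_frame = int(round(sample_rate_hz * frame_ms / 1000.0))
    l_overlap = int(round(alpha * l_frame))
    l_search = int(round(beta * l_frame))
    return l_frame, l_overlap, l_search

# Example: 16 kHz audio with the illustrative defaults.
l_frame, l_overlap, l_search = sola_parameters(16000)
```

At 16 kHz this gives a frame of 640 samples, an overlap of 192 samples and a search range of 128 samples, so the frame comfortably covers the $L_{search} + L_{overlap} - 1 = 319$ samples read by the offset search.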