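Theorem (training error bound of AdaBoost): the training error of the final classifier produced by the AdaBoost algorithm is bounded as follows: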
$$
\begin{aligned}
\frac{1}{N}\sum_{i=1}^N I(G(x_i) \neq y_i) \leq \frac{1}{N}\sum_{i=1}^N \exp(-y_i f(x_i)) = \prod_{m} Z_m
\end{aligned}
$$
Here $G(x)$, $f(x)$, and $Z_m$ are as defined in 《统计学习方法》 (Statistical Learning Methods).
Proof:
In the statement above, $f(x)=\sum_{m=1}^M \alpha_m G_m(x)$ is the weighted combination of the weak classifiers, and $G(x)=\mathrm{sign}(f(x))$ is the final classifier produced by AdaBoost. $Z_m=\sum_{i=1}^N w_{mi} \exp(-\alpha_m y_i G_m(x_i))$ is the normalization factor that turns the reweighted values into the weight distribution used for the $(m+1)$-th weak classifier.

The weights start at $w_{1i}=\frac{1}{N}$ and are updated by $w_{mi}= \frac{w_{m-1,i}}{Z_{m-1}} \exp(-\alpha_{m-1} y_i G_{m-1}(x_i))$, so $w_{mi}$ is the weight of the $i$-th sample in the data distribution of the $m$-th classifier. $\alpha_m = \frac{1}{2}\log\frac{1-e_m}{e_m}$ is the coefficient of the $m$-th weak classifier, where $e_m = \sum_{i=1}^N w_{mi} I(G_m(x_i) \neq y_i)$ is its classification error rate, that is, the probability of a misclassification under the distribution $w_m$.
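To make these definitions concrete, here is a minimal sketch of a single boosting round on five samples. The labels `y` and the weak classifier outputs `g` are made-up toy values (an assumption for illustration, not data from the text); only the last sample is misclassified.

```python
import numpy as np

# Hypothetical toy data: true labels and one weak classifier's predictions.
y = np.array([1, 1, -1, -1, 1])
g = np.array([1, 1, -1, -1, -1])   # only the last sample is wrong

w = np.full(5, 1.0 / 5)            # w_{1i} = 1/N

# Weighted error rate e_m = sum_i w_{mi} * I(G_m(x_i) != y_i)
e = np.sum(w * (g != y))

# Coefficient alpha_m = 1/2 * log((1 - e_m) / e_m)
alpha = 0.5 * np.log((1 - e) / e)

# Unnormalized new weights and the normalization factor Z_m
unnorm = w * np.exp(-alpha * y * g)
Z = unnorm.sum()

# Updated weights w_{m+1,i}; they form a distribution again
w_next = unnorm / Z

print(e, alpha, Z, w_next)
```

In this toy round $e_m = 0.2$, $\alpha_m = \log 2$, and $Z_m = 0.8$; the misclassified sample ends up with weight $0.5$ and each correctly classified sample with weight $0.125$.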
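Now look at the theorem above: it uses the product of all the normalization factors as an upper bound on the training error.
First, consider a single sample: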
$$
G(x_i) \neq y_i \;\Rightarrow\; y_i f(x_i) \leq 0 \;\Rightarrow\; \exp(-y_i f(x_i)) \geq 1 = I(G(x_i) \neq y_i).
$$
If instead $G(x_i) = y_i$, then $I(G(x_i) \neq y_i) = 0 < \exp(-y_i f(x_i))$. In both cases $I(G(x_i) \neq y_i) \leq \exp(-y_i f(x_i))$.
Averaging over all samples then gives:
$$
\frac{1}{N}\sum_{i=1}^N I(G(x_i) \neq y_i) \leq \frac{1}{N}\sum_{i=1}^N \exp(-y_i f(x_i)).
$$
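The per-sample inequality behind this bound can be spot-checked numerically. The sketch below uses a hypothetical helper, `pointwise_bound_holds`, and assumes the convention $\mathrm{sign}(0)=+1$ (the text does not fix one); it compares the 0-1 loss with the exponential loss on a grid of scores.

```python
import numpy as np

def pointwise_bound_holds(f_values):
    """Check I(sign(f) != y) <= exp(-y * f) for y in {-1, +1} on the given scores."""
    f_values = np.asarray(f_values, dtype=float)
    G = np.where(f_values >= 0, 1, -1)          # assumed convention: sign(0) = +1
    for y in (-1, 1):
        indicator = (G != y).astype(float)      # 0-1 loss
        bound = np.exp(-y * f_values)           # exponential loss
        if not np.all(indicator <= bound):
            return False
    return True

# Grid of scores, including f = 0 where the bound is tight.
ok = pointwise_bound_holds(np.linspace(-3.0, 3.0, 13))
print(ok)
```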
Next we prove the equality on the right-hand side of the theorem:
$$
\begin{aligned}
\frac{1}{N}\sum_{i=1}^N \exp(-y_i f(x_i)) &= \frac{1}{N}\sum_{i=1}^N \exp\Big(-y_i \sum_{m=1}^M \alpha_m G_m(x_i)\Big) \\
& = \frac{1}{N}\sum_{i=1}^N \exp\Big(-\sum_{m=1}^M \alpha_m y_i G_m(x_i)\Big) \\
& = \frac{1}{N}\sum_{i=1}^N \prod_{m=1}^M \exp(-\alpha_m y_i G_m(x_i))
\end{aligned}
$$
From the weight update rule we have $w_{m+1,i}Z_m = w_{mi} \exp(-\alpha_m y_i G_m(x_i))$. In AdaBoost the initial weights are $w_{1i}=\frac{1}{N}$, and $\sum_i w_{mi}=1$ for every $m$. Therefore:
$$
\begin{aligned}
\frac{1}{N}\sum_{i=1}^N \exp(-y_i f(x_i))
& = \sum_{i=1}^N w_{1i} \prod_{m=1}^M \exp(-\alpha_m y_i G_m(x_i))\\
& = Z_1 \sum_{i=1}^N w_{2i} \prod_{m=2}^M \exp(-\alpha_m y_i G_m(x_i))\\
& = Z_1 Z_2 \sum_{i=1}^N w_{3i} \prod_{m=3}^M \exp(-\alpha_m y_i G_m(x_i))\\
& = \cdots\\
& = Z_1 Z_2 \cdots Z_{M-1} \sum_{i=1}^N w_{Mi} \exp(-\alpha_M y_i G_M(x_i))\\
& = \prod_{m=1}^M Z_m
\end{aligned}
$$
The first line uses $w_{1i}=\frac{1}{N}$, each following line applies $w_{mi} \exp(-\alpha_m y_i G_m(x_i)) = Z_m w_{m+1,i}$ once, and the last line is the definition of $Z_M$.
Combining the inequality and the equality, the theorem is proved.
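As a sanity check on the whole statement, the following sketch runs a small AdaBoost with decision stumps and compares the three quantities in the theorem. It is an illustration under stated assumptions, not a reference implementation: the synthetic XOR-like data, the helper names `best_stump` and `adaboost_check`, the clipping of $e_m$ to avoid $\log 0$, and the convention $\mathrm{sign}(0)=+1$ are all choices made here.

```python
import numpy as np

def best_stump(X, y, w):
    """Return (weighted error, predictions) of the best decision stump under weights w."""
    best_err, best_pred = None, None
    for j in range(X.shape[1]):
        for t in np.unique(X[:, j]):
            for s in (1, -1):
                pred = np.where(X[:, j] <= t, s, -s)
                err = np.sum(w[pred != y])
                if best_err is None or err < best_err:
                    best_err, best_pred = err, pred
    return best_err, best_pred

def adaboost_check(X, y, M=10):
    """Run M rounds and return (training error, mean exponential loss, product of Z_m)."""
    N = len(y)
    w = np.full(N, 1.0 / N)                 # w_{1i} = 1/N
    f = np.zeros(N)                         # f(x_i) accumulated over rounds
    Z = []
    for _ in range(M):
        e, g = best_stump(X, y, w)
        e = np.clip(e, 1e-10, 1 - 1e-10)    # added safeguard against log(0)
        alpha = 0.5 * np.log((1 - e) / e)
        unnorm = w * np.exp(-alpha * y * g)
        Zm = unnorm.sum()                   # normalization factor Z_m
        w = unnorm / Zm                     # w_{m+1,i}
        f += alpha * g
        Z.append(Zm)
    G = np.where(f >= 0, 1, -1)             # assumed convention: sign(0) = +1
    train_err = float(np.mean(G != y))
    exp_loss = float(np.mean(np.exp(-y * f)))
    return train_err, exp_loss, float(np.prod(Z))

# Synthetic data that no single stump separates (an assumption for the demo).
rng = np.random.default_rng(0)
X = rng.normal(size=(40, 2))
y = np.where(X[:, 0] * X[:, 1] > 0, 1, -1)

train_err, exp_loss, prod_Z = adaboost_check(X, y, M=10)
print(train_err, exp_loss, prod_Z)
```

The equality between the mean exponential loss and $\prod_m Z_m$ relies only on the weight normalization, so it should hold up to floating-point error for any choice of $\alpha_m$; the particular $\alpha_m$ used by AdaBoost matters for how small each $Z_m$ is, not for the identity itself.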