A Worked AdaBoost Example

This post walks through training a strong classifier with the AdaBoost algorithm, step by step: initializing the weight distribution over the data, selecting the best basic classifier in each round, and updating the weights. A concrete example shows how to compute the weighted classification error rate at each candidate threshold, determine the basic classifier, and update the weight distribution of the training data.


Exercise: use the AdaBoost algorithm to learn a strong classifier from the training data below.

Training data:

| $i$ | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
|---|---|---|---|---|---|---|---|---|---|---|
| $x_i$ | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
| $y_i$ | 1 | 1 | 1 | -1 | -1 | -1 | 1 | 1 | 1 | -1 |

Solution:

Initialize the weight distribution over the training data:

$$D_1=(w_{1,1},w_{1,2},\dots,w_{1,10}),\qquad w_{1,i}=0.1,\quad i=1,2,\dots,10$$
For $m=1$:

(a) On the training data weighted by $D_1$, compute the classification error rate for each threshold $\nu\in\{0.5,1.5,2.5,3.5,4.5,5.5,6.5,7.5,8.5\}$, taking for each $\nu$ the better of the two stump orientations:

| $\nu$ | 0.5 | 1.5 | 2.5 | 3.5 | 4.5 | 5.5 | 6.5 | 7.5 | 8.5 |
|---|---|---|---|---|---|---|---|---|---|
| error rate | 0.5 | 0.4 | 0.3 | 0.4 | 0.5 | 0.4 | 0.5 | 0.4 | 0.3 |
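These error rates can be checked with a short sketch (the data and threshold grid come from the example; the loop over both stump orientations is a plain re-implementation, not a library call):

```python
# Weighted error of a decision stump at each candidate threshold.
# For each threshold v we try "predict 1 when x < v" and its flip,
# and keep the smaller weighted error.
x = list(range(10))
y = [1, 1, 1, -1, -1, -1, 1, 1, 1, -1]

def stump_errors(w):
    thresholds = [0.5, 1.5, 2.5, 3.5, 4.5, 5.5, 6.5, 7.5, 8.5]
    errs = {}
    for v in thresholds:
        # error of the orientation "1 if x < v else -1"
        e = sum(wi for xi, yi, wi in zip(x, y, w)
                if (1 if xi < v else -1) != yi)
        # the flipped stump misclassifies exactly the other points,
        # so its error is 1 - e (the weights sum to 1)
        errs[v] = min(e, 1 - e)
    return errs

D1 = [0.1] * 10  # uniform initial weights
print(stump_errors(D1))  # minimum 0.3, attained at v = 2.5 and v = 8.5
```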

The error rate is lowest at $\nu=8.5$ (tied with $\nu=2.5$; here we take $\nu=8.5$), so the basic classifier is

$$G_1(x)=\begin{cases}1,& x<8.5\\ -1,& x\ge 8.5\end{cases}$$
(b) The error rate of $G_1(x)$ on the training data is $e_1=P(G_1(x_i)\neq y_i)=0.3$ (the points $x=3,4,5$ are misclassified).

(c) Compute the coefficient of $G_1(x)$: $\alpha_1=\dfrac{1}{2}\ln\dfrac{1-e_1}{e_1}=0.4236$
(d) Update the weight distribution of the training data:

$$D_2=(w_{2,1},w_{2,2},\dots,w_{2,10})$$

$$w_{2,i}=\frac{w_{1,i}}{Z_1}\exp(-\alpha_1 y_i G_1(x_i)),\quad i=1,2,\dots,10$$

where $Z_1=\sum_i w_{1,i}\exp(-\alpha_1 y_i G_1(x_i))$ is the normalization factor. This gives

$$D_2=(0.07142857,0.07142857,0.07142857,0.16666667,0.16666667,0.16666667,0.07142857,0.07142857,0.07142857,0.07142857)$$

$$f_1(x)=\alpha_1 G_1(x)=0.4236\,G_1(x)$$
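Step (d) can be reproduced numerically; the sketch below assumes $G_1$ thresholded at 8.5 as chosen in (a), and uses only the standard library:

```python
import math

x = list(range(10))
y = [1, 1, 1, -1, -1, -1, 1, 1, 1, -1]
w1 = [0.1] * 10

def G1(xi):
    return 1 if xi < 8.5 else -1

# weighted error and coefficient of G1
e1 = sum(wi for xi, yi, wi in zip(x, y, w1) if G1(xi) != yi)  # 0.3
alpha1 = 0.5 * math.log((1 - e1) / e1)

# w_{2,i} = w_{1,i} * exp(-alpha1 * y_i * G1(x_i)) / Z_1
unnorm = [wi * math.exp(-alpha1 * yi * G1(xi))
          for xi, yi, wi in zip(x, y, w1)]
Z1 = sum(unnorm)              # normalization factor
w2 = [u / Z1 for u in unnorm]

print(round(alpha1, 4))           # 0.4236
print([round(v, 8) for v in w2])  # the D_2 above
```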
(e) The classifier $\mathrm{sign}[f_1(x)]$ has 3 misclassified points on the training data:

| $i$ | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
|---|---|---|---|---|---|---|---|---|---|---|
| $G_1(x_i)$ | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | -1 |
| $f_1(x_i)$ | 0.4236 | 0.4236 | 0.4236 | 0.4236 | 0.4236 | 0.4236 | 0.4236 | 0.4236 | 0.4236 | -0.4236 |
| $\mathrm{sign}[f_1(x_i)]$ | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | -1 |
| $y_i$ | 1 | 1 | 1 | -1 | -1 | -1 | 1 | 1 | 1 | -1 |

For $m=2$:

(a) On the training data weighted by $D_2$, compute the classification error rate $e_m=\sum_{G_m(x_i)\neq y_i} w_{m,i}$ for each threshold $\nu\in\{0.5,1.5,2.5,3.5,4.5,5.5,6.5,7.5,8.5\}$:

| $\nu$ | 0.5 | 1.5 | 2.5 | 3.5 | 4.5 | 5.5 | 6.5 | 7.5 | 8.5 |
|---|---|---|---|---|---|---|---|---|---|
| error rate | 0.357 | 0.286 | 0.214 | 0.381 | 0.452 | 0.286 | 0.357 | 0.429 | 0.5 |

The error rate is lowest at $\nu=2.5$, so the basic classifier is

$$G_2(x)=\begin{cases}1,& x<2.5\\ -1,& x\ge2.5\end{cases}$$

(b) The error rate of $G_2(x)$ on the training data is $e_2=P(G_2(x_i)\neq y_i)=0.2143$ (the points $x=6,7,8$ are misclassified).

(c) Compute the coefficient of $G_2(x)$: $\alpha_2=\dfrac{1}{2}\ln\dfrac{1-e_2}{e_2}=0.6496$
(d) Update the weight distribution of the training data:

$$D_3=(w_{3,1},w_{3,2},\dots,w_{3,10})$$

$$w_{3,i}=\frac{w_{2,i}}{Z_2}\exp(-\alpha_2 y_i G_2(x_i)),\quad i=1,2,\dots,10$$

$$D_3=(0.04545452,0.04545452,0.04545452,0.10606056,0.10606056,0.10606056,0.16666675,0.16666675,0.16666675,0.04545452)$$

$$f_2(x)=0.4236\,G_1(x)+0.6496\,G_2(x)$$
(e) The classifier $\mathrm{sign}[f_2(x)]$ has 3 misclassified points on the training data:

| $i$ | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
|---|---|---|---|---|---|---|---|---|---|---|
| $G_1(x_i)$ | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | -1 |
| $G_2(x_i)$ | 1 | 1 | 1 | -1 | -1 | -1 | -1 | -1 | -1 | -1 |
| $\alpha_1 G_1(x_i)$ | 0.4236 | 0.4236 | 0.4236 | 0.4236 | 0.4236 | 0.4236 | 0.4236 | 0.4236 | 0.4236 | -0.4236 |
| $\alpha_2 G_2(x_i)$ | 0.6496 | 0.6496 | 0.6496 | -0.6496 | -0.6496 | -0.6496 | -0.6496 | -0.6496 | -0.6496 | -0.6496 |
| $f_2(x_i)$ | 1.0732 | 1.0732 | 1.0732 | -0.2260 | -0.2260 | -0.2260 | -0.2260 | -0.2260 | -0.2260 | -1.0732 |
| $\mathrm{sign}[f_2(x_i)]$ | 1 | 1 | 1 | -1 | -1 | -1 | -1 | -1 | -1 | -1 |
| $y_i$ | 1 | 1 | 1 | -1 | -1 | -1 | 1 | 1 | 1 | -1 |
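The misclassified points of $\mathrm{sign}[f_2(x)]$ can be read off programmatically (a sketch; the stump thresholds 8.5 and 2.5 and the coefficients follow from the rounds above):

```python
y = [1, 1, 1, -1, -1, -1, 1, 1, 1, -1]

def G1(xi): return 1 if xi < 8.5 else -1
def G2(xi): return 1 if xi < 2.5 else -1

# combined score after two rounds and its sign
f2 = [0.4236 * G1(xi) + 0.6496 * G2(xi) for xi in range(10)]
pred = [1 if v > 0 else -1 for v in f2]
wrong = [xi for xi in range(10) if pred[xi] != y[xi]]
print(wrong)  # [6, 7, 8]: exactly the 3 misclassified points
```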

For $m=3$:

(a) On the training data weighted by $D_3$, compute the classification error rate $e_m=\sum_{G_m(x_i)\neq y_i} w_{m,i}$ for each threshold $\nu\in\{0.5,1.5,2.5,3.5,4.5,5.5,6.5,7.5,8.5\}$:

| $\nu$ | 0.5 | 1.5 | 2.5 | 3.5 | 4.5 | 5.5 | 6.5 | 7.5 | 8.5 |
|---|---|---|---|---|---|---|---|---|---|
| error rate | 0.409 | 0.455 | 0.5 | 0.394 | 0.288 | 0.182 | 0.348 | 0.485 | 0.318 |

The error rate is lowest at $\nu=5.5$, this time with the orientation reversed, so the basic classifier is

$$G_3(x)=\begin{cases}-1,& x<5.5\\ 1,& x\ge5.5\end{cases}$$

(b) The error rate of $G_3(x)$ on the training data is $e_3=P(G_3(x_i)\neq y_i)=0.1818$ (the points $x=0,1,2,9$ are misclassified).

(c) Compute the coefficient of $G_3(x)$: $\alpha_3=\dfrac{1}{2}\ln\dfrac{1-e_3}{e_3}=0.7520$
(d) Update the weight distribution of the training data:

$$D_4=(w_{4,1},w_{4,2},\dots,w_{4,10})$$

$$w_{4,i}=\frac{w_{3,i}}{Z_3}\exp(-\alpha_3 y_i G_3(x_i)),\quad i=1,2,\dots,10$$

$$D_4=(0.125,0.125,0.125,0.06481478,0.06481478,0.06481478,0.10185189,0.10185189,0.10185189,0.125)$$

$$f_3(x)=0.4236\,G_1(x)+0.6496\,G_2(x)+0.7520\,G_3(x)$$
(e) The classifier $\mathrm{sign}[f_3(x)]$ has 0 misclassified points on the training data:

| $i$ | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
|---|---|---|---|---|---|---|---|---|---|---|
| $G_1(x_i)$ | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | -1 |
| $G_2(x_i)$ | 1 | 1 | 1 | -1 | -1 | -1 | -1 | -1 | -1 | -1 |
| $G_3(x_i)$ | -1 | -1 | -1 | -1 | -1 | -1 | 1 | 1 | 1 | 1 |
| $\alpha_1 G_1(x_i)$ | 0.4236 | 0.4236 | 0.4236 | 0.4236 | 0.4236 | 0.4236 | 0.4236 | 0.4236 | 0.4236 | -0.4236 |
| $\alpha_2 G_2(x_i)$ | 0.6496 | 0.6496 | 0.6496 | -0.6496 | -0.6496 | -0.6496 | -0.6496 | -0.6496 | -0.6496 | -0.6496 |
| $\alpha_3 G_3(x_i)$ | -0.7520 | -0.7520 | -0.7520 | -0.7520 | -0.7520 | -0.7520 | 0.7520 | 0.7520 | 0.7520 | 0.7520 |
| $f_3(x_i)$ | 0.3212 | 0.3212 | 0.3212 | -0.9780 | -0.9780 | -0.9780 | 0.5260 | 0.5260 | 0.5260 | -0.3212 |
| $\mathrm{sign}[f_3(x_i)]$ | 1 | 1 | 1 | -1 | -1 | -1 | 1 | 1 | 1 | -1 |
| $y_i$ | 1 | 1 | 1 | -1 | -1 | -1 | 1 | 1 | 1 | -1 |

Thus the final classifier is

$$G(x)=\mathrm{sign}[f_3(x)]=\mathrm{sign}\bigl[0.4236\,G_1(x)+0.6496\,G_2(x)+0.7520\,G_3(x)\bigr]$$
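The three rounds can be run end to end with a compact sketch (a plain re-implementation over the same threshold grid, not a library call). Note that the round-1 tie between $\nu=2.5$ and $\nu=8.5$ is broken here in favor of the first threshold scanned, so the sketch picks 2.5 where the worked example picked 8.5; the coefficients and the final training error nevertheless come out the same:

```python
import math

x = list(range(10))
y = [1, 1, 1, -1, -1, -1, 1, 1, 1, -1]
w = [0.1] * 10
ensemble = []  # list of (alpha, threshold, label predicted when x < threshold)

for m in range(3):
    # pick the stump (threshold + orientation) with the lowest weighted error
    best = None
    for v in [0.5, 1.5, 2.5, 3.5, 4.5, 5.5, 6.5, 7.5, 8.5]:
        for s in (1, -1):  # s: label predicted for x < v
            e = sum(wi for xi, yi, wi in zip(x, y, w)
                    if (s if xi < v else -s) != yi)
            if best is None or e < best[0]:
                best = (e, v, s)
    e, v, s = best
    alpha = 0.5 * math.log((1 - e) / e)
    ensemble.append((alpha, v, s))
    # re-weight the data and normalize
    w = [wi * math.exp(-alpha * yi * (s if xi < v else -s))
         for xi, yi, wi in zip(x, y, w)]
    Z = sum(w)
    w = [wi / Z for wi in w]

def G(xi):
    score = sum(a * (s if xi < v else -s) for a, v, s in ensemble)
    return 1 if score > 0 else -1

print([round(a, 4) for a, _, _ in ensemble])     # [0.4236, 0.6496, 0.752]
print(sum(G(xi) != yi for xi, yi in zip(x, y)))  # 0 training errors
```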
