I. GBDT Formula Derivation
1. The first base function:
$$F_0(X)=\frac{1}{2}\ln\frac{1+\overline{y}}{1-\overline{y}} \tag{1.1}$$
For the loss function $L(y, F)=\log(1+e^{-2yF})$, $y\in\left\{-1, 1\right\}$, where $\log$ is the natural logarithm, find the value of $F$ that minimizes the loss. Taking the first derivative with respect to $F$:
$$L^{'}=\frac{-2ye^{-2yF}}{1+e^{-2yF}} \tag{1.2}$$
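As a quick sanity check on (1.2), the analytic derivative can be compared with a central finite difference. This is only a sketch: the function names `loss`, `loss_grad` and `numeric_grad`, the step size `h`, and the test points are illustrative choices, not part of the original derivation.

```python
import math

def loss(y, F):
    """Per-sample loss L(y, F) = log(1 + exp(-2 y F)), with y in {-1, +1}."""
    return math.log(1.0 + math.exp(-2.0 * y * F))

def loss_grad(y, F):
    """Analytic derivative from (1.2): -2 y e^{-2yF} / (1 + e^{-2yF})."""
    e = math.exp(-2.0 * y * F)
    return -2.0 * y * e / (1.0 + e)

def numeric_grad(y, F, h=1e-6):
    """Central finite difference, used only as an independent check."""
    return (loss(y, F + h) - loss(y, F - h)) / (2.0 * h)

# Compare the two at a few arbitrary points for both label values.
for y in (-1, 1):
    for F in (-1.5, 0.0, 0.7):
        assert abs(loss_grad(y, F) - numeric_grad(y, F)) < 1e-6
```

At $F=0$ the derivative reduces to $-y$, which is a convenient value to check by hand.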
Let

$$\begin{cases} count(y=+1)=m1\\ count(y=-1)=m2 \end{cases} \tag{1.3}$$
Then, since $\overline{y}$ is the mean of the $n$ labels, we have

$$\begin{cases} m1+m2=n\\ \frac{m1-m2}{n}=\overline{y} \end{cases} \tag{1.4}$$
That is,

$$\begin{cases} m1=\frac{n}{2}(1+\overline{y})\\ m2=\frac{n}{2}(1-\overline{y}) \end{cases} \tag{1.5}$$
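The relations (1.3) to (1.5) can be illustrated on a small example. The label list below is made up purely for illustration; any list of $\pm 1$ labels would do.

```python
import math

# Hypothetical labels, for illustration only.
labels = [1, 1, 1, -1, 1, -1, 1, -1]

n = len(labels)
m1 = sum(1 for y in labels if y == 1)    # count(y = +1), as in (1.3)
m2 = sum(1 for y in labels if y == -1)   # count(y = -1), as in (1.3)
y_bar = sum(labels) / n                  # mean label

# (1.4): the counts add up to n, and their normalized difference is the mean.
assert m1 + m2 == n
assert math.isclose((m1 - m2) / n, y_bar)

# (1.5): the counts recovered from n and the mean label.
assert math.isclose(m1, n / 2 * (1 + y_bar))
assert math.isclose(m2, n / 2 * (1 - y_bar))
```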
Setting $\sum L^{'}=0$ gives

$$\sum L^{'}=\sum_{y=1}L^{'}+\sum_{y=-1}L^{'}=0 \tag{1.6}$$
Since $F$ is a single constant shared by all samples, substituting (1.5) into (1.6) gives

$$\begin{aligned} \sum L^{'}&=\sum_{y=1}L^{'}+\sum_{y=-1}L^{'}\\\\ &=\frac{n}{2}(1+\overline{y})*\frac{-2e^{-2F}}{1+e^{-2F}} + \frac{n}{2}(1-\overline{y})*\frac{2e^{2F}}{1+e^{2F}}\\\\ &=\frac{n}{2}(1+\overline{y})*\frac{-2}{1+e^{2F}} + \frac{n}{2}(1-\overline{y})*\frac{2e^{2F}}{1+e^{2F}}\\\\ &=\frac{n}{1+e^{2F}}[-(1+\overline{y}) + e^{2F}(1-\overline{y})] \end{aligned} \tag{1.7}$$
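The closed form in (1.7) can be checked numerically against the direct sum of per-sample derivatives, and evaluated at the value of $F_0$ stated in (1.1). The sketch below assumes a made-up label list; the helper names `grad_sum_direct` and `grad_sum_closed_form` are illustrative only.

```python
import math

def loss_grad(y, F):
    """Per-sample derivative from (1.2)."""
    e = math.exp(-2.0 * y * F)
    return -2.0 * y * e / (1.0 + e)

def grad_sum_direct(labels, F):
    """Sum of per-sample derivatives, with the same constant F for every sample."""
    return sum(loss_grad(y, F) for y in labels)

def grad_sum_closed_form(labels, F):
    """Last line of (1.7): n / (1 + e^{2F}) * [-(1 + y_bar) + e^{2F} (1 - y_bar)]."""
    n = len(labels)
    y_bar = sum(labels) / n
    e2F = math.exp(2.0 * F)
    return n / (1.0 + e2F) * (-(1.0 + y_bar) + e2F * (1.0 - y_bar))

# Hypothetical labels, for illustration only.
labels = [1, 1, 1, -1, 1, -1, 1, -1]

# The two expressions should agree for any F.
for F in (-1.0, 0.0, 0.3, 2.0):
    assert math.isclose(grad_sum_direct(labels, F),
                        grad_sum_closed_form(labels, F),
                        rel_tol=1e-9, abs_tol=1e-12)

# At the F_0 of (1.1) the summed derivative should vanish (up to rounding).
y_bar = sum(labels) / len(labels)
F0 = 0.5 * math.log((1 + y_bar) / (1 - y_bar))
assert abs(grad_sum_direct(labels, F0)) < 1e-12
```

Note that the formula for $F_0$ is undefined when all labels are identical ($\overline{y}=\pm 1$), so the example uses a mixed label set.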
Setting (1.7) to zero requires $e^{2F}(1-\overline{y})=1+\overline{y}$, which gives $F_0=\frac{1}{2}\ln\frac{1+\overline{y}}{1-\overline{y}}$