Logistic Regression

Logistic regression is an algorithm for classification problems. It is most often used for binary classification, although it can also handle multi-class problems. Its underlying idea comes from linear regression; in essence, it is a generalized linear model.

The core of the logistic regression model is the introduction of the sigmoid function, an S-shaped squashing curve.

The sigmoid function maps any input into the interval (0, 1); for a binary classification problem, this output can be interpreted as the probability that the sample belongs to the positive class.

The mathematical derivation of logistic regression follows. The hypothesis function is:
$$h_\theta(x) = g(\theta^T x) = \frac{1}{1 + e^{-\theta^T x}}$$
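As a concrete reference, here is a minimal NumPy sketch of the hypothesis function. The names `sigmoid` and `hypothesis` are illustrative, and the split-by-sign computation is just a standard trick to avoid overflow in `exp`:

```python
import numpy as np

def sigmoid(z):
    """g(z) = 1 / (1 + exp(-z)), computed in a numerically stable way."""
    z = np.asarray(z, dtype=float)
    out = np.empty_like(z)
    pos = z >= 0
    out[pos] = 1.0 / (1.0 + np.exp(-z[pos]))  # safe: exp(-z) <= 1 here
    ez = np.exp(z[~pos])                      # safe: exp(z) <= 1 here
    out[~pos] = ez / (1.0 + ez)
    return out

def hypothesis(theta, X):
    """h_theta(x) = g(theta^T x), applied row-wise to a design matrix X of shape (m, n)."""
    return sigmoid(X @ theta)
```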
Differentiating the sigmoid function gives a convenient identity:
$$\begin{aligned}
g'(x) &= \left( \frac{1}{1 + e^{-x}} \right)' = \frac{e^{-x}}{\left( 1 + e^{-x} \right)^2} \\
&= \frac{1}{1 + e^{-x}} \cdot \frac{e^{-x}}{1 + e^{-x}} = \frac{1}{1 + e^{-x}} \left( 1 - \frac{1}{1 + e^{-x}} \right) \\
&= g(x)\left( 1 - g(x) \right)
\end{aligned}$$
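The identity $g'(x) = g(x)(1 - g(x))$ is easy to sanity-check numerically. The snippet below (purely illustrative, not part of the derivation) compares it against a central finite difference:

```python
import numpy as np

g = lambda z: 1.0 / (1.0 + np.exp(-z))

z = np.linspace(-5.0, 5.0, 11)
eps = 1e-6
numeric = (g(z + eps) - g(z - eps)) / (2 * eps)  # central-difference approximation
analytic = g(z) * (1 - g(z))                     # the closed form derived above
print(np.max(np.abs(numeric - analytic)))        # tiny (~1e-11): the identity holds
```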
Parameter estimation for logistic regression proceeds by maximum likelihood.
Assume:
$$\begin{aligned}
P(y = 1 \mid x; \theta) &= h_\theta(x) \\
P(y = 0 \mid x; \theta) &= 1 - h_\theta(x)
\end{aligned}$$
These two cases combine into the single expression:
$$p(y \mid x; \theta) = \left( h_\theta(x) \right)^y \left( 1 - h_\theta(x) \right)^{1 - y}$$
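The compact form is just as handy in code. A hypothetical one-liner, for $y \in \{0, 1\}$ and $h = h_\theta(x)$:

```python
def bernoulli_pmf(y, h):
    """p(y | x; theta) = h^y * (1 - h)^(1 - y) for y in {0, 1}."""
    return h**y * (1 - h)**(1 - y)
```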
The likelihood function (assuming the $m$ training samples are independent):
$$\begin{aligned}
L(\theta) &= p(y \mid x; \theta) = \prod_{i=1}^{m} p\left( y^{(i)} \mid x^{(i)}; \theta \right) \\
&= \prod_{i=1}^{m} \left( h_\theta(x^{(i)}) \right)^{y^{(i)}} \left( 1 - h_\theta(x^{(i)}) \right)^{1 - y^{(i)}}
\end{aligned}$$
Taking logs gives the log-likelihood:
$$\begin{aligned}
\ell(\theta) &= \log L(\theta) \\
&= \sum_{i=1}^{m} y^{(i)} \log h\left( x^{(i)} \right) + \left( 1 - y^{(i)} \right) \log\left( 1 - h\left( x^{(i)} \right) \right)
\end{aligned}$$
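Working with the log-likelihood rather than the raw product also avoids numerical underflow when $m$ is large. A minimal sketch, assuming labels in {0, 1}; the clipping constant `eps` is an illustrative guard against $\log 0$:

```python
import numpy as np

def log_likelihood(theta, X, y, eps=1e-12):
    """l(theta) = sum_i [ y_i * log(h_i) + (1 - y_i) * log(1 - h_i) ]."""
    h = 1.0 / (1.0 + np.exp(-(X @ theta)))
    h = np.clip(h, eps, 1.0 - eps)  # guard against log(0) at saturated predictions
    return np.sum(y * np.log(h) + (1.0 - y) * np.log(1.0 - h))
```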
Taking the partial derivative with respect to $\theta_j$ (written here for a single sample; the second line uses the sigmoid derivative $g'(z) = g(z)(1 - g(z))$ derived above):
$$\begin{aligned}
\frac{\partial}{\partial \theta_j} \ell(\theta) &= \left( y \frac{1}{g(\theta^T x)} - (1 - y) \frac{1}{1 - g(\theta^T x)} \right) \frac{\partial}{\partial \theta_j} g(\theta^T x) \\
&= \left( y \frac{1}{g(\theta^T x)} - (1 - y) \frac{1}{1 - g(\theta^T x)} \right) g(\theta^T x) \left( 1 - g(\theta^T x) \right) \frac{\partial}{\partial \theta_j} \theta^T x \\
&= \left( y \left( 1 - g(\theta^T x) \right) - (1 - y) g(\theta^T x) \right) x_j \\
&= \left( y - h_\theta(x) \right) x_j
\end{aligned}$$
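Stacking this per-sample expression over all $m$ samples gives the vectorized form $\nabla_\theta \ell(\theta) = X^T \left( y - h_\theta(X) \right)$, which is how the gradient is usually computed in practice. A sketch under the same assumptions as above:

```python
import numpy as np

def gradient(theta, X, y):
    """Full-batch gradient of the log-likelihood: X^T (y - h), one entry per theta_j."""
    h = 1.0 / (1.0 + np.exp(-(X @ theta)))
    return X.T @ (y - h)
```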
Parameter update (gradient ascent, since we are maximizing $\ell(\theta)$; for a single sample $i$ with learning rate $\alpha$):
$$\theta_j := \theta_j + \alpha \left( y^{(i)} - h_\theta\left( x^{(i)} \right) \right) x_j^{(i)}$$
This completes the derivation of logistic regression. With the parameter update rule in hand, the model can be fit by gradient ascent on the log-likelihood (equivalently, gradient descent on the negative log-likelihood).
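Putting the pieces together, here is a minimal batch gradient-ascent loop. It is a sketch, not a production implementation: the learning rate `lr`, the iteration count `n_iters`, and the convention that `X` already contains a bias column of ones are all assumptions, and the toy data at the end is synthetic.

```python
import numpy as np

def fit_logistic(X, y, lr=0.1, n_iters=1000):
    """Batch gradient ascent on the log-likelihood.

    X: (m, n) design matrix, assumed to include a column of ones for the intercept.
    y: (m,) labels in {0, 1}.
    """
    theta = np.zeros(X.shape[1])
    for _ in range(n_iters):
        h = 1.0 / (1.0 + np.exp(-(X @ theta)))
        theta += (lr / len(y)) * (X.T @ (y - h))  # theta_j += alpha * mean((y - h) * x_j)
    return theta

# Toy usage: two well-separated 1-D clusters.
rng = np.random.default_rng(0)
x = np.concatenate([rng.normal(-2, 1, 50), rng.normal(2, 1, 50)])
X = np.column_stack([np.ones_like(x), x])        # bias column + feature
y = np.concatenate([np.zeros(50), np.ones(50)])
theta = fit_logistic(X, y)
print(theta)  # decision boundary near x = -theta[0] / theta[1], close to 0
```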