Cost Function for Logistic Regression
$$
J(\theta)=-\frac{1}{m}\sum_{i=1}^{m}\left[y^{(i)}\log h_\theta(x^{(i)})+(1-y^{(i)})\log\left(1-h_\theta(x^{(i)})\right)\right]
$$
Taking the partial derivative of the cost function with respect to $\theta$:
Supplementary identities used in the derivation:
$$
\begin{aligned}
&h_\theta(x^{(i)})=g(x^{(i)}\theta)\\
&g(x)=\frac{1}{1+e^{-x}}\\
&g'(x)=g(x)(1-g(x))\\
&\frac{\partial h_\theta(x^{(i)})}{\partial \theta}=h'_\theta(x^{(i)})(x^{(i)})^T
\end{aligned}
$$
Here $h'_\theta(x^{(i)})$ is shorthand for $g'(x^{(i)}\theta)=h_\theta(x^{(i)})\left(1-h_\theta(x^{(i)})\right)$.
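The identity $g'(x)=g(x)(1-g(x))$ carries the whole derivation, so it is worth a quick numerical sanity check. The sketch below uses only the Python standard library and compares the closed form with a central-difference approximation. The function names `g`, `g_prime`, and `central_difference` are illustrative choices, not part of the original post.

```python
import math


def g(x):
    """Sigmoid function g(x) = 1 / (1 + e^{-x})."""
    return 1.0 / (1.0 + math.exp(-x))


def g_prime(x):
    """Closed-form derivative g'(x) = g(x) * (1 - g(x))."""
    s = g(x)
    return s * (1.0 - s)


def central_difference(f, x, eps=1e-6):
    """Numerical derivative of f at x using central differences."""
    return (f(x + eps) - f(x - eps)) / (2.0 * eps)


# Compare the closed form with the numerical estimate at a few points.
for x in (-3.0, -0.5, 0.0, 1.0, 4.0):
    assert abs(central_difference(g, x) - g_prime(x)) < 1e-8
```

If the closed form were wrong, the assertion would fail at the first test point.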
$$
\begin{aligned}
\frac{\partial J(\theta)}{\partial \theta}
&=-\frac{1}{m}\sum^m_{i=1}\left[\frac{y^{(i)}}{h_\theta(x^{(i)})}-\frac{1-y^{(i)}}{1-h_\theta(x^{(i)})}\right]\frac{\partial h_\theta(x^{(i)})}{\partial \theta}\\
&=-\frac{1}{m}\sum^m_{i=1}\left[\frac{y^{(i)}}{h_\theta(x^{(i)})}-\frac{1-y^{(i)}}{1-h_\theta(x^{(i)})}\right]h'_\theta(x^{(i)})(x^{(i)})^T\\
&=\frac{1}{m}\sum^m_{i=1}\frac{h'_\theta(x^{(i)})(x^{(i)})^T}{h_\theta(x^{(i)})(1-h_\theta(x^{(i)}))}(h_\theta(x^{(i)})-y^{(i)})\\
&=\frac{1}{m}\sum^m_{i=1}(h_\theta(x^{(i)})-y^{(i)})(x^{(i)})^T\\
&=\frac{1}{m}X^T(h_\theta(X)-y)
\end{aligned}
$$
The fourth line follows from $h'_\theta(x^{(i)})=h_\theta(x^{(i)})(1-h_\theta(x^{(i)}))$, which cancels the denominator. Note that $x^{(i)}$ is a single data record and is a row vector; $h_\theta(X)-y$ is a column vector; and $\theta$ is the column vector of weights, one per feature.
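As a check on the derivation, here is a minimal sketch that evaluates $J(\theta)$ and the gradient in its summation form $\frac{1}{m}\sum_i (h_\theta(x^{(i)})-y^{(i)})(x^{(i)})^T$, then compares the result with a finite-difference gradient. It uses plain Python lists, so each row of `X` plays the role of the row vector $x^{(i)}$. The toy data, the starting `theta`, and all function names are assumptions made up for illustration.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def hypothesis(theta, x):
    """h_theta(x) = g(x theta) for a single row vector x."""
    return sigmoid(sum(xj * tj for xj, tj in zip(x, theta)))


def cost(theta, X, y):
    """Cross-entropy cost J(theta) averaged over the m samples."""
    total = 0.0
    for xi, yi in zip(X, y):
        h = hypothesis(theta, xi)
        total += yi * math.log(h) + (1 - yi) * math.log(1 - h)
    return -total / len(X)


def gradient(theta, X, y):
    """(1/m) * sum_i (h(x_i) - y_i) * x_i^T, the summation form of (1/m) X^T (h(X) - y)."""
    grad = [0.0] * len(theta)
    for xi, yi in zip(X, y):
        err = hypothesis(theta, xi) - yi
        for j, xij in enumerate(xi):
            grad[j] += err * xij
    return [gj / len(X) for gj in grad]


def numerical_gradient(theta, X, y, eps=1e-6):
    """Central-difference estimate of the gradient of J, one coordinate at a time."""
    grad = []
    for j in range(len(theta)):
        up = list(theta)
        down = list(theta)
        up[j] += eps
        down[j] -= eps
        grad.append((cost(up, X, y) - cost(down, X, y)) / (2 * eps))
    return grad


# Hypothetical toy data: the first column is a constant 1 for the intercept.
X = [[1.0, 0.5], [1.0, -1.5], [1.0, 2.0], [1.0, 0.0]]
y = [1, 0, 1, 0]
theta = [0.1, -0.2]

analytic = gradient(theta, X, y)
numeric = numerical_gradient(theta, X, y)
assert all(abs(a - n) < 1e-7 for a, n in zip(analytic, numeric))
```

With NumPy arrays of shape `(m, n)` for `X` and `(m,)` for `y`, the same gradient would usually be written in the matrix form as `X.T @ (h - y) / m`.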
Solving for the optimal $\theta$ with gradient descent
$$
\begin{aligned}
&\text{Repeat}\ \{\\
&\qquad\theta_j:=\theta_j-\alpha\frac{1}{m}\sum_{i=1}^m(h_\theta(x^{(i)})-y^{(i)})x^{(i)}_j\\
&\}\quad\text{simultaneously update all }\theta_j
\end{aligned}
$$
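The update rule can be sketched in a few lines of standard-library Python. The key detail is the simultaneous update: all errors $h_\theta(x^{(i)})-y^{(i)}$ are computed from the old $\theta$ before any component is overwritten. The toy data, the learning rate `alpha=0.1`, the iteration count, and the function names are assumptions chosen for illustration, not values from the original post.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def hypothesis(theta, x):
    """h_theta(x) = g(x theta) for a single row vector x."""
    return sigmoid(sum(xj * tj for xj, tj in zip(x, theta)))


def cost(theta, X, y):
    """Cross-entropy cost J(theta) averaged over the m samples."""
    total = 0.0
    for xi, yi in zip(X, y):
        h = hypothesis(theta, xi)
        total += yi * math.log(h) + (1 - yi) * math.log(1 - h)
    return -total / len(X)


def gradient_descent(X, y, alpha=0.1, iterations=1000):
    """Batch gradient descent starting from theta = 0."""
    m = len(X)
    theta = [0.0] * len(X[0])
    for _ in range(iterations):
        # Errors use the old theta, so every theta_j is updated simultaneously.
        errors = [hypothesis(theta, xi) - yi for xi, yi in zip(X, y)]
        theta = [
            tj - alpha / m * sum(e * xi[j] for e, xi in zip(errors, X))
            for j, tj in enumerate(theta)
        ]
    return theta


# Hypothetical toy data: the first column is a constant 1 for the intercept.
X = [[1.0, 0.5], [1.0, -1.5], [1.0, 2.0], [1.0, 0.0]]
y = [1, 0, 1, 0]

theta = gradient_descent(X, y)
# The fitted parameters should give a lower cost than the all-zero start.
assert cost(theta, X, y) < cost([0.0, 0.0], X, y)
```

Updating `theta[j]` in place inside the loop over `j` would mix old and new components within one step, which is the mistake the "simultaneously update" remark guards against.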
Damn, these derivations ate up my whole afternoon!