Logistic Regression

haidao's blog

logistic-regression.ipynb

logistic regression classifier (the formulas there are well written)

Machine Learning Algorithms and Python Practice (7): Logistic Regression

Derivation of the cost function
The sigmoid function

\(h_{\theta}(x) = \frac{1}{1+ e^{-\theta^T x}}\)
\(p(y = 1|x;\theta) = h_{\theta}(x)\)

\(p(y = 0|x;\theta) = 1- h_{\theta}(x)\)
Written compactly:
\(p(y \mid x; \theta) = h_{\theta}(x)^y \, (1 - h_{\theta}(x))^{1 - y}\)
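
As a small illustration (not taken from the linked notebook; the function names are my own), the sigmoid hypothesis and the compact Bernoulli form in NumPy:

```python
import numpy as np

def sigmoid(z):
    """h = 1 / (1 + exp(-z))."""
    return 1.0 / (1.0 + np.exp(-z))

def hypothesis(theta, x):
    """h_theta(x) = sigmoid(theta^T x), the model's estimate of P(y=1 | x)."""
    return sigmoid(np.dot(theta, x))

def bernoulli_prob(theta, x, y):
    """p(y | x; theta) = h^y * (1 - h)^(1 - y), the compact form above."""
    h = hypothesis(theta, x)
    return h**y * (1.0 - h)**(1 - y)

# Tiny usage example with made-up numbers.
theta = np.array([0.5, -0.25])
x = np.array([1.0, 2.0])            # the first component can play the role of the intercept
print(bernoulli_prob(theta, x, 1))  # P(y = 1 | x; theta)
print(bernoulli_prob(theta, x, 0))  # P(y = 0 | x; theta); the two probabilities sum to 1
```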

Likelihood function
\(L(\theta) = \prod_{i=1}^{m} p(y^{i} \mid x^{i}; \theta)\)

\(\log L(\theta) = \sum_{i=1}^{m} \left[ y^{i} \log h_{\theta}(x^{i}) + (1 - y^{i}) \log (1 - h_{\theta}(x^{i})) \right]\)

\(= \sum_{i=1}^{m} \left[ y^{i} \log \frac{h_{\theta}(x^{i})}{1 - h_{\theta}(x^{i})} + \log (1 - h_{\theta}(x^{i})) \right]\)

Since the log-odds is \(\log \frac{h_{\theta}(x)}{1-h_{\theta}(x)} = \theta^T x\) and \(\log(1 - h_{\theta}(x)) = \log \frac{1}{1+e^{\theta^T x}}\), this becomes

\(\log L(\theta) = \sum_{i=1}^{m} \left( y^{i}\, \theta^T x^{i} + \log \frac{1}{1 + e^{\theta^T x^{i}}} \right)\)
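
The algebra above is easy to sanity-check numerically. A minimal sketch of my own, with random data and NumPy, evaluating both forms of the log-likelihood:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
m, n = 20, 3
X = rng.normal(size=(m, n))          # rows are the examples x^i
y = rng.integers(0, 2, size=m)       # labels y^i in {0, 1}
theta = rng.normal(size=n)

h = sigmoid(X @ theta)

# Form 1: sum of y*log(h) + (1-y)*log(1-h)
ll_1 = np.sum(y * np.log(h) + (1 - y) * np.log(1 - h))

# Form 2 (after substituting the log-odds): sum of y*theta^T x - log(1 + e^{theta^T x})
ll_2 = np.sum(y * (X @ theta) - np.log(1 + np.exp(X @ theta)))

print(ll_1, ll_2)                    # the two values agree up to floating-point error
```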

Partial derivative (of the \(i\)-th summand with respect to \(\theta\)):
\(y^{i} x^{i} - \frac{e^{\theta^T x^{i}}\, x^{i}}{1+e^{\theta^T x^{i}}}\)

\(h_{\theta}(x) = \frac{1}{1+ e^{-\theta^T x}} = \frac{e^{\theta^T x}}{1+e^{\theta^T x}}\)

So the partial derivative can be written as
\((y^{i} - h_{\theta}(x^{i}))\, x^{i}\)
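
In vectorized form (a sketch of my own, not code from the referenced posts), summing this term over all examples gives the gradient of the log-likelihood:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def loglik_grad(theta, X, y):
    """Gradient of the log-likelihood: sum_i (y^i - h_theta(x^i)) * x^i."""
    h = sigmoid(X @ theta)           # shape (m,)
    return X.T @ (y - h)             # shape (n,): each row x^i weighted by (y^i - h^i), then summed
```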

Because gradient descent is used in linear regression to find a minimum, while here the log-likelihood must be maximized, we negate it to obtain a cost function and minimize that instead.

So finally
\(J(\theta) = \frac{1}{m}\sum_{i=1}^{m}\lbrack-y^{(i)}log(h_\theta(x^{(i)})) - (1-y^{(i)})log(1-h_\theta(x^{(i)}))\rbrack\)
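
A minimal NumPy implementation of this cost function (illustrative; the names are my own):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def cost(theta, X, y):
    """J(theta) = (1/m) * sum of [-y*log(h) - (1-y)*log(1-h)]."""
    m = len(y)
    h = sigmoid(X @ theta)
    return np.sum(-y * np.log(h) - (1 - y) * np.log(1 - h)) / m
```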

Gradient
\(\frac{\partial J(\theta)}{\partial \theta_j} = \frac{1}{m}\sum_{i=1}^{m}\left(h_\theta(x^{(i)})-y^{(i)}\right)x_j^{(i)}\)
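
And a corresponding vectorized gradient plus a plain batch gradient-descent loop (again a sketch, with an assumed learning rate `alpha` and iteration count):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def gradient(theta, X, y):
    """dJ/dtheta_j = (1/m) * sum_i (h_theta(x^i) - y^i) * x^i_j, vectorized as X^T (h - y) / m."""
    m = len(y)
    return X.T @ (sigmoid(X @ theta) - y) / m

def gradient_descent(X, y, alpha=0.1, iters=1000):
    """Plain batch gradient descent on J(theta), starting from theta = 0."""
    theta = np.zeros(X.shape[1])
    for _ in range(iters):
        theta -= alpha * gradient(theta, X, y)
    return theta
```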

The same derivation in a different notational style (differentiating with respect to a single component \(\theta_j\)):
\(\frac{\partial}{\partial \theta_j} J(\vec{\theta}) = -\frac{\partial}{\partial \theta_j} \frac{1}{m}\sum_{i=1}^{m}\left(y_i\log h_{\vec{\theta}}(\vec{x}_{i}) + (1-y_i)\log(1 - h_{\vec{\theta}}(\vec{x}_{i})) \right) \\ = -\frac{1}{m}\sum_{i=1}^{m}\left(y_i\frac{\partial}{\partial \theta_j}\log h_{\vec{\theta}}(\vec{x}_{i}) + (1-y_i)\frac{\partial}{\partial \theta_j}\log(1 - h_{\vec{\theta}}(\vec{x}_{i})) \right) \\ = -\frac{1}{m}\sum_{i=1}^{m}\left(y_i\frac{\frac{\partial}{\partial \theta_j} h_{\vec{\theta}}(\vec{x}_{i})}{h_{\vec{\theta}}(\vec{x}_{i})} + (1-y_i)\frac{\frac{\partial}{\partial \theta_j}(1 - h_{\vec{\theta}}(\vec{x}_{i}))}{1 - h_{\vec{\theta}}(\vec{x}_{i})}\right)\)


\(\frac{\partial}{\partial \theta_j} h_{\vec{\theta}}(\vec{x}_{i}) = \frac{\partial}{\partial \theta_j} \frac{1}{1 + e^{-\vec{\theta}^T \vec{x}_{i}}} \\ = \frac{\partial}{\partial \theta_j} \frac{1}{1 + e^{-(\theta_{0} + \theta_1 x_{i,1} + \theta_2 x_{i,2} + \cdots + \theta_n x_{i,n})}} \\ = x_{i,j}\, e^{-\vec{\theta}^T \vec{x}_{i}}\, h_{\vec{\theta}}(\vec{x}_{i})^2\)

Final derivation

\(\frac{\partial}{\partial \theta_j} J(\vec{\theta}) = -\frac{1}{m}\sum_{i=1}^{m}\left(y_i\frac{x_{i,j}\, e^{-\vec{\theta}^T \vec{x}_{i}}\, h_{\vec{\theta}}(\vec{x}_{i})^2}{h_{\vec{\theta}}(\vec{x}_{i})} + (1-y_i)\frac{-x_{i,j}\, e^{-\vec{\theta}^T \vec{x}_{i}}\, h_{\vec{\theta}}(\vec{x}_{i})^2}{1 - h_{\vec{\theta}}(\vec{x}_{i})}\right) \\ = -\frac{1}{m}\sum_{i=1}^{m}\left(y_i\, x_{i,j}\, e^{-\vec{\theta}^T \vec{x}_{i}}\, h_{\vec{\theta}}(\vec{x}_{i}) + (1-y_i)\frac{-x_{i,j}\, e^{-\vec{\theta}^T \vec{x}_{i}}\, h_{\vec{\theta}}(\vec{x}_{i})^2}{1 - h_{\vec{\theta}}(\vec{x}_{i})}\right) \\ = -\frac{1}{m}\sum_{i=1}^{m}\left(x_{i,j}\, e^{-\vec{\theta}^T \vec{x}_{i}}\right)\left(y_i\, h_{\vec{\theta}}(\vec{x}_{i}) - (1-y_i)\frac{h_{\vec{\theta}}(\vec{x}_{i})^2}{1 - h_{\vec{\theta}}(\vec{x}_{i})}\right) \\ = -\frac{1}{m}\sum_{i=1}^{m}\left(x_{i,j}\, e^{-\vec{\theta}^T \vec{x}_{i}}\right)\frac{y_i h_{\vec{\theta}}(\vec{x}_{i}) - y_i h_{\vec{\theta}}(\vec{x}_{i})^2 + y_i h_{\vec{\theta}}(\vec{x}_{i})^2 - h_{\vec{\theta}}(\vec{x}_{i})^2}{1 - h_{\vec{\theta}}(\vec{x}_{i})} \\ = -\frac{1}{m}\sum_{i=1}^{m}\left(x_{i,j}\, e^{-\vec{\theta}^T \vec{x}_{i}}\right)\frac{y_i h_{\vec{\theta}}(\vec{x}_{i}) - h_{\vec{\theta}}(\vec{x}_{i})^2}{1 - h_{\vec{\theta}}(\vec{x}_{i})} \\ = -\frac{1}{m}\sum_{i=1}^{m}\left(x_{i,j}\, e^{-\vec{\theta}^T \vec{x}_{i}}\right)\frac{y_i - h_{\vec{\theta}}(\vec{x}_{i})}{\frac{1}{h_{\vec{\theta}}(\vec{x}_{i})} - 1} \\ = -\frac{1}{m}\sum_{i=1}^{m}\left(x_{i,j}\, e^{-\vec{\theta}^T \vec{x}_{i}}\right)\frac{y_i - h_{\vec{\theta}}(\vec{x}_{i})}{e^{-\vec{\theta}^T \vec{x}_{i}} + 1 - 1} \\ = \frac{1}{m} \sum_{i=1}^{m} x_{i,j} \left(h_{\vec{\theta}}(\vec{x}_{i}) - y_i\right)\)
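
One can also verify the final formula against a central finite-difference approximation of \(J(\theta)\). A sketch with random data (my own, assuming NumPy):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def cost(theta, X, y):
    h = sigmoid(X @ theta)
    return np.mean(-y * np.log(h) - (1 - y) * np.log(1 - h))

def gradient(theta, X, y):
    """The closed form derived above: (1/m) * X^T (h - y)."""
    return X.T @ (sigmoid(X @ theta) - y) / len(y)

rng = np.random.default_rng(1)
X = rng.normal(size=(30, 4))
y = rng.integers(0, 2, size=30)
theta = rng.normal(size=4)

# Central finite differences, one coordinate theta_j at a time.
eps = 1e-6
numeric = np.array([
    (cost(theta + eps * e, X, y) - cost(theta - eps * e, X, y)) / (2 * eps)
    for e in np.eye(4)
])

print(np.max(np.abs(numeric - gradient(theta, X, y))))  # should be tiny (~1e-9 or smaller)
```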

Reposted from: https://www.cnblogs.com/qietingfengyu/p/4026875.html
