Logistic Regression

Hypothesis Representation

The dependent variable satisfies $y \in \{0, 1\}$: it takes only the two values 0 and 1.
We therefore change the form of the hypothesis so that $h_\theta(x)$ satisfies $0 \le h_\theta(x) \le 1$:

$$h_\theta(x) = g(\theta^T x), \qquad z = \theta^T x, \qquad g(z) = \frac{1}{1 + e^{-z}}$$

This gives the hypothesis:

$$h_\theta(x) = \frac{1}{1 + e^{-\theta^T x}}$$

This is called the logistic function, or sigmoid function.
(Figure: the logistic function.)
For a sample $x$, $h_\theta(x)$ gives the probability that the output is 1, i.e. $P(y = 1 \mid x; \theta)$.
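The hypothesis above can be sketched in a few lines of NumPy. This is a minimal illustration; the names `sigmoid` and `hypothesis` are my own, not from the original notes.

```python
import numpy as np

def sigmoid(z):
    """The logistic (sigmoid) function g(z) = 1 / (1 + e^(-z))."""
    return 1.0 / (1.0 + np.exp(-z))

def hypothesis(theta, x):
    """h_theta(x) = g(theta^T x): the estimated P(y = 1 | x; theta)."""
    return sigmoid(np.dot(theta, x))

# g(0) = 0.5; large positive z pushes the output toward 1, large negative toward 0
print(sigmoid(0.0))   # 0.5
print(sigmoid(6.0))   # close to 1
```

Note that the output is always strictly between 0 and 1, which is exactly what lets us read it as a probability.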

Decision Boundary

To obtain the two discrete classes 0 and 1, we threshold the hypothesis as follows:

$$h_\theta(x) \ge 0.5 \rightarrow y = 1, \qquad h_\theta(x) < 0.5 \rightarrow y = 0$$

That is, when the hypothesis value is at least 0.5 we predict $y = 1$; when it is below 0.5 we predict $y = 0$.

$$h_\theta(x) = g(\theta^T x) \ge 0.5 \quad \text{when} \quad \theta^T x \ge 0$$

Therefore:

$$\theta^T x \ge 0 \Rightarrow y = 1, \qquad \theta^T x < 0 \Rightarrow y = 0$$

The decision boundary is the curve separating the region where we predict $y = 1$ from the region where we predict $y = 0$; it is a property of the hypothesis itself, not of the data set.
The curve $\theta^T x = 0$ is the decision boundary.
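Since thresholding $h_\theta(x)$ at 0.5 is equivalent to checking the sign of $\theta^T x$, a prediction function never needs to evaluate the sigmoid. A small sketch with a made-up parameter vector (the `predict` name and the example $\theta$ are assumptions for illustration):

```python
import numpy as np

def predict(theta, x):
    """Predict y = 1 iff h_theta(x) >= 0.5, i.e. iff theta^T x >= 0."""
    return 1 if np.dot(theta, x) >= 0 else 0

# Hypothetical parameters: theta = [-3, 1, 1] gives the boundary line x1 + x2 = 3
# (the first feature x0 = 1 is the intercept term)
theta = np.array([-3.0, 1.0, 1.0])
print(predict(theta, np.array([1.0, 2.0, 2.0])))  # x1 + x2 = 4 >= 3, so 1
print(predict(theta, np.array([1.0, 0.5, 0.5])))  # x1 + x2 = 1 <  3, so 0
```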

Cost Function

The cost function for logistic regression:

$$J(\theta) = \frac{1}{m}\sum_{i=1}^{m} \mathrm{Cost}(h_\theta(x^{(i)}), y^{(i)})$$
$$\mathrm{Cost}(h_\theta(x), y) = -\log(h_\theta(x)) \qquad \text{if } y = 1$$
$$\mathrm{Cost}(h_\theta(x), y) = -\log(1 - h_\theta(x)) \qquad \text{if } y = 0$$

(Figures: the cost curve $-\log(h_\theta(x))$ for $y = 1$ and $-\log(1 - h_\theta(x))$ for $y = 0$.)

$$\mathrm{Cost}(h_\theta(x), y) = 0 \ \text{ if } h_\theta(x) = y$$
$$\mathrm{Cost}(h_\theta(x), y) \to \infty \ \text{ if } y = 0 \text{ and } h_\theta(x) \to 1$$
$$\mathrm{Cost}(h_\theta(x), y) \to \infty \ \text{ if } y = 1 \text{ and } h_\theta(x) \to 0$$

$$\mathrm{Cost}(h_\theta(x), y) = -y\log(h_\theta(x)) - (1-y)\log(1-h_\theta(x))$$
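Because $y$ is either 0 or 1, one of the two terms in this combined expression always vanishes, so it reproduces both branches of the piecewise definition. A quick numerical check (the helper names are mine):

```python
import numpy as np

def cost_piecewise(h, y):
    """The two-branch definition of the per-sample cost."""
    return -np.log(h) if y == 1 else -np.log(1.0 - h)

def cost_combined(h, y):
    """The single combined expression: -y log(h) - (1-y) log(1-h)."""
    return -y * np.log(h) - (1.0 - y) * np.log(1.0 - h)

# The two definitions agree for every (h, y) pair
for h in (0.1, 0.5, 0.9):
    for y in (0, 1):
        assert abs(cost_piecewise(h, y) - cost_combined(h, y)) < 1e-12
print("combined form matches both branches")
```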

The complete cost function:
$$J(\theta) = -\frac{1}{m}\sum_{i=1}^{m}\left[y^{(i)}\log(h_\theta(x^{(i)})) + (1-y^{(i)})\log(1-h_\theta(x^{(i)}))\right]$$

Vectorized form:
$$h = g(X\theta)$$
$$J(\theta) = \frac{1}{m}\cdot\left(-y^T\log(h) - (1-y)^T\log(1-h)\right)$$
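The vectorized cost maps directly onto NumPy matrix operations. A minimal sketch, assuming a design matrix `X` whose first column is the intercept term; the toy data is made up for illustration:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def cost(theta, X, y):
    """J(theta) = (1/m) * (-y^T log(h) - (1-y)^T log(1-h)), with h = g(X theta)."""
    m = len(y)
    h = sigmoid(X @ theta)
    return (1.0 / m) * (-y @ np.log(h) - (1.0 - y) @ np.log(1.0 - h))

# With theta = 0, every h is 0.5, so the cost is -log(0.5) = log 2 per sample
X = np.array([[1.0, 2.0], [1.0, -1.0], [1.0, 0.5]])
y = np.array([1.0, 0.0, 1.0])
print(cost(np.zeros(2), X, y))  # ~0.6931
```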

Gradient Descent

$$\text{Repeat}\ \left\{\ \theta_j := \theta_j - \alpha\frac{\partial}{\partial\theta_j}J(\theta)\ \right\}$$

$$\text{Repeat}\ \left\{\ \theta_j := \theta_j - \frac{\alpha}{m}\sum_{i=1}^{m}\left(h_\theta(x^{(i)}) - y^{(i)}\right)x_j^{(i)}\ \right\}$$

Vectorized implementation:
$$\theta := \theta - \frac{\alpha}{m}X^T\left(g(X\theta) - \vec{y}\right)$$
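The repeated vectorized update above can be sketched as a simple loop. This is an illustrative implementation on made-up separable data, with the learning rate and iteration count chosen arbitrarily:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def gradient_descent(X, y, alpha=0.1, iters=5000):
    """Repeat the vectorized update: theta := theta - (alpha/m) X^T (g(X theta) - y)."""
    m, n = X.shape
    theta = np.zeros(n)
    for _ in range(iters):
        theta -= (alpha / m) * (X.T @ (sigmoid(X @ theta) - y))
    return theta

# Toy 1-D data (first column is the intercept term x0 = 1)
X = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]])
y = np.array([0.0, 0.0, 1.0, 1.0])
theta = gradient_descent(X, y)
preds = (sigmoid(X @ theta) >= 0.5).astype(int)
print(preds)  # the data is separable, so this recovers [0, 0, 1, 1]
```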
Derivation of the gradient:

$$\begin{aligned}
\frac{\partial}{\partial\theta_j}J(\theta)
&= -\frac{1}{m}\sum_{i=1}^{m}\left(y^{(i)}\frac{1}{h_\theta(x^{(i)})}\frac{\partial h_\theta(x^{(i)})}{\partial\theta_j} - (1-y^{(i)})\frac{1}{1-h_\theta(x^{(i)})}\frac{\partial h_\theta(x^{(i)})}{\partial\theta_j}\right)\\
&= -\frac{1}{m}\sum_{i=1}^{m}\left(y^{(i)}\frac{1}{g(\theta^T x^{(i)})} - (1-y^{(i)})\frac{1}{1-g(\theta^T x^{(i)})}\right)\frac{\partial g(\theta^T x^{(i)})}{\partial\theta_j}\\
&= -\frac{1}{m}\sum_{i=1}^{m}\left(y^{(i)}\frac{1}{g(\theta^T x^{(i)})} - (1-y^{(i)})\frac{1}{1-g(\theta^T x^{(i)})}\right)g(\theta^T x^{(i)})\left(1-g(\theta^T x^{(i)})\right)x_j^{(i)}\\
&= -\frac{1}{m}\sum_{i=1}^{m}\left(y^{(i)}\left(1-g(\theta^T x^{(i)})\right) - (1-y^{(i)})\,g(\theta^T x^{(i)})\right)x_j^{(i)}\\
&= -\frac{1}{m}\sum_{i=1}^{m}\left(y^{(i)} - g(\theta^T x^{(i)})\right)x_j^{(i)}\\
&= \frac{1}{m}\sum_{i=1}^{m}\left(h_\theta(x^{(i)}) - y^{(i)}\right)x_j^{(i)}
\end{aligned}$$

From the second step to the third step:
$$\begin{aligned}
\frac{\partial g(\theta^T x^{(i)})}{\partial\theta_j}
&= \frac{d}{dz}\left(\frac{1}{1+e^{-z}}\right)\frac{\partial(\theta^T x^{(i)})}{\partial\theta_j} \qquad \text{letting } z = \theta^T x^{(i)}\\
&= \frac{e^{-z}}{(1+e^{-z})^2}\cdot\frac{\partial(\theta_0 + \theta_1 x_1 + \cdots + \theta_j x_j + \cdots + \theta_n x_n)}{\partial\theta_j}\\
&= \frac{e^{-z}+1-1}{(1+e^{-z})^2}\,x_j^{(i)}\\
&= \left[\frac{1}{1+e^{-z}} - \left(\frac{1}{1+e^{-z}}\right)^2\right]x_j^{(i)}\\
&= \left(g(z) - g^2(z)\right)x_j^{(i)}\\
&= g(\theta^T x^{(i)})\left(1-g(\theta^T x^{(i)})\right)x_j^{(i)}
\end{aligned}$$
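The key identity in this derivation, $g'(z) = g(z)(1 - g(z))$, is easy to verify numerically against a central finite difference. A quick sanity-check sketch:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Compare the analytic derivative g(z)(1 - g(z)) with a finite difference
eps = 1e-6
for z in (-2.0, 0.0, 1.5):
    numeric = (sigmoid(z + eps) - sigmoid(z - eps)) / (2 * eps)
    analytic = sigmoid(z) * (1.0 - sigmoid(z))
    assert abs(numeric - analytic) < 1e-8
print("sigmoid derivative identity verified")
```

At $z = 0$ the derivative equals $0.5 \times 0.5 = 0.25$, its maximum value.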

Multiclass Classification


Pick one class and lump all the remaining classes together as a second class; this yields one binary classifier. Repeating this for every class gives one classifier per class (one-vs-all):

$$y \in \{0, 1, \ldots, n\}$$
$$h_\theta^{(0)}(x) = P(y = 0 \mid x; \theta)$$
$$h_\theta^{(1)}(x) = P(y = 1 \mid x; \theta)$$
$$\cdots$$
$$h_\theta^{(n)}(x) = P(y = n \mid x; \theta)$$
$$\text{prediction} = \max_i\left(h_\theta^{(i)}(x)\right)$$
The prediction is the class whose classifier outputs the largest value.
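The one-vs-all prediction rule can be sketched as follows. The per-class parameter vectors here are hypothetical placeholders; in practice each would come from training one binary logistic regression:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def predict_one_vs_all(thetas, x):
    """thetas[i] parameterizes classifier i; pick the class whose
    h_theta^(i)(x) = P(y = i | x; theta) is largest."""
    scores = [sigmoid(np.dot(theta, x)) for theta in thetas]
    return int(np.argmax(scores))

# Three made-up per-class parameter vectors (x includes the intercept x0 = 1)
thetas = [np.array([1.0, -2.0]),   # class 0
          np.array([0.0,  0.5]),   # class 1
          np.array([-1.0, 2.0])]   # class 2
print(predict_one_vs_all(thetas, np.array([1.0, 3.0])))  # 2: class 2 scores highest
```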
