Softmax
Question:
softmax:
Softmax is the normalized exponential over a finite set of scores, turning them into a discrete probability distribution; it generalizes logistic regression (LR) to the multi-class setting:
$$P(y=i\mid x;\theta)=\frac{e^{\theta_i^T x}}{\sum_{j=1}^{K} e^{\theta_j^T x}}$$
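As a quick numeric sketch of this formula (the parameter matrix `theta` and input `x` below are made-up values, one row $\theta_j$ per class):

```python
import numpy as np

# Hypothetical example: K = 3 classes, d = 2 features.
theta = np.array([[1.0, 0.0],
                  [0.0, 1.0],
                  [0.5, 0.5]])   # row j is theta_j
x = np.array([2.0, 1.0])

logits = theta @ x               # theta_j^T x for each class j
probs = np.exp(logits) / np.sum(np.exp(logits))
print(probs)                     # K probabilities that sum to 1
```

Exponentiating makes every score positive, and dividing by the sum normalizes them into a valid distribution.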
Loss function:
$$l(\theta)=-\frac{1}{m}\left[\sum_{i=1}^{m}\sum_{j=1}^{K} 1\{y^{(i)}=j\}\log\frac{e^{\theta_j^T x^{(i)}}}{\sum_{l=1}^{K} e^{\theta_l^T x^{(i)}}}\right]$$
Here $m$ is the number of samples and $K$ the number of classes.
The indicator $1\{y^{(i)}=j\}$ is 1 when sample $i$'s label $y^{(i)}$ equals class $j$, and 0 otherwise.
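Because of the indicator, the double sum simply picks out the log-probability of each sample's true class. A minimal sketch with made-up predicted probabilities (`probs` and labels `y` are hypothetical):

```python
import numpy as np

# Hypothetical mini-batch: m = 2 samples, K = 3 classes.
# probs[i, j] plays the role of P(y = j | x^(i)); y holds the true labels.
probs = np.array([[0.7, 0.2, 0.1],
                  [0.1, 0.6, 0.3]])
y = np.array([0, 1])
m = probs.shape[0]

# 1{y^(i)=j} selects exactly one term per sample, so the double sum
# reduces to indexing each row at its true class.
loss = -np.mean(np.log(probs[np.arange(m), y]))
print(loss)
```

This reduction (fancy-indexing the true-class probabilities and averaging their negative logs) is how the loss is usually computed in practice.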
Proof (shift invariance):
$$softmax(x+c)_i=\frac{\exp(x_i+c)}{\sum_{j=1}^{\dim(x)}\exp(x_j+c)}=\frac{\exp(x_i)\exp(c)}{\exp(c)\sum_{j=1}^{\dim(x)}\exp(x_j)}=\frac{\exp(x_i)}{\sum_{j=1}^{\dim(x)}\exp(x_j)}=softmax(x)_i$$
This identity shows that shifting the input vector by a constant (i.e. $+c$) does not change the softmax output. In practice this is exploited for numerical stability: subtracting $\max(x)$ before exponentiating keeps the exponents non-positive and avoids overflow.
Code:
import numpy as np

def softmax(x):
    assert len(x.shape) > 1  # expect a 2-D input: one row per sample
    x = x - np.max(x, axis=1, keepdims=True)  # shift by the row max for numerical stability
    return np.exp(x) / np.sum(np.exp(x), axis=1, keepdims=True)

if __name__ == '__main__':
    matrix = np.arange(0, 30, 2).reshape(3, 5)
    print(softmax(matrix))
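The shift-invariance property proved above can also be checked numerically (the test vector `x` and offset `c` below are arbitrary choices):

```python
import numpy as np

def softmax(x):
    # Same row-wise softmax with the max-subtraction trick as above.
    x = x - np.max(x, axis=1, keepdims=True)
    return np.exp(x) / np.sum(np.exp(x), axis=1, keepdims=True)

x = np.array([[1.0, 2.0, 3.0]])
c = 100.0
# Shifting every entry by c leaves the output unchanged (up to float rounding).
print(np.allclose(softmax(x), softmax(x + c)))
```

Without the max subtraction, `np.exp(x + c)` would overflow for large `c`, which is exactly why the implementation above shifts by the row maximum first.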