1. Cost function (regularized logistic regression)
% temp is theta with its first element zeroed, so the bias term is not regularized:
% temp = theta; temp(1) = 0;
J = -1 * sum( y .* log( sigmoid(X*theta) ) + (1 - y) .* log( 1 - sigmoid(X*theta) ) ) / m + lambda/(2*m) * temp' * temp;
grad = ( X' * (sigmoid(X*theta) - y) ) / m + lambda/m * temp;
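The two Octave lines above can be sketched in NumPy as follows; the data and the names `cost_grad` and `lam` are illustrative, not from the original:

```python
# Hypothetical NumPy translation of the regularized cost/gradient above.
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def cost_grad(theta, X, y, lam):
    m = len(y)
    h = sigmoid(X @ theta)
    temp = theta.copy()
    temp[0] = 0.0                      # do not regularize the bias term
    J = -np.sum(y * np.log(h) + (1 - y) * np.log(1 - h)) / m \
        + lam / (2 * m) * temp @ temp
    grad = X.T @ (h - y) / m + lam / m * temp
    return J, grad
```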
2. Back-propagation: computing the partial derivatives with respect to theta(l)(ij)
% initialize the gradient accumulators
set Δ(l)(ij) = 0 for all l, i, j
For i = 1 to m          % loop over the m training examples
    set a(1) = x(i)
    % forward propagation to compute the activations a(l)
    perform forward propagation to compute a(l) for l = 2, 3, ..., L
    % error at the output layer
    using y(i), compute delta(L) = a(L) - y(i)
    % errors at the hidden layers
    compute delta(L-1), delta(L-2), ..., delta(2)
    % accumulate the errors into Δ
    Δ(l)(ij) := Δ(l)(ij) + a(l)(j) * delta(l+1)(i)
% partial-derivative matrices
D(l)(ij) = 1/m * Δ(l)(ij) + lambda/m * θ(l)(ij) ....(j ≠ 0)
D(l)(ij) = 1/m * Δ(l)(ij) ..........................(j = 0)
3. Pre-processing theta: unrolling the weight matrices into one vector
thetaVec = [Theta1(:); Theta2(:); Theta3(:)];  % pass the unrolled vector into function(thetaVec)
Theta1 = reshape(thetaVec(1:110), 10, 11);     % inside the function, rebuild the matrices for back-propagation and the theta partial derivatives
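A NumPy sketch of the unroll/reshape round trip above; the matrix sizes are made up to match the `1:110` slice, and `order='F'` mirrors Octave's column-major `(:)` and `reshape`:

```python
# Unroll two weight matrices into one vector, then recover them.
import numpy as np

Theta1 = np.arange(110.0).reshape(10, 11)   # made-up 10x11 weight matrix
Theta2 = np.arange(11.0).reshape(1, 11)     # made-up 1x11 weight matrix

# thetaVec = [Theta1(:); Theta2(:)]
thetaVec = np.concatenate([Theta1.ravel(order='F'), Theta2.ravel(order='F')])

# Theta1 = reshape(thetaVec(1:110), 10, 11)
Theta1_back = thetaVec[:110].reshape(10, 11, order='F')
Theta2_back = thetaVec[110:].reshape(1, 11, order='F')
```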
4. Gradient checking: numerically verify the gradient of J(θ), so errors in back-propagation can be caught
octave: gradApprox = (J(theta + EPSILON) - J(theta - EPSILON)) / (2*EPSILON)
for i = 1:n
    thetaPlus = theta;
    thetaPlus(i) = thetaPlus(i) + EPSILON;
    thetaMinus = theta;
    thetaMinus(i) = thetaMinus(i) - EPSILON;
    gradApprox(i) = (J(thetaPlus) - J(thetaMinus)) / (2*EPSILON);
end;
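The checking loop above can be sketched in NumPy like this; the toy cost `J` and the helper name `grad_approx` are illustrative assumptions:

```python
# Two-sided numerical gradient, element by element, as in the loop above.
import numpy as np

EPSILON = 1e-4

def grad_approx(J, theta):
    n = len(theta)
    approx = np.zeros(n)
    for i in range(n):
        thetaPlus = theta.copy()
        thetaPlus[i] += EPSILON
        thetaMinus = theta.copy()
        thetaMinus[i] -= EPSILON
        approx[i] = (J(thetaPlus) - J(thetaMinus)) / (2 * EPSILON)
    return approx

J = lambda t: np.sum(t ** 2)        # toy cost; analytic gradient is 2*theta
theta = np.array([1.0, -2.0, 3.0])
```

Each `gradApprox(i)` should then be close to the corresponding back-prop derivative.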
Check that gradApprox ≈ DVec (the gradient from back-prop); if they agree, the cost function and DVec implementation are correct.