Coursera-Machine Learning-ex4

This post walks through the training process of a neural network in detail, covering the steps of choosing a network architecture, initializing the weights, forward propagation, computing the cost function, backpropagation, gradient checking, and applying an optimization algorithm. Concrete function implementations, such as forward propagation, cost computation, and gradient updates, show the complete training workflow.

Some Points:

Implementation Steps:

1. Pick a neural network architecture (a concrete example follows this list).

  • number of input units = dimension of features
  • number of output units = number of classes
  • number of hidden layers = 1 (default); if you use more than 1 hidden layer, it is recommended that every hidden layer have the same number of units
  • number of hidden units per layer = .. (usually, the more the better)
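
For example, the ex4 network uses the following sizes (consistent with the 5000*401, 5000*25 and 5000*10 dimension annotations in nnCostFunction.m below):

input_layer_size  = 400;   % 20x20 pixel input images of digits
hidden_layer_size = 25;    % one hidden layer with 25 units
num_labels        = 10;    % 10 output classes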

2. Randomly initialize the weights so that they are not completely identical; formally, this "breaks symmetry".
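
A minimal sketch of such an initialization, following the strategy suggested in the course (each weight is drawn uniformly from [-epsilon_init, epsilon_init]):

function W = randInitializeWeights(L_in, L_out)
% Random weights for a layer with L_in incoming connections and
% L_out outgoing connections; the +1 accounts for the bias column.
% Identical starting weights would make every hidden unit compute
% the same function forever.
epsilon_init = 0.12;
W = rand(L_out, 1 + L_in) * 2 * epsilon_init - epsilon_init;
end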

3. Implement a for-loop for t = 1:m and place steps 3.1 - 3.3 below inside it, with the t-th iteration performing the calculation on the t-th training example (x(t), y(t)).

  • 3.1 Implement forward propagation to get the vectors a (activations) and z (weighted inputs) for each layer.

  • 3.2 Compute the cost for the t-th example (see the formula after this list).

  • 3.3 Implement backpropagation to get the error vector \delta (per unit) and to accumulate the gradient matrix \Delta (per weight).
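
For reference, the cross-entropy cost that step 3.2 accumulates, summed over all m examples and K = num_labels output units, is

J(\Theta) = -\frac{1}{m} \sum_{t=1}^{m} \sum_{k=1}^{K} \left[ y_k^{(t)} \log\big(h_\Theta(x^{(t)})\big)_k + (1 - y_k^{(t)}) \log\big(1 - h_\Theta(x^{(t)})\big)_k \right]

This is exactly what the per-example loop in nnCostFunction.m below computes.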

4. Divide the accumulated cost and the matrices \Delta by m, then add the regularization portion. Note that the regularization sums start at index 1, i.e. they ignore the bias units (the first column of each \Theta).
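
Concretely, for the ex4 architecture the regularization portion added to J is

\frac{\lambda}{2m} \left[ \sum_{j=1}^{25} \sum_{k=1}^{400} \big(\Theta^{(1)}_{j,k}\big)^2 + \sum_{j=1}^{10} \sum_{k=1}^{25} \big(\Theta^{(2)}_{j,k}\big)^2 \right]

and the regularized gradient is \frac{1}{m}\Delta^{(l)}_{ij} + \frac{\lambda}{m}\Theta^{(l)}_{ij} for j \ge 1, while the bias column (j = 0) stays unregularized: \frac{1}{m}\Delta^{(l)}_{ij}.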

5. Use gradient checking to confirm that your backprop implementation works correctly.
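
A minimal two-sided (central) difference checker, along the lines of the course's computeNumericalGradient.m:

function numgrad = computeNumericalGradient(J, theta)
% Numerically approximate the gradient of J at theta; each entry
% should agree with the backprop gradient to within roughly 1e-9.
numgrad = zeros(size(theta));
perturb = zeros(size(theta));
e = 1e-4;                              % perturbation size
for p = 1:numel(theta)
    perturb(p) = e;
    loss1 = J(theta - perturb);
    loss2 = J(theta + perturb);
    numgrad(p) = (loss2 - loss1) / (2 * e);
    perturb(p) = 0;
end
end

Run the check on a small network (e.g. a few units per layer), not the full one, since it evaluates the cost function twice per parameter; turn it off before actual training.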

6. Use gradient descent or a more advanced optimization algorithm, e.g. fmincg, to find the minimum of the cost function.
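
A sketch of the training call, assuming the course-provided fmincg.m and an unrolled initial parameter vector initial_nn_params built from the random initialization in step 2:

options = optimset('MaxIter', 50);
lambda = 1;
% fmincg only ever sees the unrolled parameter vector p.
costFunction = @(p) nnCostFunction(p, input_layer_size, hidden_layer_size, ...
                                   num_labels, X, y, lambda);
[nn_params, cost] = fmincg(costFunction, initial_nn_params, options);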

7. In my view, decreasing the value of \lambda weakens the effect of regularization, which improves accuracy on the training set but harms prediction on new data (overfitting).
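
A quick way to observe this, assuming the course-provided predict.m and a hypothetical trainNN wrapper around the fmincg call from step 6:

for lambda = [0, 0.5, 1, 2, 4]
    [Theta1, Theta2] = trainNN(X, y, lambda);   % hypothetical wrapper, see step 6
    pred = predict(Theta1, Theta2, X);          % course-provided predict.m
    fprintf('lambda = %.1f: train accuracy = %.2f%%\n', ...
            lambda, mean(double(pred == y)) * 100);
end

Smaller \lambda values should show higher training accuracy; judging generalization would require a held-out validation set.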

The Results:

  • nnCostFunction.m
function [J grad] = nnCostFunction(nn_params, ...
                                   input_layer_size, ...
                                   hidden_layer_size, ...
                                   num_labels, ...
                                   X, y, lambda)
Theta1 = reshape(nn_params(1:hidden_layer_size * (input_layer_size + 1)), ...
                 hidden_layer_size, (input_layer_size + 1));

Theta2 = reshape(nn_params((1 + (hidden_layer_size * (input_layer_size + 1))):end), ...
                 num_labels, (hidden_layer_size + 1));

% Setup some useful variables
m = size(X, 1);
         
% You need to return the following variables correctly 
J = 0;
Theta1_grad = zeros(size(Theta1));
Theta2_grad = zeros(size(Theta2));

X = [ones(m, 1), X]; 					% 5000*401, add bias column (a1)
a2 = sigmoid(X * Theta1');				% 5000*25, hidden-layer activations
a2 = [ones(size(a2,1),1), a2];			% 5000*26, add bias column
h = sigmoid(a2 * Theta2'); 				% 5000*10, output activations (a3)

% Alternative (equivalent) cost computation: loop over classes instead of examples.
%for i=1:num_labels
%	pos = log(h(:, i));					% 5000*1
%	neg = log(1-h(:, i));				% 5000*1
%	y_temp = y==i;
%	J = J + (-1 / m) * (y_temp' * pos + (1 - y_temp)' * neg);
%end

% Accumulate the cross-entropy cost one example at a time.
for i=1:m
	pos = log(h(i, :));					%1*10
	neg = log(1-(h(i, :)));				%1*10
	y_temp = zeros(num_labels, 1);		%10*1
	y_temp(y(i,1), 1) = 1;				%10*1
	J = J + (pos * y_temp + neg * (1 - y_temp));
end
J =  (-1 / m) * J;
% -------------------------------------------------------------
% Expand the label vector y into a 5000*10 one-hot matrix.
y_temp = [];
for i=1:num_labels
	y_temp = [y_temp, y==i];										% 5000*10
end
for i=1:m
	delta3 = (h(i, :) - y_temp(i, :))';								% 10*1, output-layer error
	delta2 = Theta2' * delta3 .* a2(i, :)' .* (1- a2(i, :))'; 		% 26*1, a2.*(1-a2) = sigmoidGradient(z2)
	Theta2_grad = Theta2_grad + delta3 * a2(i, :);					% 10*26
	Theta1_grad = Theta1_grad + delta2(2:end, 1) * X(i, :);			% 25*401, drop the bias-unit error
end
Theta2_grad = Theta2_grad / m;										% 10*26
Theta1_grad = Theta1_grad / m;										% 25*401
% =========================================================================
% Regularization: skip the first (bias) column of each Theta.
sum1 = sum(sum(Theta1(:,2:end) .^ 2));
sum2 = sum(sum(Theta2(:,2:end) .^ 2));
J = J + (lambda / (2 * m)) * (sum1 + sum2);
Theta2_grad = Theta2_grad + lambda / m * [zeros(size(Theta2,1),1), Theta2(:, 2:end)];		% 10*26
Theta1_grad = Theta1_grad + lambda / m * [zeros(size(Theta1,1),1), Theta1(:, 2:end)];		% 25*401

% Unroll gradients
grad = [Theta1_grad(:) ; Theta2_grad(:)];
end
  • sigmoidGradient.m
function g = sigmoidGradient(z)

g = zeros(size(z));

g = sigmoid(z) .* (1 - sigmoid(z));

end
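
A quick sanity check: the sigmoid gradient is largest (0.25) at z = 0 and works element-wise on vectors and matrices:

sigmoidGradient(0)           % ans = 0.2500
sigmoidGradient([-1 0 1])    % ans = 0.1966  0.2500  0.1966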

 
