Machine Learning - Coursera - Exercise 3 - Neural Networks

Licensed under CC 4.0 BY-SA

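This article walks through recognizing handwritten digits with logistic regression and neural networks. It starts with multi-class classification using logistic regression, covering data loading, visualization, and the vectorized cost function and gradient. It then shows how to train several logistic regression classifiers with the one-vs-all strategy. Finally, it looks at the structure of a neural network, including feedforward propagation and prediction, with a focus on how the prediction step produces the final class.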

Use logistic regression and neural networks to recognize handwritten digits (0-9).

I. Multi-class Classifier (Multi-class Logistic Regression)

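In the first part, you only need to extend the logistic regression functions you wrote earlier and apply them to one-vs-all classification in order to recognize the digits 0-9.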

(1) Loading the data

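Unlike the earlier exercises, the data is stored in .mat format rather than .txt. A .mat file holds the data in MATLAB's native matrix format instead of as ASCII text, so it can be read directly with load. After loading, the matrices are in memory with their dimensions and values intact, and they are already named, so you do not need to name them yourself.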

%load saved matrices from file
load('ex3data1.mat');
%The matrices X and y will now be in your matlab memory

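The .mat file contains 5000 examples, each a 20×20-pixel grayscale image of a digit. Each pixel is a floating-point number giving the grayscale intensity at that position. The 20×20 pixel grid is unrolled into a 400-dimensional vector, and each row of X is one example, so X is a 5000×400 matrix in which every row is one handwritten digit image. y is a 5000-dimensional vector of labels. Because MATLAB indexing starts at 1, the digit 0 is labeled as 10, while the digits 1-9 keep their own values as labels.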
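As a small sketch of the layout described above, the following uses a synthetic stand-in matrix instead of the real ex3data1.mat. It shows how one 400-element row maps back to a 20×20 image and how the label 10 maps to the digit 0. The names X_demo, y_demo, img, and digit_values are hypothetical and are not part of the exercise files.

```matlab
% Synthetic stand-in for the real data (assumption: same layout as X and y).
X_demo = rand(5, 400);          % 5 fake examples, 400 pixels each
y_demo = [10; 1; 2; 9; 10];     % labels in 1..10, where 10 means digit 0

% One row of X is a flattened image; reshape restores the 20x20 grid.
% reshape fills column by column, the same order displayData relies on.
img = reshape(X_demo(1, :), 20, 20);

% Convert the 1..10 labels to the digits they stand for.
digit_values = mod(y_demo, 10); % 10 -> 0, every other label is unchanged
```

With the real data, the same two lines would be applied to X and y after load('ex3data1.mat').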

(2) Visualizing the data

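The script randomly picks 100 rows from X and passes them to the displayData.m function, which renders each row as a 20×20-pixel grayscale image. We do not need to add any code to displayData.m, but it is still worth reading through.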
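The random selection can be sketched roughly as follows. This is an illustration under assumptions, not necessarily the exact code in the exercise script: X_demo is a synthetic stand-in so the snippet runs without ex3data1.mat, and rand_indices and sel are illustrative names.

```matlab
% Synthetic stand-in so the sketch runs without ex3data1.mat (assumption).
X_demo = rand(5000, 400);
m = size(X_demo, 1);

% Shuffle the row indices and keep the first 100 of them.
rand_indices = randperm(m);
sel = X_demo(rand_indices(1:100), :);

% Show the grid only if displayData.m is available on the path.
if exist('displayData', 'file')
    displayData(sel);
end
```

Using randperm rather than repeated random draws guarantees that the 100 selected rows are all distinct.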

function [h, display_array] = displayData(X, example_width)
%DISPLAYDATA Display 2D data in a nice grid
%   [h, display_array] = DISPLAYDATA(X, example_width) displays 2D data
%   stored in X in a nice grid. It returns the figure handle h and the 
%   displayed array if requested.

% Set example_width automatically if not passed in
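if ~exist('example_width', 'var') || isempty(example_width) % exist('name','var') returns 1 if name is a variable in the workspace and 0 if it does not exist; isempty then checks whether example_width is empty
	example_width = round(sqrt(size(X, 2))); % round to the nearest integer (for complex input, the real and imaginary parts are rounded separately); here sqrt(400) = 20, so the width is the rounded square root of the column count
end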
end

% Gray Image
colormap(gray); % use the grayscale colormap

% Compute rows, cols
[m n] = size(X);
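example_height = (n / example_width); % height is n divided by the width

% Compute number of items to display (how many tiles the grid holds)
display_rows = floor(sqrt(m)); % round toward negative infinity (round down)
display_cols = ceil(m / display_rows); % round toward positive infinity (round up)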

% Between images padding
pad = 1;

% Setup blank display
display_array = - ones(pad + display_rows * (example_height + pad), ...
                       pad + display_cols * (example_width + pad));

% Copy each example into a patch on the display array
curr_ex = 1;
for j = 1:display_rows
	for i = 1:display_cols
		if curr_ex > m, 
			break; 
		end
		% Copy the patch
		
		% Get the max value of the patch
		max_val = max(abs(X(curr_ex, :))); % largest absolute pixel value in this example
		display_array(pad + (j - 1) * (example_height + pad) + (1:example_height), ...
		              pad + (i - 1) * (example_width + pad) + (1:example_width)) = ...
						reshape(X(curr_ex, :), example_height, example_width) / max_val; % reshape(A,m,n) rearranges A into an m-by-n matrix in column-major order
		curr_ex = curr_ex + 1;
	end
	if curr_ex > m, 
		break; 
	end
end

% Display Image
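h = imagesc(display_array, [-1 1]); % type help imagesc in MATLAB for details: each element of the matrix is drawn as one patch, and its value selects that patch's color from the colormap
% The color limits [-1 1] map -1 to the darkest color and 1 to the brightest, so the -1 padding shows up as black borders; the imagesc doc explains this with examples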
% Do not show axis
axis image off

drawnow;

end
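To make the grid arithmetic in displayData concrete, here is a worked example for the case used in this exercise: 100 examples of 400 pixels each. It only reproduces the size and index calculations from the function above and draws nothing. The names grid_height, grid_width, and tile_rows are illustrative and do not appear in displayData.m.

```matlab
m = 100; n = 400;                              % 100 examples, 400 pixels each
example_width  = round(sqrt(n));               % 20
example_height = n / example_width;            % 20
display_rows   = floor(sqrt(m));               % 10
display_cols   = ceil(m / display_rows);       % 10
pad = 1;

% Each tile takes (side + pad) pixels, plus one leading pad border.
grid_height = pad + display_rows * (example_height + pad);   % 211
grid_width  = pad + display_cols * (example_width  + pad);   % 211

% Rows covered by the tiles in grid row j = 2 (same indexing as displayData).
j = 2;
tile_rows = pad + (j - 1) * (example_height + pad) + (1:example_height);  % 23..42
```

So the 100 images form a 10×10 grid inside a 211×211 display array, with a one-pixel border of -1 values between and around the tiles.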