When computing the loss, only the true (target) class of each example contributes to the loss; during backpropagation, however, the gradient also has to account for the incorrect classes.
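In the UFLDL tutorial's notation, with K classes and the parameters of the K-th class fixed at zero, the objective and its gradient are

J(\theta) = -\sum_{i=1}^{m} \log \frac{\exp(\theta_{y^{(i)}}^{\top} x^{(i)})}{\sum_{k=1}^{K} \exp(\theta_{k}^{\top} x^{(i)})}

\nabla_{\theta_k} J(\theta) = -\sum_{i=1}^{m} x^{(i)} \left( 1\{y^{(i)} = k\} - P(y^{(i)} = k \mid x^{(i)}; \theta) \right)

Only the true class y^{(i)} appears in the numerator of the loss, but the gradient of every class k contains the term 1{y^{(i)} = k} - P(y^{(i)} = k | x^{(i)}), which is nonzero for the incorrect classes as well.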
Reached Maximum Number of Iterations
Optimization took 69.236617 seconds.
Training accuracy: 94.4%
Test accuracy: 92.1%
function [f,g] = softmax_regression(theta, X,y)
%
% Arguments:
% theta - A vector containing the parameter values to optimize.
%           minFunc passes theta in as a long vector, so we reshape it
%           into an n-by-(num_classes-1) matrix.
% Recall that we assume theta(:,num_classes) = 0.
%
% X - The examples stored in a matrix.
% X(i,j) is the i'th coordinate of the j'th example.
% y - The label for each example. y(j) is the j'th example's label.
%
m=size(X,2); % number of training examples
n=size(X,1); % number of features (input dimension)
% theta arrives as a long vector; reshape it into an n x (num_classes-1) matrix
% (the column for the last class is fixed at zero).
theta=reshape(theta, n, []);
num_classes=size(theta,2)+1;
% initialize objective value and gradient.
f = 0;
g = zeros(size(theta));
%
% TODO: Compute the softmax objective function and gradient using vectorized code.
% Store the objective function value in 'f', and the gradient in 'g'.
% Before returning g, make sure you form it back into a vector with g=g(:);
%
%%% YOUR CODE HERE %%%
theta = [theta, zeros(n,1)];         % append the fixed all-zero column for the last class
scores = theta' * X;                 % num_classes x m matrix of class scores
scores = bsxfun(@minus, scores, max(scores,[],1)); % shift each column for numerical stability
predict = exp(scores);
predict = bsxfun(@rdivide, predict, sum(predict,1)); % column-wise softmax probabilities
I = sub2ind(size(predict), y, 1:m);  % linear indices of P(y^(i) | x^(i))
f = f - sum(log(predict(I)));        % negative log-likelihood of the true classes
groundTruth = full(sparse(y, 1:m, 1, num_classes, m)); % indicator matrix, 1{y^(i) = k}
delta = groundTruth - predict;
g = -X * delta';                     % n x num_classes gradient
g = g(:, 1:num_classes-1);           % drop the column of the fixed last class
g=g(:); % make gradient a vector for minFunc
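For completeness, here is a minimal sketch of how this objective could be driven from minFunc and then evaluated. The variable names train.X, train.y, test.X, test.y, the options struct, and num_classes = 10 are placeholders assumed for illustration, not taken from the code above:

num_classes = 10;                                 % e.g. the ten MNIST digit classes
n = size(train.X, 1);                             % input dimension
theta0 = 0.005 * randn(n * (num_classes-1), 1);   % small random initialization
options = struct('MaxIter', 200);
theta_opt = minFunc(@softmax_regression, theta0, options, train.X, train.y);

% Predict by scoring every class; the last class's parameters are implicitly zero.
theta_mat = [reshape(theta_opt, n, num_classes-1), zeros(n, 1)];
[~, pred] = max(theta_mat' * test.X, [], 1);
fprintf('Test accuracy: %.1f%%\n', 100 * mean(pred(:) == test.y(:)));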