function Y = vl_nnsoftmax(X,dzdY)
%VL_NNSOFTMAX CNN softmax.
%   Y = VL_NNSOFTMAX(X) applies the softmax operator the data X. X
%   has dimension H x W x D x N, packing N arrays of W x H
%   D-dimensional vectors.
%
%   D can be thought of as the number of possible classes and the
%   function computes the softmax along the D dimension. Often W=H=1,
%   but this is not a requirement, as the operator is applied
%   convolutionally at all spatial locations.
%
%   DZDX = VL_NNSOFTMAX(X, DZDY) computes the derivative of the block
%   projected onto DZDY. DZDX and DZDY have the same dimensions as
%   X and Y respectively.

% Copyright (C) 2014 Andrea Vedaldi.
% All rights reserved.
%
% This file is part of the VLFeat library and is made available under
% the terms of the BSD license (see the COPYING file).

E = exp(bsxfun(@minus, X, max(X,[],3))) ;
L = sum(E,3) ;
Y = bsxfun(@rdivide, E, L) ;

if nargin <= 1, return ; end

% backward
Y = Y .* bsxfun(@minus, dzdY, sum(dzdY .* Y, 3)) ;
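The main job of the vl_nnsoftmax function is to convert the output scores of the last fully connected layer into probabilities in the range (0,1), so that the values along the class dimension form a probability distribution.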
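In the forward pass, it computes, at each spatial location (i, j) and for each channel k = 1, ..., D:

$$y_{ijk} = \frac{e^{x_{ijk}}}{\sum_{d=1}^{D} e^{x_{ijd}}}$$

The corresponding source code is shown next. It subtracts the per-location maximum over the D dimension before exponentiating, which leaves the result unchanged but avoids overflow in exp: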
E = exp(bsxfun(@minus, X, max(X,[],3))) ;
L = sum(E,3) ;
Y = bsxfun(@rdivide, E, L) ;

if nargin <= 1, return ; end
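Subtracting the maximum does not change the output, because the common factor exp(-max) cancels between numerator and denominator. A minimal sketch of why it matters is given here; it copies the three forward lines inline so that it runs without MatConvNet, and the input sizes and the names Enaive, Ynaive and Ystable are illustrative assumptions, not part of the library:

```matlab
% Illustrative input: H=2, W=2, D=3 classes, N=1 image, with very large scores.
X = 1000 + randn(2, 2, 3, 1) ;

% Naive softmax: exp(1000) overflows to Inf in double precision,
% so the division Inf/Inf produces NaN.
Enaive = exp(X) ;
Ynaive = bsxfun(@rdivide, Enaive, sum(Enaive, 3)) ;

% Stable softmax: the same three steps as the forward code above.
E = exp(bsxfun(@minus, X, max(X, [], 3))) ;
L = sum(E, 3) ;
Ystable = bsxfun(@rdivide, E, L) ;

% The naive result is unusable; the stable one is a valid distribution
% along the third dimension at every spatial location.
S = sum(Ystable, 3) ;
assert(all(isnan(Ynaive(:)))) ;
assert(all(abs(S(:) - 1) < 1e-12)) ;
assert(all(Ystable(:) > 0 & Ystable(:) < 1)) ;
```

The final assertions only check the two properties claimed in the text: every value lies in (0,1), and the values sum to 1 along the D dimension.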
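In the backward pass, the function computes the derivative of softmax. Writing x_k and y_k for the input and output in channel k at a single spatial location, the partial derivatives are

$$\frac{\partial y_k}{\partial x_d} = y_k\,(\delta_{kd} - y_d)$$

where \delta_{kd} is 1 if k = d and 0 otherwise. Projecting onto the derivative dzdY received from the next layer, the chain rule gives

$$\frac{\partial z}{\partial x_d} = \sum_{k} \frac{\partial z}{\partial y_k}\, y_k\,(\delta_{kd} - y_d) = y_d \left( \frac{\partial z}{\partial y_d} - \sum_{k} \frac{\partial z}{\partial y_k}\, y_k \right)$$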
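The corresponding source code is:
% backward
Y = Y .* bsxfun(@minus, dzdY, sum(dzdY .* Y, 3)) ;

Here the variable Y is reused to hold the returned derivative DZDX, and dzdY is the derivative passed back from the next layer. Note that if dzdY were all ones, sum(dzdY .* Y, 3) would equal 1 at every location, because the softmax outputs sum to 1, so the result would be zero rather than Y.*(1-Y). The expression Y.*(1-Y) is the diagonal term of the softmax derivative: it is what channel k receives when dzdY is 1 in channel k and 0 in all other channels.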
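The backward expression can be checked numerically. The sketch below compares it against central finite differences of z = sum(dzdY .* Y) and then tries two special choices of dzdY. It reimplements the forward pass inline rather than calling MatConvNet; the helper name softmax3, the sizes, the step h and the tolerances are assumptions chosen for illustration:

```matlab
% Illustrative sizes: H=1, W=1, D=4 classes, N=2 samples.
X = randn(1, 1, 4, 2) ;
dzdY = randn(size(X)) ;            % arbitrary upstream derivative

% Forward pass as an anonymous function (inline copy of the source lines).
shift = @(A) exp(bsxfun(@minus, A, max(A, [], 3))) ;
softmax3 = @(A) bsxfun(@rdivide, shift(A), sum(shift(A), 3)) ;
Y = softmax3(X) ;

% Analytic backward pass, same expression as the source.
dzdX = Y .* bsxfun(@minus, dzdY, sum(dzdY .* Y, 3)) ;

% Central finite differences of z = sum(dzdY .* Y) for each entry of X.
h = 1e-6 ;
dzdXnum = zeros(size(X)) ;
for i = 1:numel(X)
  Xp = X ; Xp(i) = Xp(i) + h ;
  Xm = X ; Xm(i) = Xm(i) - h ;
  Tp = dzdY .* softmax3(Xp) ;
  Tm = dzdY .* softmax3(Xm) ;
  dzdXnum(i) = (sum(Tp(:)) - sum(Tm(:))) / (2 * h) ;
end
err = max(abs(dzdX(:) - dzdXnum(:))) ;
assert(err < 1e-6) ;

% All-ones dzdY: the bracket is 1 - sum(Y,3) = 0, so the gradient vanishes.
G1 = Y .* bsxfun(@minus, ones(size(X)), sum(Y, 3)) ;
assert(max(abs(G1(:))) < 1e-12) ;

% One-hot dzdY at channel k: channel k receives Y_k .* (1 - Y_k).
k = 2 ;
onehot = zeros(size(X)) ; onehot(:, :, k, :) = 1 ;
Gk = Y .* bsxfun(@minus, onehot, sum(onehot .* Y, 3)) ;
Yk = Y(:, :, k, :) ;
D = Gk(:, :, k, :) - Yk .* (1 - Yk) ;
assert(max(abs(D(:))) < 1e-12) ;
```

The all-ones case reflects the fact that adding the same constant to every score leaves the softmax output unchanged, so a uniform upstream derivative produces no gradient.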