P = [1 1 0; 0 1 1];                 % input vectors (2 elements, Q = 3 samples)
T = [0 1 0];                        % target vector
w = [0 0];                          % initial weights
b = 0;                              % initial bias
[S,Q] = size(T);
A = purelin(w*P + b);
e = T - A;
LP.lr = maxlinlr(P)                 % max stable learning rate (maxlinlr(P,'bias') is the safer bound when a bias is trained)
sse = sumsqr(e);                    % sum of squared errors (note: this variable shadows the sse function)
while sse > 0.0000001
    dW = learnwh([],P,[],[],[],[],e,[],[],[],LP,[]);
    dB = learnwh(b,ones(1,Q),[],[],[],[],e,[],[],[],LP,[]);
    w = w + dW;
    b = b + dB;
    A = purelin(w*P + b)            % left unsuppressed to watch convergence
    e = T - A
    sse = sumsqr(e)
end
The above applies the Widrow-Hoff learning rule (least mean squares, LMS) by hand. Alternatively, you can build the network with newlin and then run the same training with train.
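A minimal sketch of that route, assuming the same P and T as above (the goal and epoch cap are arbitrary choices, and newlin's default performance function is mse rather than the sse used in the loop):
net = newlin(minmax(P),1);      % one linear neuron; input ranges taken from P
net.trainParam.goal = 1e-7;     % error goal (mse by default, not sse)
net.trainParam.epochs = 1000;   % arbitrary cap on training epochs
net = train(net,P,T);
A = sim(net,P)                  % should approach T = [0 1 0]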
newlind, by contrast, designs a zero-error linear network directly: it solves for the weights and bias in one step (zero error whenever an exact solution exists) instead of training iteratively.
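A sketch of the direct design, again assuming the P and T above (an exact solution exists for this data, so the error is zero):
net = newlind(P,T);             % solves for weights and bias in one step
A = sim(net,P)                  % equals T exactly here
net.IW{1,1}                     % designed weights, compare with the trained w
net.b{1}                        % designed bias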
Checking the trained weights and bias on two fresh input vectors (the columns of p1):
>> A=purelin(w*p1+b)
A =
1.0000 0.0002
>> p1
p1 =
1 0
1 1
The loop converges toward the exact solution w = [1 1], b = -1, so the outputs for p1's columns [1;1] and [0;1] are approximately 1 and 0, as seen above.
help sse
SSE Sum squared error performance function.
Syntax
perf = sse(E,Y,X,FP)
dPerf_dy = sse('dy',E,Y,X,perf,FP);
dPerf_dx = sse('dx',E,Y,X,perf,FP);
info = sse(code)
Description
SSE is a network performance function. It measures
performance according to the sum of squared errors.
SSE(E,Y,X,FP) takes E and optional function parameters,
E - Matrix or cell array of error vectors.
Y - Matrix or cell array of output vectors. (ignored).
X - Vector of all weight and bias values (ignored).
FP - Function parameters (ignored).
and returns the sum squared error.
SSE('dy',E,Y,X,PERF,FP) returns derivative of PERF with respect to Y.
SSE('dx',E,Y,X,PERF,FP) returns derivative of PERF with respect to X.
SSE('name') returns the name of this function.
SSE('pnames') returns the names of the function parameters.
SSE('pdefaults') returns the default function parameters.
Examples
Here a two-layer feed-forward network is created with a 1-element input
ranging from -10 to 10, four hidden TANSIG neurons, and one
PURELIN output neuron.
net = newff([-10 10],[4 1],{'tansig','purelin'});
Here the network is given a batch of inputs P. The error
is calculated by subtracting the output A from target T.
Then the sum squared error is calculated.
p = [-10 -5 0 5 10];
t = [0 0 1 1 1];
y = sim(net,p)
e = t-y
perf = sse(e)
Note that SSE can be called with only one argument because
the other arguments are ignored. SSE supports those arguments
to conform to the standard performance function argument list.
Network Use
To prepare a custom network to be trained with SSE set
NET.performFcn to 'sse'. This will automatically set
NET.performParam to the empty matrix [], as SSE has no
performance parameters.
Calling TRAIN or ADAPT will result in SSE being used to calculate
performance.
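A standalone check of sse called with one argument, on a made-up error matrix:
e = [0.1 -0.2; 0.3 0];          % hypothetical 2x2 error matrix
perf = sse(e)                   % 0.1^2 + 0.2^2 + 0.3^2 = 0.14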
help learnwh
LEARNWH Widrow-Hoff weight/bias learning function.
Syntax
[dW,LS] = learnwh(W,P,Z,N,A,T,E,gW,gA,D,LP,LS)
[db,LS] = learnwh(b,ones(1,Q),Z,N,A,T,E,gW,gA,D,LP,LS)
info = learnwh(code)
Description
LEARNWH is the Widrow-Hoff weight/bias learning function,
and is also known as the delta or least mean squared (LMS) rule.
LEARNWH(W,P,Z,N,A,T,E,gW,gA,D,LP,LS) takes several inputs,
W - SxR weight matrix (or b, an Sx1 bias vector).
P - RxQ input vectors (or ones(1,Q)).
Z - SxQ weighted input vectors.
N - SxQ net input vectors.
A - SxQ output vectors.
T - SxQ layer target vectors.
E - SxQ layer error vectors.
gW - SxR gradient with respect to performance.
gA - SxQ output gradient with respect to performance.
D - SxS neuron distances.
LP - Learning parameters: LP.lr (learning rate, see below).
LS - Learning state, initially should be = [].
and returns,
dW - SxR weight (or bias) change matrix.
LS - New learning state.
Learning occurs according to LEARNWH's learning parameter,
shown here with its default value.
LP.lr - 0.01 - Learning rate
LEARNWH(CODE) returns useful information for each CODE string:
'pnames' - Returns names of learning parameters.
'pdefaults' - Returns default learning parameters.
'needg' - Returns 1 if this function uses gW or gA.
Examples
Here we define a random input P and error E to a layer
with a 2-element input and 3 neurons. We also define the
learning rate LR learning parameter.
p = rand(2,1);
e = rand(3,1);
lp.lr = 0.5;
Since LEARNWH only needs these values to calculate a weight
change (see Algorithm below), we will use them to do so.
dW = learnwh([],p,[],[],[],[],e,[],[],[],lp,[])
Network Use
You can create a standard network that uses LEARNWH with NEWLIN.
To prepare the weights and the bias of layer i of a custom network
to learn with LEARNWH:
1) Set NET.trainFcn to 'trainb'.
NET.trainParam will automatically become TRAINB's default parameters.
2) Set NET.adaptFcn to 'trains'.
NET.adaptParam will automatically become TRAINS's default parameters.
3) Set each NET.inputWeights{i,j}.learnFcn to 'learnwh'.
Set each NET.layerWeights{i,j}.learnFcn to 'learnwh'.
Set NET.biases{i}.learnFcn to 'learnwh'.
Each weight and bias learning parameter property will automatically
be set to LEARNWH's default parameters.
To train the network (or enable it to adapt):
1) Set NET.trainParam (NET.adaptParam) properties to desired values.
2) Call TRAIN (ADAPT).
See NEWLIN for adaption and training examples.
Algorithm
LEARNWH calculates the weight change dW for a given neuron from the
neuron's input P and error E, and the weight (or bias) learning
rate LR, according to the Widrow-Hoff learning rule:
dw = lr*e*p'
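To tie the rule back to the first pass of the loop at the top (w and b start at zero, so e = T - A = [0 1 0]; the numeric comments assume the classic definition of maxlinlr as 0.9999 over the largest eigenvalue of P*P'):
P = [1 1 0; 0 1 1]; e = [0 1 0];
lp.lr = maxlinlr(P);                   % 0.9999/max(eig(P*P')) = 0.3333 here
dW = learnwh([],P,[],[],[],[],e,[],[],[],lp,[])
dW_byhand = lp.lr * e * P'             % same result: [0.3333 0.3333]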
>> help purelin
PURELIN Linear transfer function.
Syntax
A = purelin(N,FP)
dA_dN = purelin('dn',N,A,FP)
INFO = purelin(CODE)
Description
PURELIN is a neural transfer function. Transfer functions
calculate a layer's output from its net input.
PURELIN(N,FP) takes N and optional function parameters,
N - SxQ matrix of net input (column) vectors.
FP - Struct of function parameters (ignored).
and returns A, an SxQ matrix equal to N.
PURELIN('dn',N,A,FP) returns the SxQ derivative of A with respect to N.
If A or FP are not supplied or are set to [], FP reverts to
the default parameters, and A is calculated from N.
PURELIN('name') returns the name of this function.
PURELIN('output',FP) returns the [min max] output range.
PURELIN('active',FP) returns the [min max] active input range.
PURELIN('fullderiv') returns 1 or 0, whether DA_DN is SxSxQ or SxQ.
PURELIN('fpnames') returns the names of the function parameters.
PURELIN('fpdefaults') returns the default function parameters.
Examples
Here is the code to create a plot of the PURELIN transfer function.
n = -5:0.1:5;
a = purelin(n);
plot(n,a)
Here we assign this transfer function to layer i of a network.
net.layers{i}.transferFcn = 'purelin';
Algorithm
a = purelin(n) = n
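Since purelin is the identity map, A = purelin(w*P+b) in the loop above is just w*P+b; the function exists so every transfer function shares one calling convention. A quick check:
n = [-2 0 3.5];
isequal(purelin(n), n)                 % 1: output equals net input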
help maxlinlr
MAXLINLR Maximum learning rate for a linear layer.
Syntax
lr = maxlinlr(P)
lr = maxlinlr(P,'bias')
Description
MAXLINLR is used to calculate learning rates for NEWLIN.
MAXLINLR(P) takes one argument,
P - RxQ matrix of input vectors.
and returns the maximum learning rate for a linear layer
without a bias that is to be trained only on the vectors in P.
MAXLINLR(P,'bias') returns the maximum learning rate for
a linear layer with a bias.
Examples
Here we define a batch of 4 2-element input vectors and
find the maximum learning rate for a linear layer with
a bias.
P = [1 2 -4 7; 0.1 3 10 6];
lr = maxlinlr(P,'bias')
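For the P used at the top of these notes, the two variants compare as below; the no-bias value is what the loop used, but since a bias was being trained, the 'bias' variant would have been the safer bound:
P = [1 1 0; 0 1 1];              % the input batch from the loop at the top
lr_nobias = maxlinlr(P)          % 0.9999/max(eig(P*P')) = 0.3333
lr_bias   = maxlinlr(P,'bias')   % smaller, since the bias acts as an extra input of ones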