P = [1 1 0; 0 1 1];                 % input vectors (2 elements, Q = 3 samples)
T = [0 1 0];                        % target vector
w = [0 0];                          % initial weights
b = 0;                              % initial bias
[S,Q] = size(T);
A = purelin(w*P + b);
e = T - A;
LP.lr = maxlinlr(P)                 % max stable learning rate (maxlinlr(P,'bias') is the safer bound when a bias is trained)
sse = sumsqr(e);                    % sum of squared errors (note: this variable shadows the sse function)
while sse > 0.0000001
    dW = learnwh([],P,[],[],[],[],e,[],[],[],LP,[]);
    dB = learnwh(b,ones(1,Q),[],[],[],[],e,[],[],[],LP,[]);
    w = w + dW;
    b = b + dB;
    A = purelin(w*P + b)            % left unsuppressed to watch convergence
    e = T - A
    sse = sumsqr(e)
end
The above applies the Widrow-Hoff learning rule (least mean squares, LMS) by hand. Alternatively, you can build the network with newlin and then run the same training with train.
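A minimal sketch of that route, assuming the same P and T as above (the goal and epoch cap are arbitrary choices, and newlin's default performance function is mse rather than the sse used in the loop):
net = newlin(minmax(P),1);      % one linear neuron; input ranges taken from P
net.trainParam.goal = 1e-7;     % error goal (mse by default, not sse)
net.trainParam.epochs = 1000;   % arbitrary cap on training epochs
net = train(net,P,T);
A = sim(net,P)                  % should approach T = [0 1 0]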
newlind, by contrast, designs a zero-error linear network directly: it solves for the weights and bias in one step (zero error whenever an exact solution exists) instead of training iteratively.
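A sketch of the direct design, again assuming the P and T above (an exact solution exists for this data, so the error is zero):
net = newlind(P,T);             % solves for weights and bias in one step
A = sim(net,P)                  % equals T exactly here
net.IW{1,1}                     % designed weights, compare with the trained w
net.b{1}                        % designed bias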
Checking the trained weights and bias on two fresh input vectors (the columns of p1):
>> A=purelin(w*p1+b)
A =
1.0000 0.0002
>> p1
p1 =
1 0
1 1
The loop converges toward the exact solution w = [1 1], b = -1, so the outputs for p1's columns [1;1] and [0;1] are approximately 1 and 0, as seen above.
help sse
SSE Sum squared error performance function.
Syntax
perf = sse(E,Y,X,FP)
dPerf_dy = sse('dy',E,Y,X,perf,FP);
dPerf_dx = sse('dx',E,Y,X,perf,FP);
info = sse(code)
Description
SSE is a network performance function. It measures
performance according to the sum of squared errors.
SSE(E,Y,X,FP) takes E and optional function parameters,
E - Matrix or cell array of error vectors.
Y - Matrix or cell array of output vectors. (ignored).
X - Vector of all weight and bias values (ignored).
FP - Function parameters (ignored).
and returns the sum squared error.
SSE('dy',E,Y,X,PERF,FP) returns derivative of PERF with respect to Y.
SSE('dx',E,Y,X,PERF,FP) returns derivative of PERF with respect to X.
SSE('name') returns the name of this function.
SSE('pnames') returns the names of the function parameters.
SSE('pdefaults') returns the default function parameters.
Examples
Here a two-layer feed-forward network is created with a 1-element input
ranging from -10 to 10, four hidden TANSIG neurons, and one
PURELIN output neuron.
net = newff([-10 10],[4 1],{'tansig','purelin'});
Here the network is given a batch of inputs P. The error
is calculated by subtracting the output A from target T.
Then the sum squared error is calculated.
p = [-10 -5 0 5 10];
t = [0 0 1 1 1];
y = sim(net,p)
e = t-y
perf = sse(e)
Note that SSE can be called with only one argument because
the other arguments are ignored. SSE supports those arguments
to conform to the standard performance function argument list.
Network Use
To prepare a custom network to be trained with SSE set
NET.performFcn to 'sse'. This will automatically set
NET.performParam to the empty matrix [], as SSE has no
performance parameters.
Calling TRAIN or ADAPT will result in SSE being used to calculate
performance.
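A standalone check of sse called with one argument, on a made-up error matrix:
e = [0.1 -0.2; 0.3 0];          % hypothetical 2x2 error matrix
perf = sse(e)                   % 0.1^2 + 0.2^2 + 0.3^2 = 0.14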
help learnwh
LEARNWH Widrow-Hoff weight/bias learning function.
Syntax
[dW,LS] = learnwh(W,P,Z,N,A,T,E,gW,gA,D,LP,LS)
[db,LS] = learnwh(b,ones(1,Q),Z,N,A,T,E,gW,gA,D,LP,LS)
info = learnwh(code)
Description
LEARNWH is the Widrow-Hoff weight/bias learning function,
and is also known as the delta or least mean squared (LMS) rule.
LEARNWH(W,P,Z,N,A,T,E,gW,gA,D,LP,LS) takes several inputs,
W - SxR weight matrix (or b, an Sx1 bias vector).
P - RxQ input vectors (or ones(1,Q)).
Z - SxQ weighted input vectors.
N - SxQ net input vectors.
A - SxQ output vectors.
T - SxQ layer target vectors.
E - SxQ layer error vectors.
gW - SxR gradient with respect to performance.
gA - SxQ output gradient with respect to performance.
D - SxS neuron distances.
LP - Learning parameters: LP.lr (learning rate, see below).
LS - Learning state, initially should be = [].
and returns,
dW - SxR weight (or bias) change matrix.
LS - New learning state.
Learning occurs according to LEARNWH's learning parameter,
shown here with its default value.
LP.lr - 0.01 - Learning rate
LEARNWH(CODE) returns useful information for each CODE string:
'pnames' - Returns names of learning parameters.
'pdefaults' - Returns default learning parameters.
'needg' - Returns 1 if this function uses gW or gA.
Examples
Here we define a random input P and error E to a layer
with a 2-element input and 3 neurons. We also define the
learning rate LR learning parameter.
p = rand(2,1);
e = rand(3,1);
lp.lr = 0.5;
Since LEARNWH only needs these values to calculate a weight
change (see Algorithm below), we will use them to do so.
dW = learnwh([],p,[],[],[],[],e,[],[],[],lp,[])
Network Use
You can create a standard network that uses LEARNWH with NEWLIN.
To prepare the weights and the bias of layer i of a custom network
to learn with LEARNWH:
1) Set NET.trainFcn to 'trainb'.
NET.trainParam will automatically become TRAINB's default parameters.
2) Set NET.adaptFcn to 'trains'.
NET.adaptParam will automatically become TRAINS's default parameters.
3) Set each NET.inputWeights{i,j}.learnFcn to 'learnwh'.
Set each NET.layerWeights{i,j}.learnFcn to 'learnwh'.
Set NET.biases{i}.learnFcn to 'learnwh'.
Each weight and bias learning parameter property will automatically
be set to LEARNWH's default parameters.
To train the network (or enable it to adapt):
1) Set NET.trainParam (NET.adaptParam) properties to desired values.
2) Call TRAIN (ADAPT).
See NEWLIN for adaption and training examples.
Algorithm
LEARNWH calculates the weight change dW for a given neuron from the
neuron's input P and error E, and the weight (or bias) learning
rate LR, according to the Widrow-Hoff learning rule:
dw = lr*e*p'
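To tie the rule back to the first pass of the loop at the top (w and b start at zero, so e = T - A = [0 1 0]; the numeric comments assume the classic definition of maxlinlr as 0.9999 over the largest eigenvalue of P*P'):
P = [1 1 0; 0 1 1]; e = [0 1 0];
lp.lr = maxlinlr(P);                   % 0.9999/max(eig(P*P')) = 0.3333 here
dW = learnwh([],P,[],[],[],[],e,[],[],[],lp,[])
dW_byhand = lp.lr * e * P'             % same result: [0.3333 0.3333]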
>> help purelin
PURELIN Linear transfer function.
Syntax
A = purelin(N,FP)
dA_dN = purelin('dn',N,A,FP)
INFO = purelin(CODE)
Description
PURELIN is a neural transfer function. Transfer functions
calculate a layer's output from its net input.
PURELIN(N,FP) takes N and optional function parameters,
N - SxQ matrix of net input (column) vectors.
FP - Struct of function parameters (ignored).
and returns A, an SxQ matrix equal to N.
PURELIN('dn',N,A,FP) returns the SxQ derivative of A with respect to N.
If A or FP are not supplied or are set to [], FP reverts to
the default parameters, and A is calculated from N.
PURELIN('name') returns the name of this function.
PURELIN('output',FP) returns the [min max] output range.
PURELIN('active',FP) returns the [min max] active input range.
PURELIN('fullderiv') returns 1 or 0, whether DA_DN is SxSxQ or SxQ.
PURELIN('fpnames') returns the names of the function parameters.
PURELIN('fpdefaults') returns the default function parameters.
Examples
Here is the code to create a plot of the PURELIN transfer function.
n = -5:0.1:5;
a = purelin(n);
plot(n,a)
Here we assign this transfer function to layer i of a network.
net.layers{i}.transferFcn = 'purelin';
Algorithm
a = purelin(n) = n
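Since purelin is the identity map, A = purelin(w*P+b) in the loop above is just w*P+b; the function exists so every transfer function shares one calling convention. A quick check:
n = [-2 0 3.5];
isequal(purelin(n), n)                 % 1: output equals net input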
help maxlinlr
MAXLINLR Maximum learning rate for a linear layer.
Syntax
lr = maxlinlr(P)
lr = maxlinlr(P,'bias')
Description
MAXLINLR is used to calculate learning rates for NEWLIN.
MAXLINLR(P) takes one argument,
P - RxQ matrix of input vectors.
and returns the maximum learning rate for a linear layer
without a bias that is to be trained only on the vectors in P.
MAXLINLR(P,'bias') returns the maximum learning rate for
a linear layer with a bias.
Examples
Here we define a batch of 4 2-element input vectors and
find the maximum learning rate for a linear layer with
a bias.
P = [1 2 -4 7; 0.1 3 10 6];
lr = maxlinlr(P,'bias')
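For the P used at the top of these notes, the two variants compare as below; the no-bias value is what the loop used, but since a bias was being trained, the 'bias' variant would have been the safer bound:
P = [1 1 0; 0 1 1];              % the input batch from the loop at the top
lr_nobias = maxlinlr(P)          % 0.9999/max(eig(P*P')) = 0.3333
lr_bias   = maxlinlr(P,'bias')   % smaller, since the bias acts as an extra input of ones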