Given the training sample $\lbrace x_i, d_i\rbrace_{i=1}^{N}$, the regularized cost function for least-squares estimation is defined as:
$$\varepsilon(w) = \frac{1}{2}\sum_{i=1}^{N}(d_i - w^T x_i)^2 + \frac{1}{2}\lambda \|w\|^2$$
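As a quick illustration, the cost function can be evaluated numerically. The sketch below is a minimal example, assuming the inputs are stacked as rows of a NumPy array; the function name `regularized_cost` and the toy data are hypothetical and not part of the original text.

```python
import numpy as np

def regularized_cost(w, X, d, lam):
    """Return 0.5 * sum_i (d_i - w^T x_i)^2 + 0.5 * lam * ||w||^2.

    Assumption: each row of X is one input vector x_i^T.
    """
    residuals = d - X @ w                      # d_i - w^T x_i for every sample
    data_term = 0.5 * (residuals @ residuals)  # sum of squared errors, halved
    penalty = 0.5 * lam * (w @ w)              # regularization term, halved
    return data_term + penalty

# Hypothetical toy data: three samples, two features.
X = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])
d = np.array([1.0, 2.0, 3.0])
w = np.array([1.0, 2.0])

cost_value = regularized_cost(w, X, d, lam=0.5)
```

With this particular `w` the residuals are all zero, so only the penalty term contributes to `cost_value`.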
The regularization term is defined simply in terms of the weight vector $w$:
$$\|DF\|^2 = \|w\|^2 = w^T w$$
The regularized solution for the weight vector $\hat{w}$, expressed in terms of the desired response $d$, is:
$$\hat{w} = (R_{xx} + \lambda I)^{-1} r_{dx}$$
$$R_{xx} = \sum_{i=1}^{N} x_i x_i^T$$
$$r_{dx} = \sum_{i=1}^{N} x_i d_i$$
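The step from the cost function to this solution is not spelled out here. As a sketch of the standard argument, setting the gradient of $\varepsilon(w)$ with respect to $w$ to zero gives:

$$\nabla_w \varepsilon(w) = -\sum_{i=1}^{N}(d_i - w^T x_i)\,x_i + \lambda w = 0 \quad\Longrightarrow\quad \Big(\sum_{i=1}^{N} x_i x_i^T + \lambda I\Big)\hat{w} = \sum_{i=1}^{N} x_i d_i$$

For $\lambda > 0$ the matrix on the left-hand side is positive definite and therefore invertible, which yields the expression for $\hat{w}$.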
Restating $\hat{w}$ in terms of the training sample $\lbrace x_i, d_i\rbrace_{i=1}^{N}$ gives:
$$\hat{w} = (X^T X + \lambda I)^{-1} X^T d$$
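where $X$ is the input data matrix whose $i$-th row is $x_i^T$, and $d$ is the vector of desired responses $d_1, \ldots, d_N$.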
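The matrix form can be checked numerically against the sum-based form. The following is a minimal sketch, assuming NumPy and randomly generated data; the name `ridge_weights`, the sample sizes, and the value of `lam` are illustrative choices rather than anything prescribed by the text.

```python
import numpy as np

def ridge_weights(X, d, lam):
    """Solve (X^T X + lam * I) w = X^T d for w.

    Assumption: each row of X is one input vector x_i^T.
    """
    n_features = X.shape[1]
    A = X.T @ X + lam * np.eye(n_features)
    # Solving the linear system avoids forming the inverse explicitly.
    return np.linalg.solve(A, X.T @ d)

# Hypothetical data: 20 samples, 3 features.
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 3))
d = rng.normal(size=20)
lam = 0.1

w_hat = ridge_weights(X, d, lam)

# The same solution built from sums over individual samples.
R_xx = sum(np.outer(x_i, x_i) for x_i in X)      # sum_i x_i x_i^T
r_dx = sum(x_i * d_i for x_i, d_i in zip(X, d))  # sum_i x_i d_i
w_from_sums = np.linalg.solve(R_xx + lam * np.eye(3), r_dx)
```

Both routes should agree up to floating-point error, since $X^T X$ and $X^T d$ are the same sums written in matrix form.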
Viewing the least-squares estimator as a "kernel machine", we express its kernel as an inner product:
$$k(x, x_i) = \langle x, x_i\rangle = x^T x_i, \quad i = 1, 2, \ldots, N$$
The regularized least-squares estimate then defines the approximating function:
$$F_{\lambda}(x) = \sum_{i=1}^{N} a_i k(x, x_i)$$
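The text does not say how the coefficients $a_i$ are obtained. The sketch below assumes the standard dual form of regularized least squares, $a = (K + \lambda I)^{-1} d$ with $K_{ij} = k(x_i, x_j)$, and checks that the kernel expansion agrees with the weight-vector form $\hat{w}^T x$. The function names and the random data are hypothetical.

```python
import numpy as np

def linear_kernel(x, x_i):
    """Inner-product kernel k(x, x_i) = x^T x_i."""
    return x @ x_i

def fit_dual_coefficients(X, d, lam):
    """Assumed dual solution: a = (K + lam * I)^{-1} d, with K = X X^T."""
    K = X @ X.T                      # Gram matrix, K[i, j] = x_i^T x_j
    n_samples = X.shape[0]
    return np.linalg.solve(K + lam * np.eye(n_samples), d)

def approximating_function(x, X, a):
    """F_lambda(x) = sum_i a_i * k(x, x_i)."""
    return sum(a_i * linear_kernel(x, x_i) for a_i, x_i in zip(a, X))

# Hypothetical data: 15 samples, 3 features.
rng = np.random.default_rng(1)
X = rng.normal(size=(15, 3))
d = rng.normal(size=15)
lam = 0.5

a = fit_dual_coefficients(X, d, lam)
x_new = rng.normal(size=3)
f_kernel = approximating_function(x_new, X, a)

# Weight-vector form for comparison.
w_hat = np.linalg.solve(X.T @ X + lam * np.eye(3), X.T @ d)
f_primal = w_hat @ x_new
```

The two values coincide because $(X^T X + \lambda I)^{-1} X^T = X^T (X X^T + \lambda I)^{-1}$ for $\lambda > 0$, so $\hat{w} = X^T a$.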