Matrix Decompositions
1. Eigenvalue Decomposition (EVD)
1.1 Eigenvalues and Eigenvectors
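Let $\boldsymbol{A}$ be a square matrix of order $n$ and let $\lambda$ be a scalar. If there exists a non-zero $n$-dimensional vector $\boldsymbol{x}$ such that $\boldsymbol{Ax}=\lambda\boldsymbol{x}$, then $\lambda$ is called an eigenvalue of $\boldsymbol{A}$, and $\boldsymbol{x}$ is an eigenvector of $\boldsymbol{A}$ corresponding to $\lambda$.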
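The definition can be checked numerically. The snippet below is a minimal sketch using NumPy; the $2 \times 2$ matrix and the candidate pair $(\lambda, \boldsymbol{x})$ are hypothetical values chosen for illustration and do not come from the text above.

```python
import numpy as np

# Hypothetical example matrix (an assumption, not from the original text).
A = np.array([[2.0, 1.0],
              [1.0, 2.0]])

# Candidate eigenpair: lambda = 3 with x = (1, 1).
lam = 3.0
x = np.array([1.0, 1.0])

# The definition requires x to be non-zero and A x = lambda x.
residual = A @ x - lam * x
is_eigenpair = bool(np.any(x != 0) and np.allclose(residual, 0.0))
print(is_eigenpair)
```

Any non-zero multiple of `x` passes the same check, which is why eigenvectors are only determined up to scale.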
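1.2 Eigenvalue Decomposition
A matrix can be expressed in terms of its eigenvalues and eigenvectors. Suppose $\boldsymbol{A}$ is an $n \times n$ matrix with $n$ linearly independent eigenvectors. Then $\boldsymbol{A}$ can be factored as $\boldsymbol{U}\boldsymbol{\Lambda}\boldsymbol{U}^{-1}$. If $\boldsymbol{A}$ is also real symmetric, the eigenvectors can be chosen orthonormal, so that $\boldsymbol{U}^{-1}=\boldsymbol{U}^{T}$:
$$\boldsymbol{A}=\boldsymbol{U}\boldsymbol{\Lambda}\boldsymbol{U}^{-1}=\boldsymbol{U}\boldsymbol{\Lambda}\boldsymbol{U}^{T}$$
Here the columns of $\boldsymbol{U}$ are the eigenvectors of $\boldsymbol{A}$ (an orthogonal matrix in the symmetric case), and $\boldsymbol{\Lambda}$ is the diagonal matrix of the corresponding eigenvalues.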
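As a rough numerical illustration of the symmetric case, the sketch below uses `np.linalg.eigh`, NumPy's eigen-solver for symmetric matrices, and rebuilds $\boldsymbol{A}$ from $\boldsymbol{U}\boldsymbol{\Lambda}\boldsymbol{U}^{T}$. The $3 \times 3$ matrix is a hypothetical example.

```python
import numpy as np

# Hypothetical real symmetric matrix (an assumption for illustration).
A = np.array([[4.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 2.0]])

# eigh returns the eigenvalues in ascending order and the matching
# orthonormal eigenvectors as the columns of U.
eigvals, U = np.linalg.eigh(A)
Lam = np.diag(eigvals)

# Rebuild A from its factors and measure how far U is from orthogonal.
A_rebuilt = U @ Lam @ U.T
orthogonality_error = np.max(np.abs(U.T @ U - np.eye(3)))
reconstruction_error = np.max(np.abs(A_rebuilt - A))
print(orthogonality_error, reconstruction_error)
```

Both errors should be at the level of floating-point round-off. For a non-symmetric diagonalizable matrix one would use `np.linalg.eig` and $\boldsymbol{U}^{-1}$ in place of $\boldsymbol{U}^{T}$.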
2. Singular Value Decomposition (SVD)
The eigenvalue decomposition applies only to $n \times n$ square matrices. To decompose a general $m \times n$ matrix $\boldsymbol{A}$, we need the SVD. Although $\boldsymbol{A}$ itself is only a general $m \times n$ matrix, $\boldsymbol{A}^{T}\boldsymbol{A}$ is an $n \times n$ symmetric positive semidefinite matrix, so it can be decomposed with the EVD.
By the eigenvalue decomposition:
$$\boldsymbol{A}^{T}\boldsymbol{A}\boldsymbol{V}=\boldsymbol{V}\boldsymbol{\Lambda}\\\boldsymbol{A}^{T}\boldsymbol{A}=\boldsymbol{V}\boldsymbol{\Lambda}\boldsymbol{V}^{T}$$
Here $\boldsymbol{V}$ is the orthogonal matrix whose columns are orthonormal eigenvectors of $\boldsymbol{A}^{T}\boldsymbol{A}$, and $\boldsymbol{\Lambda}$ is the diagonal matrix of its eigenvalues.
Consider the columns of $\boldsymbol{AV}$. For $i \neq j$, $(\boldsymbol{Av}_{i})^{T}\boldsymbol{Av}_{j}=\boldsymbol{v}^{T}_{i}\boldsymbol{A}^{T}\boldsymbol{A}\boldsymbol{v}_{j}=\boldsymbol{v}^{T}_{i}\lambda_{j}\boldsymbol{v}_{j}=\lambda_{j}\boldsymbol{v}^{T}_{i}\boldsymbol{v}_{j}=0$, so the vectors are pairwise orthogonal, which is the first condition for an orthogonal matrix. For $i = j$, $(\boldsymbol{Av}_{i})^{T}\boldsymbol{Av}_{i}=\boldsymbol{v}^{T}_{i}\boldsymbol{A}^{T}\boldsymbol{A}\boldsymbol{v}_{i}=\boldsymbol{v}^{T}_{i}\lambda_{i}\boldsymbol{v}_{i}=\lambda_{i}$, that is, $\|\boldsymbol{Av}_{i}\|^{2}=\lambda_{i}$. To normalize $\boldsymbol{Av}_{i}$ when $\lambda_{i}>0$, let $\sigma_{i}=\sqrt{\lambda_{i}}$. Then $\frac{\boldsymbol{Av}_{i}}{\|\boldsymbol{Av}_{i}\|}=\frac{\boldsymbol{Av}_{i}}{\sigma_{i}}=\boldsymbol{u}_{i}$, that is, $\boldsymbol{Av}_{i}=\sigma_{i}\boldsymbol{u}_{i}$. Every $\boldsymbol{u}_{i}$ now has unit length, which is the second condition for an orthogonal matrix.
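In summary, stacking the relations $\boldsymbol{Av}_{i}=\sigma_{i}\boldsymbol{u}_{i}$ column by column gives $\boldsymbol{AV}=\boldsymbol{U}\boldsymbol{\Sigma}$. Since $\boldsymbol{V}$ is orthogonal, the $m \times n$ matrix $\boldsymbol{A}$ can be decomposed as:
$$\boldsymbol{A}=\boldsymbol{U}\boldsymbol{\Sigma}\boldsymbol{V}^{T}$$
Here the columns of $\boldsymbol{U}$ are eigenvectors of $\boldsymbol{A}\boldsymbol{A}^{T}$, the columns of $\boldsymbol{V}$ are eigenvectors of $\boldsymbol{A}^{T}\boldsymbol{A}$, and $\boldsymbol{\Sigma}$ is an $m \times n$ rectangular diagonal matrix whose diagonal entries are the singular values $\sigma_{i}$.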
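The construction above can be followed step by step in code. The sketch below is illustrative only: it assumes a hypothetical $5 \times 3$ matrix with full column rank (so every $\sigma_{i}>0$), and it builds the thin form of the SVD, with $\boldsymbol{U}$ of size $5 \times 3$ and $\boldsymbol{\Sigma}$ of size $3 \times 3$. Forming $\boldsymbol{A}^{T}\boldsymbol{A}$ explicitly is numerically less robust than a dedicated routine such as `np.linalg.svd`, which is used here only as a cross-check.

```python
import numpy as np

# Hypothetical 5x3 matrix with full column rank (an assumption).
A = np.array([[1.0, 2.0, 0.0],
              [0.0, 1.0, 1.0],
              [1.0, 0.0, 1.0],
              [2.0, 1.0, 0.0],
              [0.0, 0.0, 3.0]])

# Step 1: eigenvalue decomposition of the symmetric matrix A^T A.
lam, V = np.linalg.eigh(A.T @ A)

# eigh returns ascending eigenvalues; reorder to the usual descending order.
order = np.argsort(lam)[::-1]
lam, V = lam[order], V[:, order]

# Step 2: singular values sigma_i = sqrt(lambda_i).
# clip guards against tiny negative values caused by round-off.
sigma = np.sqrt(np.clip(lam, 0.0, None))

# Step 3: u_i = A v_i / sigma_i (valid only because every sigma_i > 0 here).
U = (A @ V) / sigma

# Reassemble A = U Sigma V^T.
A_rebuilt = U @ np.diag(sigma) @ V.T
print(np.allclose(A_rebuilt, A))
```

The singular values computed this way should agree with `np.linalg.svd(A, compute_uv=False)` up to round-off.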
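3. QR Decomposition
If a non-singular matrix $\boldsymbol{A}_{n \times n}$ of order $n$ is written as the product of an orthogonal matrix $\boldsymbol{Q}_{n\times n}$ and a non-singular upper triangular matrix $\boldsymbol{R}_{n \times n}$, that is, $\boldsymbol{A}=\boldsymbol{QR}$, this factorization is called the QR decomposition.
For an $m \times n$ matrix $\boldsymbol{A}$ with full column rank, we have $\boldsymbol{A}_{m \times n}=\boldsymbol{Q}_{m \times n}\cdot \boldsymbol{R}_{n \times n}$, where the columns of $\boldsymbol{Q}$ form an orthonormal set and $\boldsymbol{R}$ is a non-singular upper triangular matrix. This factorization is also called a QR decomposition.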
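For the rectangular case, the shapes can be inspected with NumPy's built-in routine. This is a small sketch on a hypothetical $3 \times 2$ matrix with full column rank. Note that `np.linalg.qr` does not guarantee positive diagonal entries in $\boldsymbol{R}$, so its factors may differ in sign from those of a hand computation.

```python
import numpy as np

# Hypothetical 3x2 matrix with full column rank (an assumption).
A = np.array([[1.0, 1.0],
              [1.0, 0.0],
              [0.0, 1.0]])

# mode="reduced" returns Q of shape (m, n) and R of shape (n, n).
Q, R = np.linalg.qr(A, mode="reduced")

print(Q.shape, R.shape)
print(np.allclose(Q.T @ Q, np.eye(2)))   # columns of Q are orthonormal
print(np.allclose(R, np.triu(R)))        # R is upper triangular
print(np.allclose(Q @ R, A))             # the product recovers A
```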
Gram-Schmidt orthogonalization:
Let the column vectors $\boldsymbol{\alpha}_{1},\boldsymbol{\alpha}_{2},\boldsymbol{\alpha}_{3},\ldots,\boldsymbol{\alpha}_{k}$ be linearly independent, let $(\cdot,\cdot)$ denote the inner product, and define:
$$\begin{aligned}\boldsymbol{\beta}_{1}&=\boldsymbol{\alpha}_{1}\\
\boldsymbol{\beta}_{2}&=\boldsymbol{\alpha}_{2}-\frac{(\boldsymbol{\beta}_{1},\boldsymbol{\alpha}_{2})}{(\boldsymbol{\beta}_{1},\boldsymbol{\beta}_{1})}\boldsymbol{\beta}_{1}\\
\boldsymbol{\beta}_{3}&=\boldsymbol{\alpha}_{3}-\frac{(\boldsymbol{\beta}_{1},\boldsymbol{\alpha}_{3})}{(\boldsymbol{\beta}_{1},\boldsymbol{\beta}_{1})}\boldsymbol{\beta}_{1}-\frac{(\boldsymbol{\beta}_{2},\boldsymbol{\alpha}_{3})}{(\boldsymbol{\beta}_{2},\boldsymbol{\beta}_{2})}\boldsymbol{\beta}_{2}\\
&\;\;\vdots\\
\boldsymbol{\beta}_{k}&=\boldsymbol{\alpha}_{k}-\frac{(\boldsymbol{\beta}_{1},\boldsymbol{\alpha}_{k})}{(\boldsymbol{\beta}_{1},\boldsymbol{\beta}_{1})}\boldsymbol{\beta}_{1}-\frac{(\boldsymbol{\beta}_{2},\boldsymbol{\alpha}_{k})}{(\boldsymbol{\beta}_{2},\boldsymbol{\beta}_{2})}\boldsymbol{\beta}_{2}-\cdots-\frac{(\boldsymbol{\beta}_{k-1},\boldsymbol{\alpha}_{k})}{(\boldsymbol{\beta}_{k-1},\boldsymbol{\beta}_{k-1})}\boldsymbol{\beta}_{k-1}\end{aligned}$$
Then $\boldsymbol{\beta}_{1},\boldsymbol{\beta}_{2},\boldsymbol{\beta}_{3},\ldots,\boldsymbol{\beta}_{k}$ are pairwise orthogonal and equivalent to $\boldsymbol{\alpha}_{1},\boldsymbol{\alpha}_{2},\boldsymbol{\alpha}_{3},\ldots,\boldsymbol{\alpha}_{k}$, meaning the two sets span the same subspace.
Let:
$$\boldsymbol{\eta}_{1}=\frac{\boldsymbol{\beta}_{1}}{\|\boldsymbol{\beta}_{1}\|},\quad
\boldsymbol{\eta}_{2}=\frac{\boldsymbol{\beta}_{2}}{\|\boldsymbol{\beta}_{2}\|},\quad
\boldsymbol{\eta}_{3}=\frac{\boldsymbol{\beta}_{3}}{\|\boldsymbol{\beta}_{3}\|},\quad\ldots,\quad
\boldsymbol{\eta}_{k}=\frac{\boldsymbol{\beta}_{k}}{\|\boldsymbol{\beta}_{k}\|}$$
Then $\boldsymbol{\eta}_{1},\boldsymbol{\eta}_{2},\boldsymbol{\eta}_{3},\ldots,\boldsymbol{\eta}_{k}$ are pairwise orthogonal unit vectors, so they form an orthonormal set equivalent to $\boldsymbol{\alpha}_{1},\boldsymbol{\alpha}_{2},\boldsymbol{\alpha}_{3},\ldots,\boldsymbol{\alpha}_{k}$.
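The two steps above, orthogonalization followed by normalization, can be sketched in NumPy as follows. The function name `gram_schmidt` and the $3 \times 3$ input are hypothetical choices for illustration. This is the classical variant exactly as written in the formulas, and it assumes the input columns are linearly independent.

```python
import numpy as np

def gram_schmidt(alphas):
    """Classical Gram-Schmidt on the columns of `alphas`.

    Returns (betas, etas): the orthogonal vectors beta_j and the
    orthonormal vectors eta_j = beta_j / ||beta_j||, stored as columns.
    Assumes the columns of `alphas` are linearly independent.
    """
    alphas = np.asarray(alphas, dtype=float)
    betas = np.zeros_like(alphas)
    for j in range(alphas.shape[1]):
        beta = alphas[:, j].copy()
        for i in range(j):
            b_i = betas[:, i]
            # Subtract the projection of alpha_j onto beta_i.
            beta -= (b_i @ alphas[:, j]) / (b_i @ b_i) * b_i
        betas[:, j] = beta
    # Normalize every column to unit length.
    etas = betas / np.linalg.norm(betas, axis=0)
    return betas, etas

# Hypothetical linearly independent columns alpha_1, alpha_2, alpha_3.
alphas = np.array([[1.0, 1.0, 0.0],
                   [1.0, 0.0, 1.0],
                   [0.0, 1.0, 1.0]])
betas, etas = gram_schmidt(alphas)

# The Gram matrix of the eta vectors should be the identity.
gram = etas.T @ etas
print(np.allclose(gram, np.eye(3)))
```

In floating-point arithmetic the classical variant can lose orthogonality for ill-conditioned inputs; the modified Gram-Schmidt variant or Householder reflections are the usual remedies.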
Coefficient matrix:
From the relation between $\boldsymbol{\alpha}$ and $\boldsymbol{\beta}$:
$$\begin{aligned}\boldsymbol{\alpha}_{1}&=\boldsymbol{\beta}_{1}=\|\boldsymbol{\beta}_{1}\|\boldsymbol{\eta}_{1}\\
\boldsymbol{\alpha}_{2}&=\frac{(\boldsymbol{\beta}_{1},\boldsymbol{\alpha}_{2})}{(\boldsymbol{\beta}_{1},\boldsymbol{\beta}_{1})}\boldsymbol{\beta}_{1}+\boldsymbol{\beta}_{2}\\&=\frac{(\boldsymbol{\beta}_{1},\boldsymbol{\alpha}_{2})}{(\boldsymbol{\beta}_{1},\boldsymbol{\beta}_{1})}\|\boldsymbol{\beta}_{1}\|\boldsymbol{\eta}_{1}+\|\boldsymbol{\beta}_{2}\|\boldsymbol{\eta}_{2}\\
\boldsymbol{\alpha}_{3}&=\frac{(\boldsymbol{\beta}_{1},\boldsymbol{\alpha}_{3})}{(\boldsymbol{\beta}_{1},\boldsymbol{\beta}_{1})}\boldsymbol{\beta}_{1}+\frac{(\boldsymbol{\beta}_{2},\boldsymbol{\alpha}_{3})}{(\boldsymbol{\beta}_{2},\boldsymbol{\beta}_{2})}\boldsymbol{\beta}_{2}+\boldsymbol{\beta}_{3}\\&=\frac{(\boldsymbol{\beta}_{1},\boldsymbol{\alpha}_{3})}{(\boldsymbol{\beta}_{1},\boldsymbol{\beta}_{1})}\|\boldsymbol{\beta}_{1}\|\boldsymbol{\eta}_{1}+\frac{(\boldsymbol{\beta}_{2},\boldsymbol{\alpha}_{3})}{(\boldsymbol{\beta}_{2},\boldsymbol{\beta}_{2})}\|\boldsymbol{\beta}_{2}\|\boldsymbol{\eta}_{2}+\|\boldsymbol{\beta}_{3}\|\boldsymbol{\eta}_{3}\\
&\;\;\vdots\\
\boldsymbol{\alpha}_{k}&=\frac{(\boldsymbol{\beta}_{1},\boldsymbol{\alpha}_{k})}{(\boldsymbol{\beta}_{1},\boldsymbol{\beta}_{1})}\boldsymbol{\beta}_{1}+\frac{(\boldsymbol{\beta}_{2},\boldsymbol{\alpha}_{k})}{(\boldsymbol{\beta}_{2},\boldsymbol{\beta}_{2})}\boldsymbol{\beta}_{2}+\cdots+\frac{(\boldsymbol{\beta}_{k-1},\boldsymbol{\alpha}_{k})}{(\boldsymbol{\beta}_{k-1},\boldsymbol{\beta}_{k-1})}\boldsymbol{\beta}_{k-1}+\boldsymbol{\beta}_{k}\\&=\frac{(\boldsymbol{\beta}_{1},\boldsymbol{\alpha}_{k})}{(\boldsymbol{\beta}_{1},\boldsymbol{\beta}_{1})}\|\boldsymbol{\beta}_{1}\|\boldsymbol{\eta}_{1}+\frac{(\boldsymbol{\beta}_{2},\boldsymbol{\alpha}_{k})}{(\boldsymbol{\beta}_{2},\boldsymbol{\beta}_{2})}\|\boldsymbol{\beta}_{2}\|\boldsymbol{\eta}_{2}+\cdots+\frac{(\boldsymbol{\beta}_{k-1},\boldsymbol{\alpha}_{k})}{(\boldsymbol{\beta}_{k-1},\boldsymbol{\beta}_{k-1})}\|\boldsymbol{\beta}_{k-1}\|\boldsymbol{\eta}_{k-1}+\|\boldsymbol{\beta}_{k}\|\boldsymbol{\eta}_{k}\end{aligned}$$
Therefore, the coefficient vector $\boldsymbol{r}_{j}$ of each $\boldsymbol{\alpha}_{j}$ with respect to $\boldsymbol{\eta}_{1},\ldots,\boldsymbol{\eta}_{k}$ is:
$$\begin{aligned}\boldsymbol{r}_{1}&=\begin{bmatrix}\|\boldsymbol{\beta}_{1}\|&0&0&\cdots&0\end{bmatrix}^{T}\\
\boldsymbol{r}_{2}&=\begin{bmatrix}\frac{(\boldsymbol{\beta}_{1},\boldsymbol{\alpha}_{2})}{(\boldsymbol{\beta}_{1},\boldsymbol{\beta}_{1})}\|\boldsymbol{\beta}_{1}\|&\|\boldsymbol{\beta}_{2}\|&0&\cdots&0\end{bmatrix}^{T}\\
\boldsymbol{r}_{3}&=\begin{bmatrix}\frac{(\boldsymbol{\beta}_{1},\boldsymbol{\alpha}_{3})}{(\boldsymbol{\beta}_{1},\boldsymbol{\beta}_{1})}\|\boldsymbol{\beta}_{1}\|&\frac{(\boldsymbol{\beta}_{2},\boldsymbol{\alpha}_{3})}{(\boldsymbol{\beta}_{2},\boldsymbol{\beta}_{2})}\|\boldsymbol{\beta}_{2}\|&\|\boldsymbol{\beta}_{3}\|&\cdots&0\end{bmatrix}^{T}\\
&\;\;\vdots\\
\boldsymbol{r}_{k}&=\begin{bmatrix}\frac{(\boldsymbol{\beta}_{1},\boldsymbol{\alpha}_{k})}{(\boldsymbol{\beta}_{1},\boldsymbol{\beta}_{1})}\|\boldsymbol{\beta}_{1}\|&\frac{(\boldsymbol{\beta}_{2},\boldsymbol{\alpha}_{k})}{(\boldsymbol{\beta}_{2},\boldsymbol{\beta}_{2})}\|\boldsymbol{\beta}_{2}\|&\cdots&\frac{(\boldsymbol{\beta}_{k-1},\boldsymbol{\alpha}_{k})}{(\boldsymbol{\beta}_{k-1},\boldsymbol{\beta}_{k-1})}\|\boldsymbol{\beta}_{k-1}\|&\|\boldsymbol{\beta}_{k}\|\end{bmatrix}^{T}\end{aligned}$$
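As a worked simplification (not spelled out in the original derivation), each off-diagonal coefficient reduces to an inner product with a unit vector, because $(\boldsymbol{\beta}_{i},\boldsymbol{\beta}_{i})=\|\boldsymbol{\beta}_{i}\|^{2}$ and $\boldsymbol{\eta}_{i}=\boldsymbol{\beta}_{i}/\|\boldsymbol{\beta}_{i}\|$:

$$\frac{(\boldsymbol{\beta}_{i},\boldsymbol{\alpha}_{j})}{(\boldsymbol{\beta}_{i},\boldsymbol{\beta}_{i})}\|\boldsymbol{\beta}_{i}\|=\frac{(\boldsymbol{\beta}_{i},\boldsymbol{\alpha}_{j})}{\|\boldsymbol{\beta}_{i}\|}=(\boldsymbol{\eta}_{i},\boldsymbol{\alpha}_{j}),\qquad i<j$$

Placing the vectors $\boldsymbol{r}_{1},\ldots,\boldsymbol{r}_{k}$ side by side as columns then gives an upper triangular matrix with positive diagonal entries:

$$\begin{bmatrix}\boldsymbol{r}_{1}&\boldsymbol{r}_{2}&\cdots&\boldsymbol{r}_{k}\end{bmatrix}=\begin{bmatrix}\|\boldsymbol{\beta}_{1}\|&(\boldsymbol{\eta}_{1},\boldsymbol{\alpha}_{2})&\cdots&(\boldsymbol{\eta}_{1},\boldsymbol{\alpha}_{k})\\0&\|\boldsymbol{\beta}_{2}\|&\cdots&(\boldsymbol{\eta}_{2},\boldsymbol{\alpha}_{k})\\\vdots&\vdots&\ddots&\vdots\\0&0&\cdots&\|\boldsymbol{\beta}_{k}\|\end{bmatrix}$$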
QR decomposition procedure:
1. Write out the column vectors $\boldsymbol{\alpha}_{1},\boldsymbol{\alpha}_{2},\boldsymbol{\alpha}_{3},\ldots,\boldsymbol{\alpha}_{k}$ of the matrix $\boldsymbol{A}$.
2. Apply Gram-Schmidt orthogonalization to the columns of $\boldsymbol{A}$ to obtain the orthonormal set $\boldsymbol{\eta}_{1},\boldsymbol{\eta}_{2},\boldsymbol{\eta}_{3},\ldots,\boldsymbol{\eta}_{k}$, which forms the columns of $\boldsymbol{Q}$.
3. Express each column of $\boldsymbol{A}$ as a linear combination of $\boldsymbol{\eta}_{1},\boldsymbol{\eta}_{2},\boldsymbol{\eta}_{3},\ldots,\boldsymbol{\eta}_{k}$. The coefficient column vectors $\boldsymbol{r}_{1},\boldsymbol{r}_{2},\boldsymbol{r}_{3},\ldots,\boldsymbol{r}_{k}$ form the coefficient matrix $\boldsymbol{R}$.
4. $\boldsymbol{A}=\boldsymbol{QR}$
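The four steps can be put together in a short NumPy sketch. The function name `qr_gram_schmidt` and the test matrix are hypothetical. The off-diagonal entries are computed as $(\boldsymbol{\eta}_{i},\boldsymbol{\alpha}_{j})$, which equals the coefficient $\frac{(\boldsymbol{\beta}_{i},\boldsymbol{\alpha}_{j})}{(\boldsymbol{\beta}_{i},\boldsymbol{\beta}_{i})}\|\boldsymbol{\beta}_{i}\|$ because $\boldsymbol{\eta}_{i}=\boldsymbol{\beta}_{i}/\|\boldsymbol{\beta}_{i}\|$. The code assumes $\boldsymbol{A}$ has full column rank.

```python
import numpy as np

def qr_gram_schmidt(A):
    """QR decomposition via classical Gram-Schmidt.

    Returns Q (m x k, orthonormal columns) and R (k x k, upper
    triangular with positive diagonal). Assumes full column rank.
    """
    A = np.asarray(A, dtype=float)
    m, k = A.shape
    Q = np.zeros((m, k))
    R = np.zeros((k, k))
    for j in range(k):
        beta = A[:, j].copy()                # start from alpha_j
        for i in range(j):
            R[i, j] = Q[:, i] @ A[:, j]      # coefficient (eta_i, alpha_j)
            beta -= R[i, j] * Q[:, i]        # remove the eta_i component
        R[j, j] = np.linalg.norm(beta)       # diagonal entry ||beta_j||
        Q[:, j] = beta / R[j, j]             # eta_j = beta_j / ||beta_j||
    return Q, R

# Hypothetical full-rank test matrix (an assumption for illustration).
A = np.array([[1.0, 1.0, 0.0],
              [1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0]])
Q, R = qr_gram_schmidt(A)

print(np.allclose(Q @ R, A))             # A = QR
print(np.allclose(Q.T @ Q, np.eye(3)))   # Q has orthonormal columns
print(np.allclose(R, np.triu(R)))        # R is upper triangular
```

Because the diagonal of $\boldsymbol{R}$ is forced positive here, the result may differ from `np.linalg.qr` by the signs of matching columns of $\boldsymbol{Q}$ and rows of $\boldsymbol{R}$.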
Example: