
Matrix Analysis

Reference: Matrix Computations, by Gene H. Golub & Charles F. Van Loan.

2.4 The Singular Value Decomposition (SVD)

The practical and theoretical importance of the SVD is hard to overestimate. It plays a prominent role in data analysis and in the characterization of many matrix "nearness problems".
Theorem 2.4.1 (Singular Value Decomposition). If $A$ is a real $m$-by-$n$ matrix, then there exist orthogonal matrices
$$U=[u_1,u_2,\cdots,u_m]\in \mathbb{R}^{m\times m} \quad \text{and} \quad V=[v_1,v_2,\cdots,v_n] \in \mathbb{R}^{n\times n}$$
such that
$$U^TAV= \Sigma = \left[\operatorname{diag}(\sigma_1,\sigma_2,\cdots,\sigma_p),\ \mathbf{0}\right] \ \text{ or } \ \begin{bmatrix} \operatorname{diag}(\sigma_1,\sigma_2,\cdots,\sigma_p)\\ \mathbf{0} \end{bmatrix}, \qquad p=\min\{m,n\},$$
where $\sigma_1\geqslant\sigma_2\geqslant\cdots\geqslant\sigma_p\geqslant 0$.
Proof: Let $x \in \mathbb{R}^n$ and $y \in \mathbb{R}^m$ be unit 2-norm vectors satisfying $Ax=\sigma y$ with $\sigma = \|A\|_2$. By completing $x$ and $y$ to orthonormal bases (an orthogonal-complement argument), there exist $V_1 \in \mathbb{R}^{n\times(n-1)}$ and $U_1 \in \mathbb{R}^{m\times(m-1)}$ such that $V=[x,\ V_1]$ and $U=[y,\ U_1]$ are orthogonal. It is easy to verify that
$$U^TAV=\begin{bmatrix} \sigma & w^T \\ 0 & B \end{bmatrix} = A_1$$
with $w \in \mathbb{R}^{n-1}$ and $B \in \mathbb{R}^{(m-1)\times (n-1)}$. Since
$$\left\|A_1\begin{bmatrix} \sigma \\ w \end{bmatrix}\right\|_2^2 \geqslant (\sigma^2+w^Tw)^2,$$
we have $\|A_1\|_2^2 \geqslant \sigma^2+w^Tw$. But $\|A_1\|_2^2 = \|A\|_2^2 = \sigma^2$, so we must have $w=0$. An obvious induction argument completes the proof of the theorem.

Corollary 2.4.2. If $U^TAV=\Sigma$ is the SVD of $A \in \mathbb{R}^{m\times n}$ and $m \geqslant n$, then for $i=1{:}n$, $Av_i=\sigma_i u_i$ and $A^Tu_i=\sigma_i v_i$.
Remark. The singular values of a matrix $A$ are the lengths of the semiaxes of the hyperellipsoid $E$ defined by $E = \{ Ax : \|x\|_2=1 \}$. Furthermore, $A^TAv_i =\sigma_i^2 v_i$ and $AA^Tu_i=\sigma_i^2 u_i$.
Corollary 2.4.3. If $A\in \mathbb{R}^{m\times n}$, then
$$\|A\|_2=\sigma_{\max}(A) \quad \text{and} \quad \|A\|_F =\sqrt{\sigma_1^2+\sigma_2^2+\cdots+\sigma_p^2},$$
where $p=\min\{m,n\}$.
Proof: These results follow immediately from the fact that $\|U^TAV\|=\|A\|$ for both the 2-norm and the Frobenius norm.
Corollary 2.4.7. If $A \in \mathbb{R}^{m\times n}$ and $\operatorname{rank}(A)=r$, then
$$A=\sum_{i=1}^{r}\sigma_i u_i v_i^T.$$
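As a quick numerical sanity check, here is a minimal NumPy sketch (mine, not from the reference) verifying Theorem 2.4.1, Corollary 2.4.2, Corollary 2.4.3, and the rank-r expansion of Corollary 2.4.7 on a random matrix:

```python
# Sketch: numerically verifying the SVD statements above with NumPy.
import numpy as np

rng = np.random.default_rng(0)
m, n = 5, 3
A = rng.standard_normal((m, n))

# full_matrices=True gives orthogonal U (m x m) and V (n x n) as in the theorem
U, s, Vt = np.linalg.svd(A, full_matrices=True)

# Theorem 2.4.1: U^T A V should equal [diag(s); 0] up to rounding
Sigma = np.zeros((m, n))
Sigma[:n, :n] = np.diag(s)
assert np.allclose(U.T @ A @ Vt.T, Sigma)

# Corollary 2.4.2: A v_i = sigma_i u_i and A^T u_i = sigma_i v_i
for i in range(n):
    assert np.allclose(A @ Vt[i], s[i] * U[:, i])
    assert np.allclose(A.T @ U[:, i], s[i] * Vt[i])

# Corollary 2.4.3: ||A||_2 = sigma_max
assert np.isclose(np.linalg.norm(A, 2), s[0])

# Corollary 2.4.7: A = sum_i sigma_i u_i v_i^T
A_rebuilt = sum(s[i] * np.outer(U[:, i], Vt[i]) for i in range(n))
assert np.allclose(A, A_rebuilt)
```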

Norms of Vectors and Matrices

Vector Norms

Definition. A vector norm on $\mathbb{R}^n$ is a function $\|\cdot\|: \mathbb{R}^n \rightarrow \mathbb{R}$ satisfying the following properties:

  1. $\|x\|\geqslant 0$ for all $x \in \mathbb{R}^n$, and $\|x\|=0$ implies $x=0$;
  2. $\|x+y\|\leqslant \|x\|+\|y\|$ for all $x,y \in \mathbb{R}^n$;
  3. $\|\alpha x\|=|\alpha|\cdot\|x\|$ for all $\alpha \in \mathbb{R}$, $x \in \mathbb{R}^{n}$.
p-Norms

Definition. The p-norm of a vector $x \in \mathbb{R}^n$ is defined by
$$\|x\|_p=\left(|x_1|^p+|x_2|^p+\cdots+|x_n|^p\right)^{\frac{1}{p}}, \qquad p\geqslant 1.$$
The 1-norm, 2-norm, and $\infty$-norm are
$$\begin{align*} &\|x\|_1=|x_1|+|x_2|+\cdots+|x_n|,\\ &\|x\|_2=\left(|x_1|^2+|x_2|^2+\cdots+|x_n|^2\right)^{\frac{1}{2}}=(x^Tx)^{\frac{1}{2}},\\ &\|x\|_\infty=\max_{1\leqslant i \leqslant n}|x_i|. \end{align*}$$
A unit vector with respect to the norm $\|\cdot\|$ is a vector $x$ satisfying $\|x\|=1$.
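A minimal NumPy check of these definitions (the vector [3, -4, 12] is an arbitrary example of mine):

```python
# Sketch: the vector p-norms computed directly and via np.linalg.norm.
import numpy as np

x = np.array([3.0, -4.0, 12.0])

p = 3  # any p >= 1
assert np.isclose(np.linalg.norm(x, p), np.sum(np.abs(x) ** p) ** (1 / p))

assert np.isclose(np.linalg.norm(x, 1), np.sum(np.abs(x)))        # 19
assert np.isclose(np.linalg.norm(x, 2), np.sqrt(x @ x))           # 13
assert np.isclose(np.linalg.norm(x, np.inf), np.max(np.abs(x)))   # 12
```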

Properties

Hölder inequality. $|x^Ty|\leqslant \|x\|_p\cdot \|y\|_q$, where $\frac{1}{p}+\frac{1}{q}=1$ and $p,q \geqslant 1$.
Cauchy–Schwarz inequality. $|x^Ty|\leqslant \|x\|_2\cdot \|y\|_2$.
Equivalence. All norms on $\mathbb{R}^n$ are equivalent, i.e., if $\|\cdot\|_p$ and $\|\cdot\|_q$ are norms on $\mathbb{R}^n$, then there exist positive constants $c_1$ and $c_2$ such that
$$c_1\|x\|_q \leqslant \|x\|_p \leqslant c_2\|x\|_q$$
for all $x \in \mathbb{R}^n$.
Indeed,
$$\|x\|_p \leqslant \|x\|_q\cdot n^{\frac{1}{p}-\frac{1}{q}}, \qquad q \geqslant p > 0.$$
In particular (monotonic decrease),
$$\|x\|_p \leqslant \|x\|_q, \qquad p \geqslant q > 0,$$
and
$$\begin{align*} & \|x\|_2 \leqslant \|x\|_1 \leqslant \sqrt{n}\,\|x\|_2, \\ & \|x\|_{\infty} \leqslant \|x\|_1 \leqslant n\,\|x\|_{\infty},\\ & \|x\|_{\infty} \leqslant \|x\|_2 \leqslant \sqrt{n}\,\|x\|_{\infty}. \end{align*}$$
Orthogonal Invariance. The 2-norm is preserved under orthogonal transformations, i.e., if $Q \in \mathbb{R}^{n\times n}$ is orthogonal and $x \in \mathbb{R}^n$, then
$$\|Qx\|_2=\|x\|_2,$$
since $\|Qx\|_2^2 = x^TQ^TQx = x^Tx = \|x\|_2^2$.
Other (spot-checked in the sketch after this list).

  1. $\lim_{p \rightarrow \infty} \|x\|_p = \|x\|_{\infty}$,
  2. $\|x\|_1\cdot\|x\|_{\infty} \leqslant \frac{1+\sqrt{n}}{2}\|x\|_2^2$ for $x \in \mathbb{R}^n$,
  3. $\|x \otimes y\|_p = \|x\|_p\cdot \|y\|_p$ for all $x \in \mathbb{R}^{n}$ and $y \in \mathbb{R}^m$, $p=1,2,\infty$.
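The sketch below spot-checks Hölder's inequality, the equivalence bounds, the limit as $p\rightarrow\infty$, and the Kronecker-product identity; the sample sizes, seeds, and tolerances are arbitrary choices of mine:

```python
# Sketch: numerical spot-checks of the vector-norm properties above.
import numpy as np

rng = np.random.default_rng(1)
x, y = rng.standard_normal(6), rng.standard_normal(6)

# Hölder with p = 3, q = 3/2 (so that 1/p + 1/q = 1)
p, q = 3.0, 1.5
assert abs(x @ y) <= np.linalg.norm(x, p) * np.linalg.norm(y, q) + 1e-12

# Equivalence bounds between the 1-, 2-, and infinity-norms
n = x.size
n1, n2, ninf = (np.linalg.norm(x, p) for p in (1, 2, np.inf))
assert n2 <= n1 <= np.sqrt(n) * n2
assert ninf <= n2 <= np.sqrt(n) * ninf

# lim_{p -> inf} ||x||_p = ||x||_inf (p = 100 is already close)
assert np.isclose(np.linalg.norm(x, 100), ninf, rtol=1e-1)

# ||x kron y||_p = ||x||_p * ||y||_p for p = 1, 2, inf
z = rng.standard_normal(4)
for p in (1, 2, np.inf):
    assert np.isclose(np.linalg.norm(np.kron(x, z), p),
                      np.linalg.norm(x, p) * np.linalg.norm(z, p))
```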

Matrix Norms

Definition. A matrix norm on $\mathbb{R}^{m\times n}$ is a function $\|\cdot\|: \mathbb{R}^{m\times n} \rightarrow \mathbb{R}$ satisfying the following properties:

  1. $\|A\|\geqslant 0$ for all $A \in \mathbb{R}^{m\times n}$, and $\|A\|=0$ implies $A=0$;
  2. $\|A+B\|\leqslant \|A\|+\|B\|$ for all $A,B \in \mathbb{R}^{m\times n}$;
  3. $\|\alpha A\|=|\alpha|\cdot\|A\|$ for all $\alpha \in \mathbb{R}$, $A \in \mathbb{R}^{m\times n}$.

Definition (Consistency). $\|\cdot\|$ is a consistent matrix norm if
$$\|AB\|\leqslant \|A\|\cdot\|B\| \qquad \forall A \in \mathbb{R}^{m\times n},\ B \in \mathbb{R}^{n\times s}.$$

Definition (Vector Consistency). Let $\|x\|$ be a norm on $\mathbb{R}^n$ and $\|A\|$ a norm on $\mathbb{R}^{m\times n}$. $\|A\|$ is consistent with the vector norm if
$$\|Ax\|\leqslant \|A\|\cdot\|x\| \qquad \forall A \in \mathbb{R}^{m\times n},\ x \in \mathbb{R}^{n}.$$
Remark. The three norms above, applied to $AB$, $A$, and $B$, are defined on different matrix spaces and are thus three different matrix norms.

Frobenius norm

Definition. The Frobenius norm of $A \in \mathbb{R}^{m\times n}$ is defined by
$$\|A\|_F = \sqrt{\sum_{i=1}^{m}\sum_{j=1}^{n}|a_{ij}|^2}.$$

Properties

1. $\|A\|_F = \sqrt{\operatorname{trace}(A^TA)} = \sqrt{\sum_{i=1}^{\min\{m,n\}}\sigma_i^2}$.
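A short check of this identity with NumPy (random matrix of an arbitrary shape):

```python
# Sketch: ||A||_F = sqrt(trace(A^T A)) = sqrt(sum of squared singular values).
import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((4, 3))

fro = np.linalg.norm(A, 'fro')
assert np.isclose(fro, np.sqrt(np.trace(A.T @ A)))
assert np.isclose(fro, np.sqrt(np.sum(np.linalg.svd(A, compute_uv=False) ** 2)))
```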

p-norms

Definition. The p-norm of $A \in \mathbb{R}^{m\times n}$ is defined by
$$\|A\|_p = \sup_{x\neq 0} \frac{\|Ax\|_p}{\|x\|_p}.$$
Remark. It is clear that $\|A\|_p$ is the largest p-norm obtained by applying $A$ to a unit p-norm vector:
$$\|A\|_p =\sup_{x\neq 0} \left\| A\left(\frac{x}{\|x\|_p}\right) \right\|_p = \max_{\|x\|_p=1}\|Ax\|_p.$$
Remark. The matrix p-norms are defined in terms of the vector p-norms discussed above.

Theorem. For $p=1,2,\infty$, one has
$$\begin{align*} & \|A\|_1 = \max_{1\leqslant j \leqslant n} \sum_{i=1}^{m} |a_{ij}| \quad \text{(maximum column sum)},\\ & \|A\|_{\infty} = \max_{1\leqslant i \leqslant m} \sum_{j=1}^{n} |a_{ij}| \quad \text{(maximum row sum)},\\ & \|A\|_2 = \sqrt{\lambda_{\max}(A^TA)}. \end{align*}$$
Corollary. $\|A^T\|_1=\|A\|_{\infty}$.
Corollary 2.3.2. If $A \in \mathbb{R}^{m\times n}$, then $\|A\|_2 \leqslant \sqrt{\|A\|_1\cdot\|A\|_{\infty}}$.
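The closed forms above can be checked against np.linalg.norm, which implements exactly these formulas for ord = 1, inf, and 2:

```python
# Sketch: the matrix 1-, infinity-, and 2-norm formulas, checked numerically.
import numpy as np

rng = np.random.default_rng(3)
A = rng.standard_normal((4, 3))

col_sums = np.abs(A).sum(axis=0)   # ||A||_1   = maximum column sum
row_sums = np.abs(A).sum(axis=1)   # ||A||_inf = maximum row sum
assert np.isclose(np.linalg.norm(A, 1), col_sums.max())
assert np.isclose(np.linalg.norm(A, np.inf), row_sums.max())

# ||A||_2 = sqrt(lambda_max(A^T A))
lam_max = np.linalg.eigvalsh(A.T @ A).max()
assert np.isclose(np.linalg.norm(A, 2), np.sqrt(lam_max))

# Corollary 2.3.2: ||A||_2 <= sqrt(||A||_1 * ||A||_inf)
assert np.linalg.norm(A, 2) <= np.sqrt(np.linalg.norm(A, 1) *
                                       np.linalg.norm(A, np.inf))
```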

Properties

(Consistency). The matrix p-norms are consistent, i.e., $\|AB\|_p\leqslant \|A\|_p\cdot\|B\|_p$ for all $A \in \mathbb{R}^{m\times n}$, $B \in \mathbb{R}^{n\times s}$.
(Vector Consistency). The matrix p-norms are consistent with the vector p-norms, i.e., $\|Ax\|_p\leqslant \|A\|_p\cdot\|x\|_p$ for all $A \in \mathbb{R}^{m\times n}$, $x \in \mathbb{R}^{n}$.
Equivalence. The Frobenius norm and the p-norms satisfy the following inequalities:
$$\begin{align*} & \frac{1}{\sqrt{n}}\|A\|_{\infty} \leqslant \|A\|_2 \leqslant \sqrt{m}\, \|A\|_{\infty},\\ & \frac{1}{\sqrt{m}}\|A\|_1 \leqslant \|A\|_2 \leqslant \sqrt{n}\,\|A\|_1,\\ & \|A\|_2 \leqslant \|A\|_F \leqslant \sqrt{\min\{m,n\}}\,\|A\|_2,\\ & \max_{i,j} |a_{ij}| \leqslant \|A\|_2 \leqslant \sqrt{mn}\, \max_{i,j} |a_{ij}|,\\ & \|A(i_1{:}i_2,\, j_1{:}j_2)\|_p \leqslant \|A\|_p, \quad \forall\, 1\leqslant i_1 \leqslant i_2 \leqslant m,\ 1\leqslant j_1 \leqslant j_2 \leqslant n. \end{align*}$$
In particular, $\|A\|_F=\|A\|_2$ when $n=1$ or $m=1$.
Orthogonal Invariance. If $A\in \mathbb{R}^{m\times n}$ and the matrices $Q\in \mathbb{R}^{m\times m}$ and $Z \in \mathbb{R}^{n\times n}$ are orthogonal, then
$$\begin{align*} & \|QAZ\|_2 = \|A\|_2,\\ & \|QAZ\|_F = \|A\|_F. \end{align*}$$
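A NumPy spot-check of a few of the equivalence bounds and of the orthogonal invariance, with Q and Z drawn from QR factorizations of random matrices:

```python
# Sketch: equivalence bounds and orthogonal invariance of matrix norms.
import numpy as np

rng = np.random.default_rng(4)
m, n = 5, 4
A = rng.standard_normal((m, n))

n2 = np.linalg.norm(A, 2)
assert np.linalg.norm(A, np.inf) / np.sqrt(n) <= n2 <= \
       np.sqrt(m) * np.linalg.norm(A, np.inf)
assert n2 <= np.linalg.norm(A, 'fro') <= np.sqrt(min(m, n)) * n2

# Orthogonal Q and Z obtained from QR factorizations
Q, _ = np.linalg.qr(rng.standard_normal((m, m)))
Z, _ = np.linalg.qr(rng.standard_normal((n, n)))
assert np.isclose(np.linalg.norm(Q @ A @ Z, 2), n2)
assert np.isclose(np.linalg.norm(Q @ A @ Z, 'fro'), np.linalg.norm(A, 'fro'))
```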

Other Matrix Norms
Induced Norms (Operator Norms)

Suppose a vector norm $\|\cdot\|_{\alpha}$ on $\mathbb{R}^n$ and a vector norm $\|\cdot\|_{\beta}$ on $\mathbb{R}^m$ are given. The matrix norm $\|\cdot\|_{(\alpha,\beta)}:\mathbb{R}^{m\times n}\rightarrow \mathbb{R}$ induced by $\|\cdot\|_{\alpha}$ and $\|\cdot\|_{\beta}$ is defined by
$$\|A\|_{(\alpha,\beta)} = \sup_{\|x\|_{\alpha}=1} \|Ax\|_{\beta}.$$
Remark.

  1. When $\alpha=\beta=p$, this reduces to the matrix p-norm.
  2. All operator norms possess the consistency property.
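There is no closed form for a general induced norm $\|A\|_{(\alpha,\beta)}$, but sampling unit-$\alpha$-norm vectors gives a lower bound on the supremum. The helper below is an illustrative sketch (the name induced_norm_lower_bound is mine, not a library routine), assuming $\alpha$ and $\beta$ are p-values accepted by np.linalg.norm:

```python
# Sketch: a Monte-Carlo lower bound for the induced (alpha, beta) norm.
import numpy as np

def induced_norm_lower_bound(A, alpha, beta, trials=10000, seed=0):
    """Lower bound for sup_{||x||_alpha = 1} ||Ax||_beta via random sampling."""
    rng = np.random.default_rng(seed)
    best = 0.0
    for _ in range(trials):
        x = rng.standard_normal(A.shape[1])
        x /= np.linalg.norm(x, alpha)   # project onto the unit alpha-sphere
        best = max(best, np.linalg.norm(A @ x, beta))
    return best

A = np.array([[1.0, 2.0], [3.0, 4.0]])
# With alpha = beta = 2 the bound approaches the spectral norm from below.
print(induced_norm_lower_bound(A, 2, 2), np.linalg.norm(A, 2))
```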
Schatten Norms

$$\|A\|_p = \left( \sum_{i=1}^{\min\{m,n\}} \sigma_i^p \right)^{\frac{1}{p}}$$
Remark.

  1. When $p=2$, this reduces to the Frobenius norm (and as $p\rightarrow\infty$ it tends to the 2-norm $\sigma_{\max}$).
  2. All Schatten norms possess the consistency property.
  3. All Schatten norms possess the orthogonal invariance property.
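A small helper for the Schatten p-norm, which is just the vector p-norm of the singular values (schatten_norm is an illustrative name of mine, not a NumPy function):

```python
# Sketch: Schatten p-norm as the vector p-norm of the singular values.
import numpy as np

def schatten_norm(A, p):
    """Schatten p-norm of A; p = 1 gives the nuclear norm."""
    s = np.linalg.svd(A, compute_uv=False)
    return np.linalg.norm(s, p)

A = np.random.default_rng(5).standard_normal((4, 3))
assert np.isclose(schatten_norm(A, 2), np.linalg.norm(A, 'fro'))   # p = 2: Frobenius
assert np.isclose(schatten_norm(A, np.inf), np.linalg.norm(A, 2))  # p -> inf: spectral
```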
“Entry-wise” Norms

$$\|A\|_p= \left( \sum_{i=1}^{m} \sum_{j=1}^{n} |a_{ij}|^p \right)^{\frac{1}{p}}$$
Remark. The special case $p = 2$ is the Frobenius norm, and $p = \infty$ yields the max norm below.

Max Norm

$$\|A\|_{\max}=\max_{i,j}|a_{ij}|$$
Note that the max norm is not consistent (it is not submultiplicative).
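Both the entry-wise p-norm (finite p) and the max norm are direct to write out; entrywise_norm below is an illustrative helper of mine, not a library function:

```python
# Sketch: entry-wise p-norm (finite p) and the max norm.
import numpy as np

def entrywise_norm(A, p):
    """Entry-wise p-norm for finite p; the p -> inf limit is the max norm."""
    return np.sum(np.abs(A) ** p) ** (1 / p)

A = np.array([[1.0, -2.0], [3.0, -4.0]])
assert np.isclose(entrywise_norm(A, 2), np.linalg.norm(A, 'fro'))  # p = 2: Frobenius
assert np.isclose(np.max(np.abs(A)), 4.0)                          # max norm
```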
