
Matrix Analysis

Reference: Matrix Computations, by Gene H. Golub & Charles F. Van Loan.

2.4 The Singular Value Decomposition (SVD)

The practical and theoretical importance of the SVD is hard to overestimate. It plays a prominent role in data analysis and in the characterization of many matrix "nearness problems".
Theorem 2.4.1 (Singular Value Decomposition). If $A$ is a real $m$-by-$n$ matrix, then there exist orthogonal matrices
$$U=[u_1,u_2,\cdots,u_m]\in \mathbb{R}^{m\times m} \quad \text{and} \quad V=[v_1,v_2,\cdots,v_n] \in \mathbb{R}^{n\times n}$$
such that
$$U^TAV= \Sigma = \left[\operatorname{diag}(\sigma_1,\sigma_2,\cdots,\sigma_p),\ \mathbf{0}\right] \ \text{ or } \ \begin{bmatrix} \operatorname{diag}(\sigma_1,\sigma_2,\cdots,\sigma_p)\\ \mathbf{0} \end{bmatrix}, \qquad p=\min\{m,n\},$$
where $\sigma_1\geqslant\sigma_2\geqslant\cdots\geqslant\sigma_p\geqslant 0$.
Proof: Let $x \in \mathbb{R}^n$ and $y \in \mathbb{R}^m$ be unit 2-norm vectors satisfying $Ax=\sigma y$ with $\sigma = \|A\|_2$. By completing $x$ and $y$ to orthonormal bases (an orthogonal-complement argument), there exist $V_1 \in \mathbb{R}^{n\times(n-1)}$ and $U_1 \in \mathbb{R}^{m\times(m-1)}$ such that $V=[x,\ V_1]$ and $U=[y,\ U_1]$ are orthogonal. It is easy to verify that
$$U^TAV=\begin{bmatrix} \sigma & w^T \\ 0 & B \end{bmatrix} = A_1$$
with $w \in \mathbb{R}^{n-1}$ and $B \in \mathbb{R}^{(m-1)\times (n-1)}$. Since
$$\left\|A_1\begin{bmatrix} \sigma \\ w \end{bmatrix}\right\|_2^2 \geqslant (\sigma^2+w^Tw)^2,$$
we have $\|A_1\|_2^2 \geqslant \sigma^2+w^Tw$. But $\|A_1\|_2^2 = \|A\|_2^2 = \sigma^2$, so we must have $w=0$. An obvious induction argument completes the proof of the theorem.

Corollary 2.4.2. If $U^TAV=\Sigma$ is the SVD of $A \in \mathbb{R}^{m\times n}$ and $m \geqslant n$, then for $i=1{:}n$, $Av_i=\sigma_i u_i$ and $A^Tu_i=\sigma_i v_i$.
Remark. The singular values of a matrix $A$ are the lengths of the semiaxes of the hyperellipsoid $E$ defined by $E = \{ Ax : \|x\|_2=1 \}$. Furthermore, $A^TAv_i =\sigma_i^2 v_i$ and $AA^Tu_i=\sigma_i^2 u_i$.
Corollary 2.4.3. If $A\in \mathbb{R}^{m\times n}$, then
$$\|A\|_2=\sigma_{\max}(A) \quad \text{and} \quad \|A\|_F =\sqrt{\sigma_1^2+\sigma_2^2+\cdots+\sigma_p^2},$$
where $p=\min\{m,n\}$.
Proof: These results follow immediately from the fact that $\|U^TAV\|=\|A\|$ for both the 2-norm and the Frobenius norm.
Corollary 2.4.7. If $A \in \mathbb{R}^{m\times n}$ and $\operatorname{rank}(A)=r$, then
$$A=\sum_{i=1}^{r}\sigma_i u_i v_i^T.$$
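As a quick numerical sanity check, here is a minimal NumPy sketch (mine, not from the reference) verifying Theorem 2.4.1, Corollary 2.4.2, Corollary 2.4.3, and the rank-r expansion of Corollary 2.4.7 on a random matrix:

```python
# Sketch: numerically verifying the SVD statements above with NumPy.
import numpy as np

rng = np.random.default_rng(0)
m, n = 5, 3
A = rng.standard_normal((m, n))

# full_matrices=True gives orthogonal U (m x m) and V (n x n) as in the theorem
U, s, Vt = np.linalg.svd(A, full_matrices=True)

# Theorem 2.4.1: U^T A V should equal [diag(s); 0] up to rounding
Sigma = np.zeros((m, n))
Sigma[:n, :n] = np.diag(s)
assert np.allclose(U.T @ A @ Vt.T, Sigma)

# Corollary 2.4.2: A v_i = sigma_i u_i and A^T u_i = sigma_i v_i
for i in range(n):
    assert np.allclose(A @ Vt[i], s[i] * U[:, i])
    assert np.allclose(A.T @ U[:, i], s[i] * Vt[i])

# Corollary 2.4.3: ||A||_2 = sigma_max
assert np.isclose(np.linalg.norm(A, 2), s[0])

# Corollary 2.4.7: A = sum_i sigma_i u_i v_i^T
A_rebuilt = sum(s[i] * np.outer(U[:, i], Vt[i]) for i in range(n))
assert np.allclose(A, A_rebuilt)
```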

Norms of Vectors and Matrices

Vector Norms

Definition. A vector norm on $\mathbb{R}^n$ is a function $\|\cdot\|: \mathbb{R}^n \rightarrow \mathbb{R}$ satisfying the following properties:

  1. $\|x\|\geqslant 0$ for all $x \in \mathbb{R}^n$, and $\|x\|=0$ implies $x=0$;
  2. $\|x+y\|\leqslant \|x\|+\|y\|$ for all $x,y \in \mathbb{R}^n$;
  3. $\|\alpha x\|=|\alpha|\cdot\|x\|$ for all $\alpha \in \mathbb{R}$, $x \in \mathbb{R}^{n}$.
p-Norms

Definition. The p-norm of a vector $x \in \mathbb{R}^n$ is defined by
$$\|x\|_p=\left(|x_1|^p+|x_2|^p+\cdots+|x_n|^p\right)^{\frac{1}{p}}, \qquad p\geqslant 1.$$
The 1-norm, 2-norm, and $\infty$-norm are
$$\begin{align*} &\|x\|_1=|x_1|+|x_2|+\cdots+|x_n|,\\ &\|x\|_2=\left(|x_1|^2+|x_2|^2+\cdots+|x_n|^2\right)^{\frac{1}{2}}=(x^Tx)^{\frac{1}{2}},\\ &\|x\|_\infty=\max_{1\leqslant i \leqslant n}|x_i|. \end{align*}$$
A unit vector with respect to the norm $\|\cdot\|$ is a vector $x$ satisfying $\|x\|=1$.
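A minimal NumPy check of these definitions (the vector [3, -4, 12] is an arbitrary example of mine):

```python
# Sketch: the vector p-norms computed directly and via np.linalg.norm.
import numpy as np

x = np.array([3.0, -4.0, 12.0])

p = 3  # any p >= 1
assert np.isclose(np.linalg.norm(x, p), np.sum(np.abs(x) ** p) ** (1 / p))

assert np.isclose(np.linalg.norm(x, 1), np.sum(np.abs(x)))        # 19
assert np.isclose(np.linalg.norm(x, 2), np.sqrt(x @ x))           # 13
assert np.isclose(np.linalg.norm(x, np.inf), np.max(np.abs(x)))   # 12
```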

Properties

Hölder inequality. $|x^Ty|\leqslant \|x\|_p\cdot \|y\|_q$, where $\frac{1}{p}+\frac{1}{q}=1$ and $p,q \geqslant 1$.
Cauchy–Schwarz inequality. $|x^Ty|\leqslant \|x\|_2\cdot \|y\|_2$.
Equivalence. All norms on $\mathbb{R}^n$ are equivalent, i.e., if $\|\cdot\|_p$ and $\|\cdot\|_q$ are norms on $\mathbb{R}^n$, then there exist positive constants $c_1$ and $c_2$ such that
$$c_1\|x\|_q \leqslant \|x\|_p \leqslant c_2\|x\|_q$$
for all $x \in \mathbb{R}^n$.
Indeed,
$$\|x\|_p \leqslant \|x\|_q\cdot n^{\frac{1}{p}-\frac{1}{q}}, \qquad q \geqslant p > 0.$$
In particular (monotonic decrease),
$$\|x\|_p \leqslant \|x\|_q, \qquad p \geqslant q > 0,$$
and
$$\begin{align*} & \|x\|_2 \leqslant \|x\|_1 \leqslant \sqrt{n}\,\|x\|_2, \\ & \|x\|_{\infty} \leqslant \|x\|_1 \leqslant n\,\|x\|_{\infty},\\ & \|x\|_{\infty} \leqslant \|x\|_2 \leqslant \sqrt{n}\,\|x\|_{\infty}. \end{align*}$$
Orthogonal Invariance. The 2-norm is preserved under orthogonal transformations, i.e., if $Q \in \mathbb{R}^{n\times n}$ is orthogonal and $x \in \mathbb{R}^n$, then
$$\|Qx\|_2=\|x\|_2,$$
since $\|Qx\|_2^2 = x^TQ^TQx = x^Tx = \|x\|_2^2$.
Other (spot-checked in the sketch after this list).

  1. $\lim_{p \rightarrow \infty} \|x\|_p = \|x\|_{\infty}$,
  2. $\|x\|_1\cdot\|x\|_{\infty} \leqslant \frac{1+\sqrt{n}}{2}\|x\|_2^2$ for $x \in \mathbb{R}^n$,
  3. $\|x \otimes y\|_p = \|x\|_p\cdot \|y\|_p$ for all $x \in \mathbb{R}^{n}$ and $y \in \mathbb{R}^m$, $p=1,2,\infty$.
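The sketch below spot-checks Hölder's inequality, the equivalence bounds, the limit as $p\rightarrow\infty$, and the Kronecker-product identity; the sample sizes, seeds, and tolerances are arbitrary choices of mine:

```python
# Sketch: numerical spot-checks of the vector-norm properties above.
import numpy as np

rng = np.random.default_rng(1)
x, y = rng.standard_normal(6), rng.standard_normal(6)

# Hölder with p = 3, q = 3/2 (so that 1/p + 1/q = 1)
p, q = 3.0, 1.5
assert abs(x @ y) <= np.linalg.norm(x, p) * np.linalg.norm(y, q) + 1e-12

# Equivalence bounds between the 1-, 2-, and infinity-norms
n = x.size
n1, n2, ninf = (np.linalg.norm(x, p) for p in (1, 2, np.inf))
assert n2 <= n1 <= np.sqrt(n) * n2
assert ninf <= n2 <= np.sqrt(n) * ninf

# lim_{p -> inf} ||x||_p = ||x||_inf (p = 100 is already close)
assert np.isclose(np.linalg.norm(x, 100), ninf, rtol=1e-1)

# ||x kron y||_p = ||x||_p * ||y||_p for p = 1, 2, inf
z = rng.standard_normal(4)
for p in (1, 2, np.inf):
    assert np.isclose(np.linalg.norm(np.kron(x, z), p),
                      np.linalg.norm(x, p) * np.linalg.norm(z, p))
```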

Matrix Norms

Definition. A matrix norm on $\mathbb{R}^{m\times n}$ is a function $\|\cdot\|: \mathbb{R}^{m\times n} \rightarrow \mathbb{R}$ satisfying the following properties:

  1. $\|A\|\geqslant 0$ for all $A \in \mathbb{R}^{m\times n}$, and $\|A\|=0$ implies $A=0$;
  2. $\|A+B\|\leqslant \|A\|+\|B\|$ for all $A,B \in \mathbb{R}^{m\times n}$;
  3. $\|\alpha A\|=|\alpha|\cdot\|A\|$ for all $\alpha \in \mathbb{R}$, $A \in \mathbb{R}^{m\times n}$.

Definition (Consistency). $\|\cdot\|$ is a consistent matrix norm if
$$\|AB\|\leqslant \|A\|\cdot\|B\| \qquad \forall A \in \mathbb{R}^{m\times n},\ B \in \mathbb{R}^{n\times s}.$$

Definition (Vector Consistency). Let $\|x\|$ be a norm on $\mathbb{R}^n$ and $\|A\|$ a norm on $\mathbb{R}^{m\times n}$. $\|A\|$ is consistent with the vector norm if
$$\|Ax\|\leqslant \|A\|\cdot\|x\| \qquad \forall A \in \mathbb{R}^{m\times n},\ x \in \mathbb{R}^{n}.$$
Remark. The three norms above, applied to $AB$, $A$, and $B$, are defined on different matrix spaces and are thus three different matrix norms.

Frobenius norm

Definition. The Frobenius norm of $A \in \mathbb{R}^{m\times n}$ is defined by
$$\|A\|_F = \sqrt{\sum_{i=1}^{m}\sum_{j=1}^{n}|a_{ij}|^2}.$$

Properties

1. $\|A\|_F = \sqrt{\operatorname{trace}(A^TA)} = \sqrt{\sum_{i=1}^{\min\{m,n\}}\sigma_i^2}$.
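A short check of this identity with NumPy (random matrix of an arbitrary shape):

```python
# Sketch: ||A||_F = sqrt(trace(A^T A)) = sqrt(sum of squared singular values).
import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((4, 3))

fro = np.linalg.norm(A, 'fro')
assert np.isclose(fro, np.sqrt(np.trace(A.T @ A)))
assert np.isclose(fro, np.sqrt(np.sum(np.linalg.svd(A, compute_uv=False) ** 2)))
```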

p-norms

Definition. The p-norm of $A \in \mathbb{R}^{m\times n}$ is defined by
$$\|A\|_p = \sup_{x\neq 0} \frac{\|Ax\|_p}{\|x\|_p}.$$
Remark. It is clear that $\|A\|_p$ is the largest p-norm obtained by applying $A$ to a unit p-norm vector:
$$\|A\|_p =\sup_{x\neq 0} \left\| A\left(\frac{x}{\|x\|_p}\right) \right\|_p = \max_{\|x\|_p=1}\|Ax\|_p.$$
Remark. The matrix p-norms are defined in terms of the vector p-norms discussed above.

Theorem. For $p=1,2,\infty$, one has
$$\begin{align*} & \|A\|_1 = \max_{1\leqslant j \leqslant n} \sum_{i=1}^{m} |a_{ij}| \quad \text{(maximum column sum)},\\ & \|A\|_{\infty} = \max_{1\leqslant i \leqslant m} \sum_{j=1}^{n} |a_{ij}| \quad \text{(maximum row sum)},\\ & \|A\|_2 = \sqrt{\lambda_{\max}(A^TA)}. \end{align*}$$
Corollary. $\|A^T\|_1=\|A\|_{\infty}$.
Corollary 2.3.2. If $A \in \mathbb{R}^{m\times n}$, then $\|A\|_2 \leqslant \sqrt{\|A\|_1\cdot\|A\|_{\infty}}$.
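The closed forms above can be checked against np.linalg.norm, which implements exactly these formulas for ord = 1, inf, and 2:

```python
# Sketch: the matrix 1-, infinity-, and 2-norm formulas, checked numerically.
import numpy as np

rng = np.random.default_rng(3)
A = rng.standard_normal((4, 3))

col_sums = np.abs(A).sum(axis=0)   # ||A||_1   = maximum column sum
row_sums = np.abs(A).sum(axis=1)   # ||A||_inf = maximum row sum
assert np.isclose(np.linalg.norm(A, 1), col_sums.max())
assert np.isclose(np.linalg.norm(A, np.inf), row_sums.max())

# ||A||_2 = sqrt(lambda_max(A^T A))
lam_max = np.linalg.eigvalsh(A.T @ A).max()
assert np.isclose(np.linalg.norm(A, 2), np.sqrt(lam_max))

# Corollary 2.3.2: ||A||_2 <= sqrt(||A||_1 * ||A||_inf)
assert np.linalg.norm(A, 2) <= np.sqrt(np.linalg.norm(A, 1) *
                                       np.linalg.norm(A, np.inf))
```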

Properties

(Consistency). The matrix p-norms are consistent, i.e., $\|AB\|_p\leqslant \|A\|_p\cdot\|B\|_p$ for all $A \in \mathbb{R}^{m\times n}$, $B \in \mathbb{R}^{n\times s}$.
(Vector Consistency). The matrix p-norms are consistent with the vector p-norms, i.e., $\|Ax\|_p\leqslant \|A\|_p\cdot\|x\|_p$ for all $A \in \mathbb{R}^{m\times n}$, $x \in \mathbb{R}^{n}$.
Equivalence. The Frobenius norm and the p-norms satisfy the following inequalities:
$$\begin{align*} & \frac{1}{\sqrt{n}}\|A\|_{\infty} \leqslant \|A\|_2 \leqslant \sqrt{m}\, \|A\|_{\infty},\\ & \frac{1}{\sqrt{m}}\|A\|_1 \leqslant \|A\|_2 \leqslant \sqrt{n}\,\|A\|_1,\\ & \|A\|_2 \leqslant \|A\|_F \leqslant \sqrt{\min\{m,n\}}\,\|A\|_2,\\ & \max_{i,j} |a_{ij}| \leqslant \|A\|_2 \leqslant \sqrt{mn}\, \max_{i,j} |a_{ij}|,\\ & \|A(i_1{:}i_2,\, j_1{:}j_2)\|_p \leqslant \|A\|_p, \quad \forall\, 1\leqslant i_1 \leqslant i_2 \leqslant m,\ 1\leqslant j_1 \leqslant j_2 \leqslant n. \end{align*}$$
In particular, $\|A\|_F=\|A\|_2$ when $n=1$ or $m=1$.
Orthogonal Invariance. If $A\in \mathbb{R}^{m\times n}$ and the matrices $Q\in \mathbb{R}^{m\times m}$ and $Z \in \mathbb{R}^{n\times n}$ are orthogonal, then
$$\begin{align*} & \|QAZ\|_2 = \|A\|_2,\\ & \|QAZ\|_F = \|A\|_F. \end{align*}$$
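A NumPy spot-check of a few of the equivalence bounds and of the orthogonal invariance, with Q and Z drawn from QR factorizations of random matrices:

```python
# Sketch: equivalence bounds and orthogonal invariance of matrix norms.
import numpy as np

rng = np.random.default_rng(4)
m, n = 5, 4
A = rng.standard_normal((m, n))

n2 = np.linalg.norm(A, 2)
assert np.linalg.norm(A, np.inf) / np.sqrt(n) <= n2 <= \
       np.sqrt(m) * np.linalg.norm(A, np.inf)
assert n2 <= np.linalg.norm(A, 'fro') <= np.sqrt(min(m, n)) * n2

# Orthogonal Q and Z obtained from QR factorizations
Q, _ = np.linalg.qr(rng.standard_normal((m, m)))
Z, _ = np.linalg.qr(rng.standard_normal((n, n)))
assert np.isclose(np.linalg.norm(Q @ A @ Z, 2), n2)
assert np.isclose(np.linalg.norm(Q @ A @ Z, 'fro'), np.linalg.norm(A, 'fro'))
```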

Other Matrix Norms
Induced Norms (Operator Norms)

Suppose a vector norm $\|\cdot\|_{\alpha}$ on $\mathbb{R}^n$ and a vector norm $\|\cdot\|_{\beta}$ on $\mathbb{R}^m$ are given. The matrix norm $\|\cdot\|_{(\alpha,\beta)}:\mathbb{R}^{m\times n}\rightarrow \mathbb{R}$ induced by $\|\cdot\|_{\alpha}$ and $\|\cdot\|_{\beta}$ is defined by
$$\|A\|_{(\alpha,\beta)} = \sup_{\|x\|_{\alpha}=1} \|Ax\|_{\beta}.$$
Remark.

  1. When $\alpha=\beta=p$, this reduces to the matrix p-norm.
  2. All operator norms possess the consistency property.
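There is no closed form for a general induced norm $\|A\|_{(\alpha,\beta)}$, but sampling unit-$\alpha$-norm vectors gives a lower bound on the supremum. The helper below is an illustrative sketch (the name induced_norm_lower_bound is mine, not a library routine), assuming $\alpha$ and $\beta$ are p-values accepted by np.linalg.norm:

```python
# Sketch: a Monte-Carlo lower bound for the induced (alpha, beta) norm.
import numpy as np

def induced_norm_lower_bound(A, alpha, beta, trials=10000, seed=0):
    """Lower bound for sup_{||x||_alpha = 1} ||Ax||_beta via random sampling."""
    rng = np.random.default_rng(seed)
    best = 0.0
    for _ in range(trials):
        x = rng.standard_normal(A.shape[1])
        x /= np.linalg.norm(x, alpha)   # project onto the unit alpha-sphere
        best = max(best, np.linalg.norm(A @ x, beta))
    return best

A = np.array([[1.0, 2.0], [3.0, 4.0]])
# With alpha = beta = 2 the bound approaches the spectral norm from below.
print(induced_norm_lower_bound(A, 2, 2), np.linalg.norm(A, 2))
```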
Schatten Norms

$$\|A\|_p = \left( \sum_{i=1}^{\min\{m,n\}} \sigma_i^p \right)^{\frac{1}{p}}$$
Remark.

  1. When $p=2$, this reduces to the Frobenius norm (and as $p\rightarrow\infty$ it tends to the 2-norm $\sigma_{\max}$).
  2. All Schatten norms possess the consistency property.
  3. All Schatten norms possess the orthogonal invariance property.
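A small helper for the Schatten p-norm, which is just the vector p-norm of the singular values (schatten_norm is an illustrative name of mine, not a NumPy function):

```python
# Sketch: Schatten p-norm as the vector p-norm of the singular values.
import numpy as np

def schatten_norm(A, p):
    """Schatten p-norm of A; p = 1 gives the nuclear norm."""
    s = np.linalg.svd(A, compute_uv=False)
    return np.linalg.norm(s, p)

A = np.random.default_rng(5).standard_normal((4, 3))
assert np.isclose(schatten_norm(A, 2), np.linalg.norm(A, 'fro'))   # p = 2: Frobenius
assert np.isclose(schatten_norm(A, np.inf), np.linalg.norm(A, 2))  # p -> inf: spectral
```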
“Entry-wise” Norms

$$\|A\|_p= \left( \sum_{i=1}^{m} \sum_{j=1}^{n} |a_{ij}|^p \right)^{\frac{1}{p}}$$
Remark. The special case $p = 2$ is the Frobenius norm, and $p = \infty$ yields the max norm below.

Max Norm

$$\|A\|_{\max}=\max_{i,j}|a_{ij}|$$
Note that the max norm is not consistent (it is not submultiplicative).
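Both the entry-wise p-norm (finite p) and the max norm are direct to write out; entrywise_norm below is an illustrative helper of mine, not a library function:

```python
# Sketch: entry-wise p-norm (finite p) and the max norm.
import numpy as np

def entrywise_norm(A, p):
    """Entry-wise p-norm for finite p; the p -> inf limit is the max norm."""
    return np.sum(np.abs(A) ** p) ** (1 / p)

A = np.array([[1.0, -2.0], [3.0, -4.0]])
assert np.isclose(entrywise_norm(A, 2), np.linalg.norm(A, 'fro'))  # p = 2: Frobenius
assert np.isclose(np.max(np.abs(A)), 4.0)                          # max norm
```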
