Machine Learning
Assignment 1 (Linear Algebra)
Instructor: Beilun Wang    Name: Daiyang Luan    ID: 61518421
Problem 1
Let two vectors $a=(1,2,3)^{\mathrm{T}}$ and $b=(-8,1,2)^{\mathrm{T}}$. Answer the following questions:
(1) Compute the $\ell_{2}$ norm of $a$ and $b$.
(2) Calculate the Euclidean distance between $a$ and $b$.
(3) Are $a$ and $b$ orthogonal?
Solution:
(1) The $\ell_{2}$ norm of $a$ is $\|a\|_2=\sqrt{1+4+9}=\sqrt{14}$ and the $\ell_{2}$ norm of $b$ is $\|b\|_2=\sqrt{64+1+4}=\sqrt{69}$.
(2) Since $a-b=(9,1,1)^{\mathrm{T}}$, the Euclidean distance between $a$ and $b$ is $\sqrt{81+1+1}=\sqrt{83}$.
(3) As $a^{\mathrm{T}}b=1\times (-8)+2\times 1+3\times 2=0$, $a$ and $b$ are orthogonal.
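The three answers above can be checked numerically. The following is a minimal sketch using only the Python standard library; the helper names `dot` and `l2_norm` are illustrative and not part of the assignment.

```python
import math

a = [1, 2, 3]
b = [-8, 1, 2]

def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(ui * vi for ui, vi in zip(u, v))

def l2_norm(u):
    """Euclidean (l2) norm of a vector."""
    return math.sqrt(dot(u, u))

diff = [ai - bi for ai, bi in zip(a, b)]  # a - b

print(dot(a, a), dot(b, b))  # squared norms: 14 69
print(dot(diff, diff))       # squared distance: 83
print(dot(a, b))             # 0, so a and b are orthogonal
```

Working with the squared quantities keeps the comparison exact in integer arithmetic; `l2_norm` takes the square root when a floating-point value is wanted.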
Problem 2
Suppose $A=\left[\begin{array}{ccc}1 & -3 & 3 \\ 3 & -5 & 3 \\ 6 & -6 & 4\end{array}\right]$, answer the following questions:
(1) Calculate $A^{-1}$ and $\operatorname{det}(A)$.
(2) What is the rank of $A$?
(3) What is the trace of $A$?
(4) Calculate $A+A^{T}$.
(5) Is $A$ an orthogonal matrix? State your reason.
(6) Calculate all the eigenvalues $\lambda$ and the corresponding eigenvectors of $A$.
(7) Diagonalize the matrix $A$.
(8) Calculate the $\ell_{2,1}$ norm $\|A\|_{2,1}$ and the Frobenius norm (i.e. $\ell_{2}$ norm) $\|A\|_{F}$.
(9) Calculate the nuclear norm $\|A\|_*$ and the spectral norm $\|A\|_{2}$.
Solution:
(1) $$\left[\begin{array}{cc} A & I\end{array}\right]=\left[\begin{array}{cccccc}1&-3&3&1&0&0\\3&-5&3&0&1&0 \\6&-6&4&0&0&1\end{array}\right]\stackrel{\text{row}}{\longrightarrow}\left[\begin{array}{cccccc}1&0&0&-1/8&-3/8&3/8\\0&1&0&3/8&-7/8&3/8 \\0&0&1&3/4&-3/4&1/4\end{array}\right]=\left[\begin{array}{cc} I & A^{-1}\end{array}\right]$$
Hence, $$A^{-1}=\left[\begin{array}{ccc}-1/8&-3/8&3/8\\3/8&-7/8&3/8 \\3/4&-3/4&1/4\end{array}\right]$$
$$\operatorname{det}(A)=
\left|\begin{array}{ccc}
1 & -3 & 3 \\
3 & -5 & 3\\
6 & -6 & 4
\end{array}\right| =\left|\begin{array}{ccc}
1 & -3 & 3 \\
0 & 4 & -6\\
0 & 0 & 4
\end{array}\right|=16$$
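The row reduction of $[A \mid I]$ and the determinant can be reproduced with exact rational arithmetic. The sketch below is one possible Gauss-Jordan implementation; the function name `inverse_and_det` is an assumption made for illustration, and the pivoting order may differ from the row operations used by hand.

```python
from fractions import Fraction

def inverse_and_det(matrix):
    """Gauss-Jordan elimination on [M | I] with exact fractions.

    Returns (inverse, determinant); raises ValueError if M is singular.
    """
    n = len(matrix)
    # Build the augmented matrix [M | I].
    aug = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(matrix)]
    det = Fraction(1)
    for col in range(n):
        # Find a row at or below the diagonal with a nonzero entry in this column.
        pivot = next((r for r in range(col, n) if aug[r][col] != 0), None)
        if pivot is None:
            raise ValueError("matrix is singular")
        if pivot != col:
            aug[col], aug[pivot] = aug[pivot], aug[col]
            det = -det  # a row swap flips the sign of the determinant
        p = aug[col][col]
        det *= p  # the determinant is the signed product of the pivots
        aug[col] = [x / p for x in aug[col]]
        # Clear this column in every other row.
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    inverse = [row[n:] for row in aug]
    return inverse, det

A = [[1, -3, 3], [3, -5, 3], [6, -6, 4]]
A_inv, det_A = inverse_and_det(A)
print(det_A)  # 16
for row in A_inv:
    print([str(x) for x in row])
```

Using `Fraction` instead of floats means the entries of the inverse come out as exact eighths and quarters, so they can be compared directly with the hand computation.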
(2) As $\operatorname{det}(A)\neq 0$, $A$ is a full-rank matrix. Thus, the rank of $A$ is $3$.
(3) $\operatorname{tr}(A)=1+(-5)+4=0$. That is, the trace of $A$ is $0$.
(4) $A+A^{T}=\left[\begin{array}{ccc}1&-3&3\\3&-5&3\\6&-6&4\end{array}\right]+\left[\begin{array}{ccc}1&3&6\\-3&-5&-6\\3&3&4\end{array}\right]=\left[\begin{array}{ccc}2&0&9\\0&-10&-3\\9&-3&8\end{array}\right]$
(5) $A^{T}A=\left[\begin{array}{ccc}46&-54&36\\-54&70&-48\\36&-48&34\end{array}\right]\neq I$, so $A$ is not an orthogonal matrix.
(6) The characteristic determinant of $A$ is $|\lambda I-A|=\left|\begin{array}{ccc} \lambda-1 & 3 & -3 \\ -3 & \lambda+5 & -3\\ -6 & 6 & \lambda-4 \end{array}\right|=(\lambda+2)^{2}(\lambda-4).$ Thus, the eigenvalues of $A$ are $\lambda_{1}=\lambda_{2}=-2,\lambda_{3}=4.$ Let $A\alpha_{i}=\lambda_{i}\alpha_{i},i=1,2,3$. For $\lambda=-2$ this reduces to $x_1-x_2+x_3=0$, and for $\lambda=4$ to $x_1=x_2$, $x_3=2x_1$, so we may take $\alpha_{1}=\left[\begin{array}{c}1\\1\\0\end{array}\right],\alpha_{2}=\left[\begin{array}{c}0\\1\\1\end{array}\right],\alpha_{3}=\left[\begin{array}{c}1\\1\\2\end{array}\right]$. $\alpha_{i}\ (i=1,2,3)$ are the corresponding eigenvectors.
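(7) Let $P=\left[\begin{array}{ccc}\alpha_1&\alpha_2&\alpha_3\end{array}\right]=\left[\begin{array}{ccc}1&0&1\\1&1&1\\0&1&2\end{array}\right]$ be the matrix whose columns are the eigenvectors from (6). Since $\operatorname{det}(P)=2\neq 0$, $P$ is invertible and $P^{-1}AP=\left[\begin{array}{ccc} -2 & 0 & 0 \\ 0 & -2 & 0\\ 0 &0 & 4 \end{array}\right]$, which is the diagonal matrix corresponding to $A$.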
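A quick way to check the eigenpairs and the diagonalization together is to verify $AP=PD$, where the columns of $P$ are the eigenvectors from (6) and $D$ carries the eigenvalues on its diagonal. The sketch below does this in integer arithmetic; `P`, `D` and `matmul` are names chosen here for illustration.

```python
A = [[1, -3, 3], [3, -5, 3], [6, -6, 4]]
eigenvalues = [-2, -2, 4]
eigenvectors = [[1, 1, 0], [0, 1, 1], [1, 1, 2]]

def matmul(X, Y):
    """Product of two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

# P has the eigenvectors as columns; D has the eigenvalues on the diagonal.
P = [[v[i] for v in eigenvectors] for i in range(3)]
D = [[eigenvalues[i] if i == j else 0 for j in range(3)] for i in range(3)]

AP = matmul(A, P)
PD = matmul(P, D)
print(AP == PD)  # True: column j of AP equals eigenvalue j times column j of P
```

Because $P$ is invertible (its determinant is 2), $AP=PD$ is equivalent to $P^{-1}AP=D$, so no matrix inverse needs to be computed for the check.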
(8) Here the $\ell_{2,1}$ norm $\|A\|_{2,1}$ is taken as the sum of the $\ell_2$ norms of the rows of $A$ (some references sum over the columns instead). The 2-norms of the rows are $\sqrt{19},\sqrt{43},2\sqrt{22}$. Thus, $\|A\|_{2,1}=\sqrt{19}+\sqrt{43}+2\sqrt{22}$.
$\Vert A \Vert_F=\left(\sum\limits_{i=1}^{m}\sum\limits_{j=1}^n a_{ij}^2\right)^{\frac{1}{2}}=\sqrt{1+9+9+9+25+9+36+36+16}=\sqrt{150}=5\sqrt{6}.$
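Both norms in (8) come from the same per-row sums of squares, which the following sketch makes explicit. It assumes the row-wise convention for the $\ell_{2,1}$ norm used in the solution; the variable names are illustrative.

```python
import math

A = [[1, -3, 3], [3, -5, 3], [6, -6, 4]]

row_sq = [sum(x * x for x in row) for row in A]  # squared l2 norm of each row
l21 = sum(math.sqrt(s) for s in row_sq)          # row-wise l2,1 norm
fro = math.sqrt(sum(row_sq))                     # Frobenius norm

print(row_sq)  # [19, 43, 88]
print(l21, fro)
```

Note that $\sqrt{88}=2\sqrt{22}$, matching the third row norm in the solution.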
(9) The nuclear norm $\|A\|_*$ is defined as the sum of all the singular values of $A$, which are the square roots of the eigenvalues of $A^TA$. As calculated above, $A^{T}A=\left[\begin{array}{ccc}46&-54&36\\-54&70&-48\\36&-48&34\end{array}\right]$. Supposing the eigenvalues of $A^TA$ are $\lambda_i, i=1,2,3$, we have $|\lambda I-A^TA|=0$.
That is,
$$\left|\begin{array}{ccc}
\lambda-46&54&-36\\
54&\lambda-70&48\\
-36&48&\lambda-34
\end{array}\right|=0$$
Hence, we have $\lambda^3-150\lambda^2+648\lambda-256=0$.
The solutions of the equation are:
$$\lambda_1=4,\quad \lambda_2=73+9\sqrt{65},\quad \lambda_3=73-9\sqrt{65}$$
Thus, $\|A\|_*=2+\sqrt{73+9\sqrt{65}}+\sqrt{73-9\sqrt{65}}=2+9\sqrt{2}\approx14.727922061357859$, where the closed form follows from $\left(\sqrt{\lambda_2}+\sqrt{\lambda_3}\right)^2=146+2\sqrt{64}=162$.
$\|A\|_2=\sqrt{\lambda_{\max}(A^TA)}=\sqrt{73+9\sqrt{65}}\approx 12.064838156174618$
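The chain from $A^TA$ to the singular values can be sanity-checked without a linear algebra library by confirming that the three claimed roots satisfy the cubic from the text. This is only a verification sketch under that assumption, not a general singular value routine; `char_poly` and `G` are names introduced here.

```python
import math

A = [[1, -3, 3], [3, -5, 3], [6, -6, 4]]
n = len(A)

# Gram matrix A^T A: entry (i, j) is the dot product of columns i and j of A.
G = [[sum(A[k][i] * A[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def char_poly(lam):
    """Cubic from the text: lam^3 - 150 lam^2 + 648 lam - 256."""
    return lam**3 - 150 * lam**2 + 648 * lam - 256

roots = [4.0, 73 + 9 * math.sqrt(65), 73 - 9 * math.sqrt(65)]
residuals = [char_poly(r) for r in roots]  # each should be close to zero

singular_values = [math.sqrt(r) for r in roots]
nuclear = sum(singular_values)   # nuclear norm: sum of singular values
spectral = max(singular_values)  # spectral norm: largest singular value

print(G)
print(residuals)
print(nuclear, spectral)
```

The roots also sum to 150, the trace of $A^TA$, which is the same number that appears under the square root in the Frobenius norm.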
Problem 3
Please give some proper steps to show how you get the answer. Let $x=\left(x_{1}, x_{2}, x_{3}\right)^{T}$ and
$$\left\{\begin{array}{l}
2 x_{1}+2 x_{2}+3 x_{3}=1 \\
x_{1}-x_{2}=-1 \\
-x_{1}+2 x_{2}+x_{3}=2
\end{array}\right.$$
Answer the following questions:
(1) Solve the linear equations.
(2) Write it into matrix form (i.e. $Ax=b$); we will use the same $A$ and $b$ in the following questions.
(3) What is the rank of $A$?
(4) Calculate $A^{-1}$ and $\operatorname{det}(A)$.
(5) Use (4) to solve the linear equations.
(6) Calculate the inner product and outer product of $x$ and $b$ (i.e. $\langle x, b\rangle$ and $x \otimes b$).
(7) Calculate the $\ell_{1}, \ell_{2}$ and $\ell_{\infty}$ norms of $b$.
(8) Suppose $y=\left(y_{1}, y_{2}, y_{3}\right)$, calculate $y^{T} A y$ and $\nabla_{y} y^{T} A y$.
(9) We add one linear equation $-x_{1}+2 x_{2}+x_{3}=2$ to the linear equations above. Write it into matrix form (i.e. $A_{1} x=b$).
(10) What is the rank of $A_{1}$?
(11) Could these linear equations $A_{1} x=b$ be solved? State reasons.
Solution:
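(1) From the second equation, $x_1=x_2-1$. Substituting into the third equation gives $x_2+x_3=1$, and substituting into the first gives $4x_2+3x_3=3$. Eliminating $x_3=1-x_2$ yields $x_2=0$, hence $x_3=1$ and $x_1=-1$. Solving the linear equations, we have: $x_1=-1, x_2=0, x_3=1$.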
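As an independent cross-check of this solution, the system can also be solved with Cramer's rule. This is a swapped-in technique chosen because it is short for a 3x3 system, not necessarily the method used in the solution; `det3` is a helper introduced here.

```python
from fractions import Fraction

def det3(M):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = M
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

A = [[2, 2, 3], [1, -1, 0], [-1, 2, 1]]  # coefficient matrix of the system
b = [1, -1, 2]                           # right-hand side

d = det3(A)
x = []
for col in range(3):
    # Cramer's rule: replace column `col` of A by b and take the ratio of determinants.
    Ai = [[b[r] if c == col else A[r][c] for c in range(3)] for r in range(3)]
    x.append(Fraction(det3(Ai), d))

print(d, [int(v) for v in x])  # -1 [-1, 0, 1]
```

Cramer's rule only applies because the determinant is nonzero; for larger systems elimination is the more practical choice.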
(2) The linear equations can be written in matrix form $Ax=b$ where
$$A=\left[\begin{array}{ccc}
2&2&3 \\
1&-1&0 \\
-1&2&1
\end{array}\right]$$
and
$$b=\left[\begin{array}{c}
1\\-1\\2
\end{array}\right]$$
(3) The rank of $A$ is 3, since $\operatorname{det}(A)=-1\neq 0$ as computed in (4).
(4) $$A^{-1}=\left[\begin{array}{ccc}
1&-4&-3 \\
1&-5&-3 \\
-1&6&4
\end{array}\right]$$
$\operatorname{det}(A)=-1.$
(5) $$x=A^{-1}b=\left[\begin{array}{ccc}
1&-4&-3 \\
1&-5&-3 \\
-1&6&4
\end{array}\right]\left[\begin{array}{c}
1\\-1\\2
\end{array}\right]=\left[\begin{array}{c}
-1\\0\\1
\end{array}\right]$$
That is, $x_1=-1, x_2=0, x_3=1$, which is consistent with the result of question (1).
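The inverse stated in (4) and the product in (5) can both be confirmed with plain integer matrix multiplication, as in this small sketch (`matmul` is an illustrative helper).

```python
A = [[2, 2, 3], [1, -1, 0], [-1, 2, 1]]
A_inv = [[1, -4, -3], [1, -5, -3], [-1, 6, 4]]
b = [1, -1, 2]

def matmul(X, Y):
    """Product of two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

product = matmul(A, A_inv)                                # should be the identity
x = [row[0] for row in matmul(A_inv, [[v] for v in b])]   # A^{-1} b as a flat list

print(product)  # [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
print(x)        # [-1, 0, 1]
```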
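(6) $\langle x,b\rangle=x^Tb=(-1)\times 1+0\times(-1)+1\times 2=1$. The outer product is $x\otimes b=xb^T=\left[\begin{array}{ccc}-1&1&-2\\0&0&0\\1&-1&2\end{array}\right]$. (The vector $\left[\begin{array}{ccc} 1&3&1 \end{array}\right]^T$ is the cross product $x\times b$, not the outer product.)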
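Since "outer product" and "cross product" are easy to confuse, the sketch below computes all three quantities for $x=(-1,0,1)^T$ and $b=(1,-1,2)^T$: the inner product $x^Tb$, the outer product $xb^T$, and the cross product $x\times b$. The variable names are illustrative.

```python
x = [-1, 0, 1]
b = [1, -1, 2]

# Inner product: a scalar.
inner = sum(xi * bi for xi, bi in zip(x, b))

# Outer product x b^T: a 3x3 matrix with entry (i, j) = x[i] * b[j].
outer = [[xi * bj for bj in b] for xi in x]

# Cross product x × b: a vector, defined only in three dimensions.
cross = [x[1] * b[2] - x[2] * b[1],
         x[2] * b[0] - x[0] * b[2],
         x[0] * b[1] - x[1] * b[0]]

print(inner)  # 1
print(outer)  # [[-1, 1, -2], [0, 0, 0], [1, -1, 2]]
print(cross)  # [1, 3, 1]
```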
(7) The $\ell_1$ norm of $b$ is $\|b\|_1=1+1+2=4$.
The $\ell_2$ norm of $b$ is $\|b\|_2=\sqrt{1+1+4}=\sqrt{6}$.
The $\ell_\infty$ norm of $b$ is $\|b\|_\infty=\max(1,1,2)=2$.
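The three vector norms differ only in how the absolute values of the entries are combined, which a few lines of Python make concrete. This is a minimal sketch with illustrative variable names.

```python
import math

b = [1, -1, 2]

l1 = sum(abs(v) for v in b)            # sum of absolute values
l2 = math.sqrt(sum(v * v for v in b))  # square root of the sum of squares
linf = max(abs(v) for v in b)          # largest absolute value

print(l1, linf)  # 4 2
print(l2)        # square root of 6
```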
(8) $$y^TAy=\left[\begin{array}{ccc}
y_1&y_2&y_3
\end{array}\right]\left[\begin{array}{ccc}
2&2&3 \\
1&-1&0 \\
-1&2&1
\end{array}\right]\left[\begin{array}{c}
y_1\\y_2\\y_3
\end{array}\right]=2y_1^2-y_2^2+y_3^2+3y_1y_2+2y_2y_3+2y_1y_3$$
$$\nabla_y y^TAy=(A+A^T)y=\left[\begin{array}{c}
4y_1+3y_2+2y_3\\3y_1-2y_2+2y_3\\2y_1+2y_2+2y_3
\end{array}\right]$$
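The gradient of the quadratic form can be checked numerically by comparing the closed form $(A+A^T)y$ with central finite differences at a test point. The test point and step size below are arbitrary choices, and the function names are illustrative.

```python
A = [[2, 2, 3], [1, -1, 0], [-1, 2, 1]]

def quad(y):
    """Quadratic form y^T A y."""
    return sum(y[i] * A[i][j] * y[j] for i in range(3) for j in range(3))

def grad_formula(y):
    """Closed-form gradient (A + A^T) y."""
    return [sum((A[i][j] + A[j][i]) * y[j] for j in range(3)) for i in range(3)]

def grad_numeric(y, h=1e-6):
    """Central finite differences, one coordinate at a time."""
    g = []
    for i in range(3):
        up = list(y)
        up[i] += h
        dn = list(y)
        dn[i] -= h
        g.append((quad(up) - quad(dn)) / (2 * h))
    return g

y = [1.0, 2.0, 3.0]  # arbitrary test point
print(grad_formula(y))  # [16.0, 5.0, 12.0]
print(grad_numeric(y))  # agrees up to rounding error
```

Because $A$ is not symmetric, the gradient is $(A+A^T)y$ rather than $2Ay$; the two coincide only for symmetric matrices.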
(9) The new linear equations can be written in matrix form $A_1x=b_1$ where
$$A_1=\left[\begin{array}{ccc}
2&2&3 \\
1&-1&0 \\
-1&2&1\\
-1&2&1
\end{array}\right]$$
and
$$b_1=\left[\begin{array}{c}
1\\-1\\2\\2
\end{array}\right]$$
(10) The rank of $A_1$ is 3.
(11) Yes.
The fourth equation duplicates the third, so row-reducing the augmented matrix $[A_1 \mid b_1]$ produces one all-zero row, and $\operatorname{rank}(A_1)=\operatorname{rank}([A_1 \mid b_1])=3$. Equal ranks mean the system is consistent, so a solution exists. Moreover, the rank equals the number of unknowns, so there is exactly one solution: deleting the zero row leaves the original system $Ax=b$ with $\operatorname{det}(A)\neq 0$, whose solution is $x=(-1,0,1)^T$.
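The solvability argument rests on comparing the rank of the coefficient matrix with the rank of the augmented matrix. The following sketch computes both by Gaussian elimination with exact fractions; the `rank` helper and the names `A1`, `b1` and `augmented` are introduced here for illustration.

```python
from fractions import Fraction

def rank(matrix):
    """Rank via Gaussian elimination with exact fractions."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    r = 0  # index of the next pivot row
    for col in range(len(rows[0])):
        # Look for a nonzero entry in this column at or below row r.
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue  # no pivot in this column
        rows[r], rows[pivot] = rows[pivot], rows[r]
        # Eliminate the entries below the pivot.
        for i in range(r + 1, len(rows)):
            factor = rows[i][col] / rows[r][col]
            rows[i] = [x - factor * y for x, y in zip(rows[i], rows[r])]
        r += 1
        if r == len(rows):
            break
    return r

A1 = [[2, 2, 3], [1, -1, 0], [-1, 2, 1], [-1, 2, 1]]
b1 = [1, -1, 2, 2]
augmented = [row + [bi] for row, bi in zip(A1, b1)]

print(rank(A1), rank(augmented))  # 3 3
```

Equal ranks indicate a consistent system, and a rank equal to the number of unknowns (3) indicates that the solution is unique.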
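This article explores the key role that linear algebra plays in machine learning tasks, using concrete examples to explain vector and matrix operations, including computing norms, calculating distances, and checking orthogonality. It also works through the matrix inverse, determinant, rank, trace, transpose, orthogonality, eigenvalues and eigenvectors, and shows how to diagonalize a matrix and compute different types of matrix norms.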