Chapter 9
-
9.1: Proof that the Minkowski distance is a valid distance metric when $p\geq 1$.

Recall the definition of the Minkowski distance: $\text{dist}(\mathbf x_i,\mathbf x_j)=\left(\sum_{u=1}^n|x_{iu}-x_{ju}|^p\right)^{1/p}$. Non-negativity and symmetry follow directly from the definition, so the main task is to prove the triangle inequality.

Proof:

By the Minkowski inequality, $\left(\sum_{u=1}^n|x_{iu}+x_{ju}|^p\right)^{1/p}\leq \left(\sum_{u=1}^n|x_{iu}|^p\right)^{1/p}+\left(\sum_{u=1}^n|x_{ju}|^p\right)^{1/p}$, we have
$$
\begin{array}{ll}
\text{dist}(\mathbf x_i,\mathbf x_j)&=\left(\sum_{u=1}^n|x_{iu}-x_{ju}|^p\right)^{1/p}\\
&=\left(\sum_{u=1}^n|x_{iu}-x_{ku}+x_{ku}-x_{ju}|^p\right)^{1/p}\\
&\leq\left(\sum_{u=1}^n|x_{iu}-x_{ku}|^p\right)^{1/p}+\left(\sum_{u=1}^n|x_{ku}-x_{ju}|^p\right)^{1/p}\\
&=\text{dist}(\mathbf x_i,\mathbf x_k)+\text{dist}(\mathbf x_k,\mathbf x_j)
\end{array}
$$

Here the inequality step applies the Minkowski inequality to the two difference vectors $\mathbf x_i-\mathbf x_k$ and $\mathbf x_k-\mathbf x_j$.

Note: proof of the Minkowski inequality for $p=2$. The claim is:
$$
\left(\sum_{u=1}^n|x_{iu}+x_{ju}|^2\right)^{1/2}\leq \left(\sum_{u=1}^n|x_{iu}|^2\right)^{1/2}+\left(\sum_{u=1}^n|x_{ju}|^2\right)^{1/2}
$$

Both sides are non-negative, so we may square them. After squaring, the left-hand side expands to $\sum_{u=1}^n|x_{iu}|^2+\sum_{u=1}^n|x_{ju}|^2+2\sum_{u=1}^n x_{iu}x_{ju}$, and since $x_{iu}x_{ju}\leq|x_{iu}||x_{ju}|$ it suffices to show:

$$
\left(\sum_{u=1}^n|x_{iu}|^2\right)+\left(\sum_{u=1}^n|x_{ju}|^2\right)+2\sum_{u=1}^n|x_{iu}||x_{ju}|\leq \left(\sum_{u=1}^n|x_{iu}|^2\right)+\left(\sum_{u=1}^n|x_{ju}|^2\right)+2\sqrt{\left(\sum_{u=1}^n|x_{iu}|^2\right)}\sqrt{\left(\sum_{u=1}^n|x_{ju}|^2\right)}
$$

Cancelling the common terms leaves $\sum_{u=1}^n|x_{iu}||x_{ju}|\leq\sqrt{\sum_{u=1}^n|x_{iu}|^2}\sqrt{\sum_{u=1}^n|x_{ju}|^2}$, which is the Cauchy-Schwarz inequality, so the claim holds.
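As a numerical sanity check of the result above (not a substitute for the proof), the following minimal sketch evaluates the triangle inequality on random points for several values of $p\geq 1$, and shows one counterexample for $p=0.5$, where the inequality fails. The function names `minkowski` and `triangle_holds`, the tolerance, and the sample points are illustrative assumptions, not part of the original notes.

```python
import random


def minkowski(x, y, p):
    """Minkowski distance (sum_u |x_u - y_u|^p)^(1/p) between two sequences."""
    return sum(abs(a - b) ** p for a, b in zip(x, y)) ** (1.0 / p)


def triangle_holds(xi, xj, xk, p, tol=1e-9):
    """Check dist(xi, xj) <= dist(xi, xk) + dist(xk, xj), up to a small tolerance."""
    return minkowski(xi, xj, p) <= minkowski(xi, xk, p) + minkowski(xk, xj, p) + tol


def random_point(rng, dim):
    """Hypothetical helper: a random point with coordinates in [-5, 5]."""
    return [rng.uniform(-5.0, 5.0) for _ in range(dim)]


rng = random.Random(0)  # fixed seed so the check is reproducible
all_ok = True
for _ in range(1000):
    xi, xj, xk = random_point(rng, 4), random_point(rng, 4), random_point(rng, 4)
    for p in (1, 1.5, 2, 3):
        all_ok = all_ok and triangle_holds(xi, xj, xk, p)

# For p = 0.5 the inequality can fail: dist(xi, xj) = 4 while the detour via xk costs 1 + 1.
counterexample_holds = triangle_holds([0.0, 0.0], [1.0, 1.0], [1.0, 0.0], 0.5)

print("triangle inequality held for all sampled p >= 1:", all_ok)
print("triangle inequality holds for the p = 0.5 example:", counterexample_holds)
```

The $p=0.5$ case illustrates why the condition $p\geq 1$ in the statement is needed: the Minkowski inequality used in the proof does not hold for $p<1$.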
Chapter 11
-
11.8: $\mathbf x_{k+1}=\arg\min_{\mathbf x}\frac{L}{2}\|\mathbf x-\mathbf z\|_2^2+\lambda \|\mathbf x\|_1$

Expanding the objective on the right-hand side by components gives $\frac{L}2\sum_i(x^i-z^i)^2+\lambda\sum_i|x^i|$ (where $x^i$ denotes the $i$-th component of $\mathbf x$). There are no cross terms between components, so each $x^i$ can be minimized separately. Setting the partial derivative with respect to $x^i$ to zero gives:
When $x^i>0$: $(x^i-z^i)\cdot L+\lambda=0$, so $x^{i}=z^i-\frac{\lambda}{L}$. This solution was derived under the assumption $x^i>0$, so it is valid only when $z^i\gt \frac{\lambda }{L}$.

Similarly, when $x^i\lt 0$: $(x^i-z^i)\cdot L-\lambda=0$, so $x^{i}=z^i+\frac{\lambda}{L}$. This solution was derived under the assumption $x^i<0$, so it is valid only when $z^i\lt -\frac{\lambda }{L}$.
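The two cases above can be sketched in code as a component-wise update. This is a minimal sketch under one added assumption: for the remaining range $|z^i|\leq\frac{\lambda}{L}$, which the derivation above does not cover, it uses the standard soft-thresholding result $x^i=0$. The names `soft_threshold` and `objective` and the sample values are illustrative, not from the original notes.

```python
def soft_threshold(z, lam, L):
    """Component-wise minimizer of (L/2)*||x - z||_2^2 + lam*||x||_1."""
    t = lam / L  # threshold lambda / L from the derivation
    x = []
    for zi in z:
        if zi > t:        # case x^i > 0: x^i = z^i - lambda/L
            x.append(zi - t)
        elif zi < -t:     # case x^i < 0: x^i = z^i + lambda/L
            x.append(zi + t)
        else:             # assumption: standard result for |z^i| <= lambda/L
            x.append(0.0)
    return x


def objective(x, z, lam, L):
    """Value of (L/2)*sum_i (x^i - z^i)^2 + lam*sum_i |x^i|."""
    quadratic = 0.5 * L * sum((a - b) ** 2 for a, b in zip(x, z))
    penalty = lam * sum(abs(a) for a in x)
    return quadratic + penalty


z = [3.0, -2.0, 0.5]
x_star = soft_threshold(z, lam=2.0, L=2.0)  # threshold is 1.0
print(x_star)
print(objective(x_star, z, lam=2.0, L=2.0))
```

With these sample values the threshold is $\lambda/L=1$, so the first two components are shrunk toward zero by 1 and the third, which lies inside $[-1, 1]$, is set to zero.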