Convex Optimization Note 1


License: CC 4.0 BY-SA


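These notes cover parts of *Convex Optimization*: the definition, properties, and operations of convex sets, plus mathematical background such as norms and matrix theory. They also cover the basics of convex functions, including the first- and second-order conditions and the operations that preserve convexity. The basic concepts are illustrated through a number of examples and theorems.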


These are notes on chapters 2 and 3 and Appendix A of *Convex Optimization*.

1. Convex Set

1.1 Affine and convex sets:

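1) $C=V+x_0=\lbrace x+x_0 \mid x \in V\rbrace$: an affine set can be viewed as a subspace $V$ shifted by a point $x_0$. This mirrors the fact that the general solution of $Ax=b$ is the nullspace of $A$ plus a particular solution.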
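The "nullspace plus a particular solution" picture can be checked numerically. The following is a minimal sketch, assuming NumPy and a made-up $2\times 3$ system; the matrix, the right-hand side, and the helper name `affine_point` are illustrative and not from the book.

```python
import numpy as np

# Hypothetical example data: a wide system with infinitely many solutions.
A = np.array([[1.0, 2.0, 3.0],
              [0.0, 1.0, 1.0]])
b = np.array([6.0, 2.0])

# Particular solution x0 (the least-squares solver returns the minimum-norm one).
x0, *_ = np.linalg.lstsq(A, b, rcond=None)

# Basis N of the nullspace: right singular vectors beyond the rank of A.
_, s, Vt = np.linalg.svd(A)
rank = int(np.sum(s > 1e-12))
N = Vt[rank:].T  # one column here, since the nullspace is 1-dimensional

def affine_point(z):
    """Return the point x0 + N z of the affine set {x | Ax = b}."""
    return x0 + N @ z

x = affine_point(np.array([2.5]))
residual = np.linalg.norm(A @ x - b)  # should be zero up to rounding
print(residual)
```

Any choice of `z` gives another solution, which is the sense in which the solution set is the subspace $V=\mathcal{N}(A)$ shifted by $x_0$.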

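2) Affine dimension and relative interior: the affine dimension of a set is the dimension of its affine hull, and the relative interior is its interior relative to that affine hull.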

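3) Convex combinations extend to the infinite case: if $C$ is convex and $p$ is a probability density supported on $C$ ($p(x)\ge 0$, $\int_C p(x)\,dx=1$), then $\int_C p(x)\,x\,dx \in C$ whenever the integral exists.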

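4) Cones: $C$ is a cone if $x \in C$ implies $\theta x \in C$ for all $\theta \ge 0$. A convex cone is a cone that is also convex, i.e. $\theta_1x_1+\theta_2x_2\in C$ for all $x_1,x_2\in C$ and $\theta_1,\theta_2\ge 0$.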

1.2 Some examples

1) Euclidean balls and ellipsoids: the ellipsoid $\lbrace x \mid (x-x_c)^TP^{-1}(x-x_c)\le 1 \rbrace$ with $P\succ 0$ can equivalently be written as $\lbrace x_c+Au \mid \|u\|_2 \leq 1 \rbrace$ with $A=P^{1/2}$.
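The equivalence of the two ellipsoid descriptions can be checked numerically. This is a minimal sketch, assuming NumPy; the center `x_c`, the matrix `P`, and the helper `quad_form` are made-up illustrations. It maps points of the unit ball through $x_c+Au$ and evaluates the quadratic form, which should never exceed 1.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical ellipsoid: center x_c and a symmetric positive definite P.
x_c = np.array([1.0, -2.0])
P = np.array([[4.0, 1.0],
              [1.0, 2.0]])

# Symmetric square root A = P^{1/2} from the eigendecomposition P = Q diag(w) Q^T.
w, Q = np.linalg.eigh(P)
A = Q @ np.diag(np.sqrt(w)) @ Q.T

def quad_form(x):
    """Evaluate (x - x_c)^T P^{-1} (x - x_c)."""
    d = x - x_c
    return d @ np.linalg.solve(P, d)

# Draw u in the unit ball (rescale any sample with norm > 1) and map it to x_c + A u.
u = rng.normal(size=(200, 2))
u /= np.maximum(1.0, np.linalg.norm(u, axis=1, keepdims=True))
values = np.array([quad_form(x_c + A @ ui) for ui in u])
print(values.max())  # never above 1, up to rounding
```

The check works because $(Au)^TP^{-1}(Au)=u^Tu$ when $A=P^{1/2}$.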

2) Norm cones $\{ (x,t) \mid \| x\|\le t \} \subset R^{n+1}$

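3) Polyhedra –> simplex (the convex hull of $k+1$ affinely independent points is a $k$-dimensional simplex)

like the unit simplex $\operatorname{conv}\{0,e_1,...,e_n\}$ and the probability simplex $\operatorname{conv}\{e_1,...,e_n\}$

A polyhedron can be represented in two ways: as a convex hull (of finitely many points, plus a conic hull of finitely many directions when it is unbounded) or as the solution set of finitely many linear inequalities.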

4) Positive semi-definite cone $S^n_{+}$

1.3 Operations that preserve convexity

1) Intersection

Positive semi-definite Cone $S^n_+=\bigcap_{z\ne 0}\{X\in S^n|z^TXz\geq 0\}$

$S = \{x \in R^m \mid |p(t)| \le 1 \text{ for } |t| \le \pi/3\}$, where $p(t)=\sum_{k=1}^mx_k\cos kt$

Every closed convex set can be expressed as the intersection of (possibly infinitely many) halfspaces.

2) Affine functions: both the image and the inverse image of a convex set under an affine function are convex.

Polyhedra

Solution set of linear matrix inequality $A(x)=x_1A_1+...+x_nA_n \preceq B$

Hyperbolic cone $\{x|x^TPx\le (c^Tx)^2,c^Tx\ge 0\}$ is inverse image of $\{(x,t)|x^Tx\le t^2,t\ge 0\}$

3) Perspective functions

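$P(z,t)=z/t$ with $dom\ P=R^n\times R_{++}$. Both the image and the inverse image of a convex set under the perspective function are convex.

Conditional probability: the joint probabilities lie on the probability simplex, and conditioning only divides by a partial sum, so it is a linear-fractional function. The set of conditional probabilities obtained from a convex set of joint probabilities is therefore also convex.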

1.4 Separating and supporting theorem

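1) Separating hyperplane theorem: any two disjoint nonempty convex sets can be separated by a hyperplane.

Proof sketch (for the case where the closest points are attained): take the closest pair of points, one in each set, and the hyperplane through the midpoint of the segment joining them, perpendicular to that segment. This hyperplane must separate the two sets. Arguing by contradiction, if some point of one set lay on the wrong side (the affine function $a^Tx-b$ has the wrong sign there), then moving from the closest point toward it would give a strictly closer point within that convex set; this is exactly what the derivative of the Euclidean distance shows.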
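The midpoint construction in the proof can be sketched for two discs, where the closest points are known in closed form. This is an illustrative example, assuming NumPy; the centers, radii, and the helper `sample_disc` are made up. With closest points $c$ and $d$, the hyperplane is $a^Tx=b$ with $a=d-c$ and $b=(\|d\|_2^2-\|c\|_2^2)/2$.

```python
import numpy as np

rng = np.random.default_rng(1)

def sample_disc(center, radius, n):
    """Sample n points from the disc with the given center and radius."""
    angle = rng.uniform(0.0, 2.0 * np.pi, n)
    r = radius * np.sqrt(rng.uniform(0.0, 1.0, n))
    return center + np.column_stack((r * np.cos(angle), r * np.sin(angle)))

# Two disjoint discs C and D (hypothetical data).
c0, r_c = np.array([0.0, 0.0]), 1.0
d0, r_d = np.array([4.0, 3.0]), 2.0

# Closest points lie on the segment joining the centers.
direction = (d0 - c0) / np.linalg.norm(d0 - c0)
c = c0 + r_c * direction
d = d0 - r_d * direction

# Hyperplane through the midpoint of [c, d], perpendicular to d - c.
a = d - c
b = (d @ d - c @ c) / 2.0

# f(x) = a^T x - b should be negative on C and positive on D.
f_C = sample_disc(c0, r_c, 500) @ a - b
f_D = sample_disc(d0, r_d, 500) @ a - b
print(f_C.max(), f_D.min())
```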

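2) Strict separation:

Two disjoint convex sets cannot always be strictly separated, even when both are closed.

A closed convex set and a point outside it can be strictly separated. It follows that every closed convex set is the intersection of all halfspaces that contain it.

3) Converse: for two convex sets, at least one of which is open, the existence of a separating hyperplane implies that they are disjoint.

4) The supporting hyperplane theorem can be proved by separating $int\ C$ from the boundary point $P$.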

2. Mathematical background (Appendix A)

1) norm
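Vector norm: $P$-quadratic norm: $\|x\|_P=(x^TPx)^{1/2}$ with $P\succ 0$
Matrix norm:
sum-absolute-value / maximum-absolute-value
operator norms $\|X\|_{a,b}=\sup\{\|Xu\|_a \mid \|u\|_b \le 1\}$
Operator norms induced by a single vector norm: $l_2$ gives the spectral norm (the largest singular value), $l_1$ gives the max-column-sum norm, and $l_\infty$ gives the max-row-sum norm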
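The three induced norms can be compared against their closed forms. This is a small sketch, assuming NumPy, whose `np.linalg.norm` accepts `ord=2`, `ord=1`, and `ord=np.inf` for matrices; the matrix `X` is a made-up example.

```python
import numpy as np

X = np.array([[1.0, -2.0, 3.0],
              [-4.0, 5.0, -6.0]])

# Operator norms induced by the l2, l1 and l-infinity vector norms.
spectral = np.linalg.norm(X, 2)
max_col_sum = np.linalg.norm(X, 1)
max_row_sum = np.linalg.norm(X, np.inf)

# The same quantities from their closed forms.
sigma_max = np.linalg.svd(X, compute_uv=False).max()
col_sums = np.abs(X).sum(axis=0)  # [5, 7, 9]
row_sums = np.abs(X).sum(axis=1)  # [6, 15]

print(spectral, sigma_max)
print(max_col_sum, col_sums.max())
print(max_row_sum, row_sums.max())
```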

2) Equivalence of norms: every norm on $R^n$ is equivalent to some quadratic norm, i.e. there is a $P\succ 0$ such that $\|x\|_P\le \|x\|\le \sqrt{n}\|x\|_P$ for all $x$

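3) Dual norm: $\|z\|_*=\sup\{z^Tx \mid \|x\|\le 1\}$, which gives
$z^Tx\le \|x\|\|z\|_*$
The $l_2$ norm is its own dual, $l_1$ and $l_\infty$ are dual to each other, and $l_p$ and $l_q$ are dual when $1/p+1/q=1$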
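The $l_1$/$l_\infty$ duality can be illustrated directly from the definition $\|z\|_*=\sup\{z^Tx \mid \|x\|\le 1\}$. This sketch assumes NumPy and uses the fact that a linear function on the $l_1$ unit ball attains its maximum at a vertex $\pm e_i$; the helper name `dual_of_l1` and the vector `z` are illustrative.

```python
import numpy as np

def dual_of_l1(z):
    """sup of z^T x over the l1 unit ball, evaluated at its vertices +-e_i."""
    n = len(z)
    vertices = np.vstack((np.eye(n), -np.eye(n)))
    return float(np.max(vertices @ z))

z = np.array([3.0, -7.0, 2.0])
value = dual_of_l1(z)  # 7.0, the l-infinity norm of z

# Spot-check the inequality z^T x <= ||x||_1 ||z||_* on random points.
rng = np.random.default_rng(6)
xs = rng.normal(size=(100, 3))
holds = all(z @ x <= np.linalg.norm(x, 1) * value + 1e-12 for x in xs)
print(value, holds)
```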

4) Definitions of closed/open sets and the boundary

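5) Closed function: all sublevel sets $\{x\in dom f \mid f(x)\le \alpha\}$ are closed sets
If $f$ is continuous and $dom f$ is closed, then $f$ is closed
If $f$ is continuous and $dom f$ is open, then $f$ is closed if and only if $f$ tends to $\infty$ along every sequence converging to a boundary point of $dom f$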

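6) $\log\det(X+\Delta X)=\log\det X+\log\det(I+X^{-1/2}\Delta XX^{-1/2})=\log\det X+\sum^n_{i=1}\log(1+\lambda_i)$, where $\lambda_i$ are the eigenvalues of $X^{-1/2}\Delta XX^{-1/2}$. To first order, $\sum_i\log(1+\lambda_i)\approx\sum_i\lambda_i=\mathbf{tr}(X^{-1}\Delta X)$, which gives
$\nabla\log\det(X)=X^{-1}$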
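The eigenvalue expansion of $\log\det(X+\Delta X)$ and the resulting first-order term $\mathbf{tr}(X^{-1}\Delta X)$ can be checked numerically. This is a minimal sketch, assuming NumPy and a randomly generated positive definite `X` with a small symmetric perturbation `dX`; all data here is made up.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 4

# Random positive definite X and a small symmetric perturbation dX.
B = rng.normal(size=(n, n))
X = B @ B.T + n * np.eye(n)
D = rng.normal(size=(n, n))
dX = 1e-3 * (D + D.T)

# X^{-1/2} from the eigendecomposition of X.
w, Q = np.linalg.eigh(X)
X_inv_half = Q @ np.diag(w ** -0.5) @ Q.T

# Eigenvalues lambda_i of X^{-1/2} dX X^{-1/2}.
lam = np.linalg.eigvalsh(X_inv_half @ dX @ X_inv_half)

_, logdet_X = np.linalg.slogdet(X)
_, logdet_pert = np.linalg.slogdet(X + dX)

exact_gap = logdet_pert - logdet_X               # log det(X + dX) - log det X
eig_formula = np.sum(np.log1p(lam))              # sum_i log(1 + lambda_i)
first_order = np.trace(np.linalg.solve(X, dX))   # tr(X^{-1} dX)
print(exact_gap, eig_formula, first_order)
```

The first two numbers agree to rounding error, while the third differs only by second-order terms in $\lambda_i$.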

7) $cond(A)=\|A\|_2\|A^{-1}\|_2=\sigma_{max}(A)/\sigma_{min}(A)$

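8) Pseudo-inverse:
$A^\dagger b$ is the minimum-norm solution of $minimize\ \|Ax-b\|^2_2$
It also gives the minimum of a general quadratic function with singular $P$: for $P\succeq 0$ and $q\in\mathcal{R}(P)$, $\inf_x (x^TPx+2q^Tx+r)=r-q^TP^\dagger q$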
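The least-squares property of the pseudo-inverse can be checked on a small overdetermined system. This is a sketch assuming NumPy (`np.linalg.pinv` and `np.linalg.lstsq`); the data `A` and `b` is made up.

```python
import numpy as np

# Hypothetical overdetermined system (3 equations, 2 unknowns).
A = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0]])
b = np.array([1.0, 2.0, 2.0])

# Least-squares solution via the pseudo-inverse and via the dedicated solver.
x_pinv = np.linalg.pinv(A) @ b
x_lstsq, *_ = np.linalg.lstsq(A, b, rcond=None)

# Optimality: the gradient of ||Ax - b||_2^2 vanishes at the minimizer.
gradient = 2.0 * A.T @ (A @ x_pinv - b)
print(x_pinv, x_lstsq, gradient)
```

Here $A$ has full column rank, so the minimizer is unique; when it is not unique, $A^\dagger b$ picks the solution with the smallest Euclidean norm.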

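9) Schur complement: for $X=\begin{pmatrix} A&B\\ B^T&C \end{pmatrix}$ with $A$ invertible, $S=C-B^TA^{-1}B$
$\det X=\det A \det S$
The inverse of $X$ can be written blockwise in terms of $A^{-1}$ and $S^{-1}$
If $A\succ 0$: $\inf _u \begin{pmatrix} u^T&v^T \end{pmatrix}\begin{pmatrix} A&B\\ B^T&C \end{pmatrix}\begin{pmatrix} u\\v \end{pmatrix}=v^TSv$
$X\succ 0 \Leftrightarrow A\succ 0$ and $S\succ 0$; if $A\succ 0$, then $X\succeq 0 \Leftrightarrow S\succeq 0$
When $A$ is singular, the Schur complement can be expressed with the pseudo-inverse of $A$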
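Two of the Schur complement identities, the determinant factorization and the partial minimization over $u$, can be verified on a small example. This is a sketch assuming NumPy, with made-up blocks `A`, `B`, `C`; the minimizer $u^\star=-A^{-1}Bv$ comes from setting the gradient in $u$ to zero.

```python
import numpy as np

# Hypothetical blocks with A positive definite.
A = np.array([[4.0, 1.0],
              [1.0, 3.0]])
B = np.array([[1.0],
              [2.0]])
C = np.array([[5.0]])

X = np.block([[A, B], [B.T, C]])
S = C - B.T @ np.linalg.solve(A, B)  # Schur complement of A in X

# det X = det A * det S
det_X = np.linalg.det(X)
det_product = np.linalg.det(A) * np.linalg.det(S)

# Minimizing the quadratic form over u for a fixed v gives v^T S v.
v = np.array([1.5])
u_star = -np.linalg.solve(A, B @ v)
z = np.concatenate((u_star, v))
min_value = z @ X @ z
schur_value = v @ S @ v
print(det_X, det_product, min_value, schur_value)
```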

3. Convex function

3.1 basics

1) Restriction to a line: $f$ is convex if and only if its restriction to every line intersecting its domain is convex / extended-value extension

2) 1st order condition: $f(y)\ge f(x)+\nabla f(x)^T(y-x)$

3) 2nd order condition: $\nabla^2f(x)\succeq 0$

4) Sublevel sets of a convex function are convex sets; the converse is not true (e.g. $f(x)=-e^x$ has convex sublevel sets but is not convex).

5) Epigraph is convex $\Leftrightarrow$ function is convex
For differentiable convex $f$, the supporting hyperplane to the epigraph at $(x,f(x))$ has normal $(\nabla f(x),-1)$

6) $f(\theta x+(1-\theta)y)\le \theta f(x)+(1-\theta)f(y)$ extends to $f(\mathbf{E}x)\le \mathbf{E}f(x)$, known as Jensen's inequality
It can be used to prove:
$\sqrt{ab}\le(a+b)/2$ for $a,b\ge 0$
Hölder's inequality $\sum_{i=1}^nx_iy_i\le (\sum_{i=1}^n|x_i|^p)^{1/p}(\sum_{i=1}^n|y_i|^q)^{1/q}$, where $p>1$ and $1/p+1/q=1$
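As a worked instance of the first claim, the arithmetic-geometric mean inequality follows from the convexity of $-\log x$ on $R_{++}$ with $\theta=1/2$ (a short derivation for $a,b>0$; the case where $a$ or $b$ is zero is immediate):

$$
-\log\left(\frac{a+b}{2}\right)\le \frac{-\log a-\log b}{2}=-\log\sqrt{ab}
\quad\Longrightarrow\quad
\sqrt{ab}\le\frac{a+b}{2},
$$

where the implication uses that $\log$ is increasing.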

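7) Examples
quadratic-over-linear $f(x,y)=x^2/y$ with $y>0$
log-sum-exp function $f(x)=\log(e^{x_1}+...+e^{x_n})$: compute the Hessian and apply the Cauchy-Schwarz inequality to show convexity
geometric mean $f(x)=(\prod_{i=1}^nx_i)^{1/n}$ on $R^n_{++}$: again compute the Hessian and apply the Cauchy-Schwarz inequality to show concavity
log-determinant $f(X)=\log\det X$, $dom f=S_{++}^n$: restrict to a line and differentiate to show concavity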
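The Hessian argument for log-sum-exp can be spot-checked numerically. With $z_i=e^{x_i}$ and $p=z/\mathbf{1}^Tz$, the Hessian is $\mathbf{diag}(p)-pp^T$. This sketch assumes NumPy; the helper name `lse_hessian` and the random test points are illustrative.

```python
import numpy as np

def lse_hessian(x):
    """Hessian of log-sum-exp at x: diag(p) - p p^T with p = softmax(x)."""
    z = np.exp(x - np.max(x))  # shifting x by a constant leaves p unchanged
    p = z / z.sum()
    return np.diag(p) - np.outer(p, p)

rng = np.random.default_rng(3)

# Smallest Hessian eigenvalue at 100 random points; it should never be negative
# beyond rounding error (it is exactly 0 in theory, since H @ 1 = 0).
min_eigs = [np.linalg.eigvalsh(lse_hessian(rng.normal(scale=3.0, size=5))).min()
            for _ in range(100)]
print(min(min_eigs))
```

A numerical check like this is not a proof, but it is a quick sanity test of a hand-derived Hessian.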

3.2 operations that preserve convexity

1) Nonnegative weighted sum –> extends to infinite sums and integrals

2) Composition with an affine mapping: $f(Ax+b)$

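3) Pointwise maximum $f(x)=\max(f_1(x),...,f_n(x))$ –> infinite case $g(x)=\sup_{y}f(x,y)$, where $f(x,y)$ is convex in $x$ for each fixed $y$
sum of the $r$ largest components
support function of a set (any set, not necessarily convex) $f(x)=\sup\{x^Ty \mid y\in C\}$
distance to the farthest point of a set $f(x)=\sup_{y\in C}\|x-y\|$
maximum eigenvalue of a symmetric matrix $f(X)=\sup\{y^TXy \mid \|y\|_2=1\}$
operator norm: see section 2 (Mathematical background)
Every convex function with $dom f=R^n$ is the pointwise supremum of all its affine global underestimators (take the supporting hyperplane of the epigraph at each point)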
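Convexity of the maximum eigenvalue, which follows from writing it as a supremum of the linear functions $X\mapsto y^TXy$, can be spot-checked against the defining inequality. This is a minimal sketch assuming NumPy; the helpers `random_symmetric` and `lambda_max` are illustrative names.

```python
import numpy as np

rng = np.random.default_rng(4)

def random_symmetric(n):
    """Random symmetric n x n matrix."""
    M = rng.normal(size=(n, n))
    return (M + M.T) / 2.0

def lambda_max(X):
    """Largest eigenvalue of a symmetric matrix (eigvalsh sorts ascending)."""
    return np.linalg.eigvalsh(X)[-1]

# Check lambda_max(theta X + (1-theta) Y) <= theta lambda_max(X) + (1-theta) lambda_max(Y).
gaps = []
for _ in range(200):
    X, Y = random_symmetric(5), random_symmetric(5)
    theta = rng.uniform()
    lhs = lambda_max(theta * X + (1 - theta) * Y)
    rhs = theta * lambda_max(X) + (1 - theta) * lambda_max(Y)
    gaps.append(rhs - lhs)
print(min(gaps))  # nonnegative up to rounding
```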

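4) Composition $f(x)=h(g(x))$
The scalar, twice-differentiable case follows from the second derivative: $f^{\prime\prime}(x)=h^{\prime\prime}(g(x))g^{\prime}(x)^2+h^\prime(g(x))g^{\prime\prime}(x)$
The general rules do not need twice differentiability: with $h$ convex, it is enough that the extended-value extension $\tilde h$ is nondecreasing and $g$ is convex, or that $\tilde h$ is nonincreasing and $g$ is concave.
Monotonicity of the extended-value extension restricts $dom\ h$: the domain must extend to infinity in one direction (to $-\infty$ when $\tilde h$ is nondecreasing, to $+\infty$ when it is nonincreasing)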

5) Minimization: if $f$ is convex in $(x,y)$ and $C$ is a nonempty convex set, then $g(x)=\inf_{y\in C}f(x,y)$ is convex (provided $g(x)>-\infty$ for all $x$)
distance to a convex set
$g(x)=\inf\{h(y) \mid Ay=x\}$ with $h$ convex

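6) Perspective of a function: $g(x,t)=tf(x/t)$ with $t>0$ is convex when $f$ is convex; this can be proved via the epigraph
$f(x)=x^Tx$ gives $g(x,t)=x^Tx/t$
$f(x)=-\log x$ gives $g(x,t)=-t\log(x/t)=t\log t-t\log x$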

3.3 conjugate function

1) $f^*(y)=\sup_{x\in dom f}(y^Tx-f(x))$

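2) Affine $f(x)=ax+b$: $f^*(y)=-b$ with $dom f^*=\{a\}$
Negative logarithm $f(x)=-\log x$: $f^*(y)=-\log(-y)-1$ with $y<0$
Exponential $f(x)=e^x$: $f^*(y)=y\log y-y$ with $y\ge 0$ (taking $0\log 0=0$)
Negative entropy $f(x)=x\log x$: $f^*(y)=e^{y-1}$ with $y\in R$
Inverse $f(x)=1/x$ on $R_{++}$: $f^*(y)=-2(-y)^{1/2}$ with $y\le 0$
Strictly convex quadratic function: $f(x)=\frac{1}{2}x^TQx$ with $Q\succ 0$ gives $f^*(y)=\frac{1}{2}y^TQ^{-1}y$
Log-determinant $f(X)=\log\det X^{-1}$ on $S^n_{++}$: $f^*(Y)=\log\det(-Y)^{-1}-n$ with $-Y\succ 0$
Indicator function of a set: its conjugate is the support function of that set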
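The conjugate of the strictly convex quadratic can be checked directly from the definition. For $f(x)=\frac{1}{2}x^TQx$, the supremum of $y^Tx-f(x)$ is attained at $x^\star=Q^{-1}y$. This is a minimal sketch assuming NumPy; `Q`, `y`, and the helper `objective` are made-up illustrations.

```python
import numpy as np

rng = np.random.default_rng(5)

# Hypothetical data: Q positive definite, y arbitrary.
Q = np.array([[3.0, 1.0],
              [1.0, 2.0]])
y = np.array([1.0, -2.0])

def objective(x):
    """The function y^T x - f(x) whose supremum over x defines f*(y)."""
    return y @ x - 0.5 * x @ Q @ x

# Stationary point of the concave objective: y - Q x = 0.
x_star = np.linalg.solve(Q, y)
sup_value = objective(x_star)
closed_form = 0.5 * y @ np.linalg.solve(Q, y)  # (1/2) y^T Q^{-1} y

# No random perturbation of x_star should do better.
trial_values = np.array([objective(x_star + rng.normal(size=2)) for _ in range(1000)])
print(sup_value, closed_form, trial_values.max())
```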