M is a positive semidefinite matrix \(\iff\) all principal submatrices \(P\) of \(M\) are PSD
Note: This follows by considering the quadratic form \(x^T Mx\) and restricting \(x\) to the index set that defines the principal submatrix. The converse direction (taking \(P = M\)) is trivial.
M is PSD \(\iff\) all principal minors are non-negative
Write \(M\) as a quadratic form:
\[
x^T M x = \sum_{i,j}M_{ij}x_ix_j
\]
Taking \(x\) to be the standard basis vector \(e_i\) gives \(M_{ii} \ge 0\), hence \(\mathbf{tr}(M) \ge 0\). Next, take \(x\) to be zero everywhere except at positions \(i\) and \(j\); this restricts the quadratic form to the \(2\times 2\) principal submatrix on \(\{i,j\}\), whose determinant (a principal minor) must be non-negative:
\[
\begin{gathered}
M_{ii}M_{jj} - M_{ij}^2 \ge 0 ~~(\text{PSD}) \\
\implies |M_{ij}| \le \sqrt{M_{ii}M_{jj}} \le \frac{M_{ii} + M_{jj}}{2}
\end{gathered}
\]
where the last step is the AM-GM inequality.
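These entrywise bounds are easy to sanity-check numerically. A minimal sketch, using \(M = AA^T\) as an arbitrary PSD matrix (the matrix and sizes are assumptions for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((5, 5))
M = A @ A.T  # any matrix of the form A A^T is PSD

n = M.shape[0]
for i in range(n):
    for j in range(n):
        # |M_ij| <= sqrt(M_ii M_jj) <= (M_ii + M_jj) / 2
        assert abs(M[i, j]) <= np.sqrt(M[i, i] * M[j, j]) + 1e-9
        assert np.sqrt(M[i, i] * M[j, j]) <= (M[i, i] + M[j, j]) / 2 + 1e-9
```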
General definition of a norm:
Matrix norm:
Spectral radius
Two ==equivalent== ways to represent a convex set:
A closed convex set \(S\) is the intersection of all closed halfspaces \(H\) containing it.
Polar
Let \(S \subseteq \mathbb{R}^n\) be a convex set containing the origin. The polar of \(S\) is defined as follows
\[
S^{\circ} := \{y ~|~ y^Tx \le 1, ~\forall x \in S\}
\]
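A small sketch of the definition for \(S\) the Euclidean unit ball (an assumed example): \(\sup_{x \in S} y^Tx = \lVert y \rVert_2\), attained at \(x = y/\lVert y\rVert\), so \(S^\circ = \{y : \lVert y\rVert_2 \le 1\}\) and the unit ball is self-polar.

```python
import numpy as np

rng = np.random.default_rng(1)
y = rng.standard_normal(4)

# The supremum over the unit ball is ||y||_2, attained at x = y/||y||.
x_opt = y / np.linalg.norm(y)
assert abs(x_opt @ y - np.linalg.norm(y)) < 1e-12

# Random points of the ball never beat that supremum (Cauchy-Schwarz).
samples = rng.standard_normal((10000, 4))
samples /= np.maximum(np.linalg.norm(samples, axis=1, keepdims=True), 1.0)
assert np.max(samples @ y) <= np.linalg.norm(y) + 1e-12
```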
Note
Properties of the polar:
Polar duality of convex cones
Notes
Conjugation of convex functions
Let \(f: \mathbb{R}^n \to \mathbb{R}\cup\{+\infty\}\) be a convex function. The ==conjugate== of \(f\) is
\[
f^*(y) := \sup\limits_{x}(y^Tx - f(x))
\]
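A quick numerical sketch, using the standard self-conjugate example \(f(x) = x^2/2\), whose conjugate is \(f^*(y) = y^2/2\); the sup is approximated on a fine grid (grid bounds are an assumption that covers the maximizer \(x = y\)):

```python
import numpy as np

# f(x) = x^2/2 is self-conjugate: f*(y) = sup_x (yx - x^2/2) = y^2/2.
x = np.linspace(-10.0, 10.0, 200001)

def conjugate(y):
    # approximate the supremum over the grid
    return np.max(y * x - 0.5 * x**2)

for y in (-2.0, 0.5, 3.0):
    assert abs(conjugate(y) - 0.5 * y**2) < 1e-6
```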
Properties of the conjugate
Affine functions are convex: \(f(x) = a^T x+b\)
Affine functions are both convex and concave.
Every _norm_ is convex.
Proof: let \(\pi(x)\) be a norm; then for \(\lambda \in [0,1]\), \(\pi(\lambda x + (1-\lambda)y) \le \pi(\lambda x) + \pi((1-\lambda)y) = \lambda\pi(x) + (1-\lambda)\pi(y)\), by the triangle inequality and absolute homogeneity.
\(f\) is convex \(\iff\) epi(\(f\)) is convex
A convex function \(f\) is called closed if its epigraph is a closed set.
Corollary: on the convex hull of points \(x_1,\dots,x_m\), a convex function is bounded above by \(\max_i f(x_i)\).
pf: \(f(x) = f(\sum\alpha_i x_i) \le \sum \alpha_i f(x_i) \le \max\limits_{i} f(x_i)\)
Note: the convexity of level sets does not characterize convex functions, but quasiconvex functions.
Some convex sets
Ellipsoid (\(\{x ~|~ (x-a)^T Q (x-a) \le r^2\}\)) is convex and closed
pf: \(x^TQy := \langle x, y \rangle\) satisfies the three defining properties of an inner product
- bilinearity
- symmetry
- positivity
The three properties above hold \(\iff\) Q is PSD
\(\epsilon\)-neighborhood:
Necessary and Sufficient Convexity Condition for smooth function:
subgradient property is characteristic of convex functions:
Examples
For a convex function, every local minimum is a global minimum.
First-order necessary and sufficient condition (convex functions)
\(x^* \in \mathbf{dom}f\) is a minimizer \(\iff\) \(0 \in \partial f(x^*)\)
A differentiable function \(f\) is \(\mu\)-strongly convex if
\[
f(y) \ge f(x) + \nabla f(x)^T(y-x) + \frac{\mu}{2} \|y-x\|^2
\]
Note
Note: Intuitively speaking, strong convexity means that there exists a quadratic lower bound on the growth of the function.
Equivalent definition
\[
\begin{align}
&(i)~f(y)\ge f(x)+\nabla f(x)^T(y-x)+\frac{\mu}{2}\lVert y-x \rVert^2,~\forall x, y.\\
&(ii)~g(x) = f(x)-\frac{\mu}{2}\lVert x \rVert^2~\text{is convex}.\\
&(iii)~\langle \nabla f(x) - \nabla f(y),x-y \rangle \ge \mu \lVert x-y\rVert^2,~\forall x, y.\\
&(iv)~f(\alpha x+ (1-\alpha) y) \le \alpha f(x) + (1-\alpha) f(y) - \frac{\alpha (1-\alpha)\mu}{2}\lVert x-y\rVert^2,~\forall x, y,~\alpha \in [0,1].\\
&(v)~\nabla^2 f(x) \succeq \mu \boldsymbol{I},~\forall x.
\end{align}
\]
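For a quadratic \(f(x) = \frac{1}{2}x^TAx\) with \(A \succ 0\), the modulus is \(\mu = \lambda_{\min}(A)\). A minimal numerical sketch of (i) and (iii) under that assumption (the matrix and dimensions are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(2)
B = rng.standard_normal((4, 4))
A = B @ B.T + 0.5 * np.eye(4)      # symmetric positive definite
mu = np.linalg.eigvalsh(A).min()   # strong convexity modulus of f

f = lambda x: 0.5 * x @ A @ x
grad = lambda x: A @ x

for _ in range(100):
    x, y = rng.standard_normal(4), rng.standard_normal(4)
    d2 = np.sum((y - x) ** 2)
    # (i) quadratic lower bound on the growth of f
    assert f(y) >= f(x) + grad(x) @ (y - x) + 0.5 * mu * d2 - 1e-9
    # (iii) strong monotonicity of the gradient
    assert (grad(x) - grad(y)) @ (x - y) >= mu * d2 - 1e-9
```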
Consider an optimization problem in standard form (not necessarily convex)
\[
\begin{array}{ll}
\underset{x}{\text{minimize}} & f_0(x) \\
\text{subject to} & f_i(x) \le 0, ~i=1,\cdots,m \\
~ & h_i(x) = 0, ~i=1,\cdots,p
\end{array}
\]
The Lagrangian is
\[
L(x,\boldsymbol{\lambda},\boldsymbol{\mu}) = f_0(x) + \sum_{i=1}^m \lambda_i f_i(x) + \sum_{i=1}^p \mu_i h_i(x)
\]
The Lagrange dual function is defined as
\[
g(\lambda, \mu) = \inf_{x} L(x,\lambda,\mu)
\]
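As a concrete illustrative example (my own, not from the post): minimize \(x^2\) subject to \(1 - x \le 0\). The infimum of \(L(x,\lambda) = x^2 + \lambda(1-x)\) over \(x\) is attained at \(x = \lambda/2\), giving \(g(\lambda) = \lambda - \lambda^2/4\). A grid-based check:

```python
import numpy as np

# Toy problem:  minimize x^2  subject to  1 - x <= 0   (p* = 1 at x* = 1)
# Lagrangian: L(x, lam) = x^2 + lam * (1 - x), minimized at x = lam/2,
# so the dual function is g(lam) = lam - lam^2 / 4.
x = np.linspace(-5.0, 5.0, 100001)

def g_numeric(lam):
    # approximate inf_x L(x, lam) on the grid
    return np.min(x**2 + lam * (1.0 - x))

for lam in (0.0, 1.0, 2.0, 4.0):
    assert abs(g_numeric(lam) - (lam - lam**2 / 4.0)) < 1e-6
```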
Lagrange dual problem
\[
\begin{array}{ll}
\underset{\lambda, \mu}{\text{maximize}} & g(\lambda, \mu) \\
\text{subject to} & \boldsymbol{\lambda} \succeq \mathbf{0}
\end{array}
\]
Weak duality
\[
d^* \le p^*
\]
Strong duality
\[
d^* = p^*
\]
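A sketch checking strong duality on a toy convex problem (an assumed example): minimize \(x^2\) subject to \(x \ge 1\), so \(p^* = 1\); its dual function works out to \(g(\lambda) = \lambda - \lambda^2/4\). Slater's condition holds, so \(d^* = p^*\) is expected.

```python
import numpy as np

# Dual function of: minimize x^2 s.t. x >= 1   is   g(lam) = lam - lam^2/4.
lam = np.linspace(0.0, 10.0, 100001)
g = lam - lam**2 / 4.0

d_star = g.max()   # attained at lam = 2
p_star = 1.0
assert abs(d_star - p_star) < 1e-8   # strong duality holds
```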
Complementary slackness
KKT conditions
Tangent cone
Let \(M\) be a (nonempty) convex set and \(x^* \in M\); the tangent cone of \(M\) at \(x^*\) is the cone
\[
\begin{split}
T_M(x^*) &= \{h \in \mathbb{R}^n ~|~ x^* + th \in M ~\text{for some}~ t > 0 \} \\
&= \{\lambda (y - x^*) ~|~ y \in M, ~\lambda \ge 0\}
\end{split}
\]
Note:
e.g. polyhedron
\[
M = \{x | Ax \le b\} = \{x | a_i^Tx \le b_i, \; i = 1,\dots,m\}
\]
the tangent cone at \(x^*\) is
\[
T_M(x^*) = \{h~|~a_i^T h \le 0, ~\forall i, ~a_i^T x^* = b_i\}
\]
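A small sanity check of this formula, using the nonnegative orthant as the polyhedron (an assumed example with \(a_i = -e_i\), \(b_i = 0\)): a direction \(h\) lies in the tangent cone exactly when a small step along \(h\) stays feasible.

```python
import numpy as np

# M = {x : -x <= 0} (nonnegative orthant).  At x* = (0, 1) only the
# first constraint is active, so T_M(x*) = {h : h[0] >= 0}.
x_star = np.array([0.0, 1.0])

def in_tangent_cone(h):
    return bool(h[0] >= 0.0)

def feasible(x):
    return bool(np.all(x >= 0.0))

t = 1e-3  # small step along h
for h in (np.array([1.0, -5.0]), np.array([0.0, -1.0]), np.array([-1.0, 0.0])):
    assert in_tangent_cone(h) == feasible(x_star + t * h)
```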
Normal cone: the polar cone of the tangent cone
\[
N_M(x^*) = \{g \in \mathbb{R}^n ~|~ \langle g, y-x^*\rangle \le 0, ~\forall y \in M\}
\]
Note:
~ | Stepsize Rule | Convergence Rate | Iteration Complexity |
---|---|---|---|
Gradient descent | |||
strongly convex & smooth | \(\eta_t = \frac{2}{\mu + L}\) | \(O\left(\left(\frac{\kappa -1}{\kappa +1}\right)^t\right)\) | \(O\left(\frac{\log\frac{1}{\epsilon}}{\log\frac{\kappa+1}{\kappa-1}}\right)\) |
convex & smooth | \(\eta_t = \frac{1}{L}\) | \(O(\frac{1}{t})\) | \(O(\frac{1}{\epsilon})\) |
Frank-Wolfe | |||
(strongly) convex & smooth | \(\eta_t = \frac{1}{t}\) | \(O(\frac{1}{t})\) | \(O(\frac{1}{\epsilon})\) |
Projected GD | |||
convex & smooth | \(\eta_t = \frac{1}{L}\) | \(O(\frac{1}{t})\) | \(O(\frac{1}{\epsilon})\) |
strongly convex & smooth | \(\eta_t = \frac{1}{L}\) | \(O\left((1-\frac{1}{\kappa})^t\right)\) | \(O(\kappa\log\frac{1}{\epsilon})\) |
Subgradient method | |||
convex & Lipschitz | \(\eta_t = \frac{1}{\sqrt{t}}\) | \(O(\frac{1}{\sqrt{t}})\) | \(O(\frac{1}{\epsilon^2})\) |
strongly convex & Lipschitz | \(\eta_t = \frac{1}{t}\) | \(O\left(\frac{1}{t}\right)\) | \(O(\frac{1}{\epsilon})\) |
Proximal GD | |||
convex & smooth (w.r.t. \(f\)) | \(\eta_t = \frac{1}{L}\) | \(O(\frac{1}{t})\) | \(O(\frac{1}{\epsilon})\) |
strongly convex & smooth (w.r.t. \(f\)) | \(\eta_t = \frac{1}{L}\) | \(O\left((1-\frac{1}{\kappa})^t\right)\) | \(O(\kappa\log\frac{1}{\epsilon})\) |
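The first gradient-descent row can be sketched numerically. A minimal run on a toy quadratic of my own choosing (\(\mu = 1\), \(L = 10\), minimizer at the origin): with \(\eta = \frac{2}{\mu+L}\), the error contracts by exactly \(\frac{\kappa-1}{\kappa+1}\) per step.

```python
import numpy as np

# f(x) = 1/2 x^T A x with A = diag(1, 10): mu = 1, L = 10, kappa = 10.
A = np.diag([1.0, 10.0])
mu, L = 1.0, 10.0
eta = 2.0 / (mu + L)
rho = (L / mu - 1.0) / (L / mu + 1.0)   # contraction factor (kappa-1)/(kappa+1)

x = np.array([1.0, 1.0])
for _ in range(50):
    prev = np.linalg.norm(x)
    x = x - eta * (A @ x)               # gradient step
    assert np.linalg.norm(x) <= rho * prev + 1e-12
```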
Original post: https://www.cnblogs.com/yychi/p/9398439.html