Gowal S., Dvijotham K., Stanforth R., Bunel R., Qin C., Uesato J., Arandjelovic R., Mann T. & Kohli P. Scalable verified training for provably robust image classification. In IEEE International Conference on Computer Vision (ICCV), 2019.
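Overview
A method for improving the robustness of a network in a way that can be verified.
Main content
Suppose the \(k\)-th layer is written as: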
\[z_k = h_k(z_{k-1}) = \sigma_k (W_k z_{k-1} + b_k), \: k=1,2,\cdots, K.
\]
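For an ordinary classification network, a correct prediction of the label \(y\) requires:
\[e_y^Tz_K \ge e_j^T z_K, \: \forall j \not= y,
\]
where \(e_j\) is the vector whose \(j\)-th entry is 1 and whose other entries are all 0.
Correspondingly, if robustness is taken into account, we need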
\[\forall z_0 \in \mathcal{X}(x_0):= \{x| \|x-x_0\|_{\infty} < \epsilon \},
\]
\[[z_K]_y = e_y^Tz_K \ge e_j^T z_K = [z_K]_j, \: \forall j \not= y,
\]
to hold.
Now, suppose we already know
\[\underline{z}_{k-1} \le z_{k-1} \le \overline{z}_{k-1},
\]
and let
\[\begin{array}{l}
\mu_{k-1} = \frac{\overline{z}_{k-1} + \underline{z}_{k-1}}{2} \\
r_{k-1} = \frac{\overline{z}_{k-1} - \underline{z}_{k-1}}{2} \\
\mu_k = W_k \mu_{k-1} + b_k \\
r_k = |W_k|r_{k-1}
\end{array}
\]
Then
\[\mu_k - r_k \le W_k z_{k-1} + b_k \le \mu_k + r_k,
\]
where \(|\cdot|\) denotes the element-wise absolute value.
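The affine step above can be sketched in NumPy. This is a minimal illustration rather than the paper's implementation: the function name `affine_bounds`, the random weights, and the value of `eps` are assumptions made only for this example.

```python
import numpy as np

def affine_bounds(W, b, lower, upper):
    """Propagate the box [lower, upper] through z -> W z + b."""
    mu = (upper + lower) / 2.0      # centre of the input box
    r = (upper - lower) / 2.0       # radius of the input box (non-negative)
    mu_out = W @ mu + b             # centre of the output box
    r_out = np.abs(W) @ r           # radius of the output box
    return mu_out - r_out, mu_out + r_out

# Hypothetical layer and input, chosen only for illustration.
rng = np.random.default_rng(0)
W = rng.normal(size=(3, 4))
b = rng.normal(size=3)
x0 = rng.normal(size=4)
eps = 0.1

lo, hi = affine_bounds(W, b, x0 - eps, x0 + eps)

# Empirical sanity check: sampled points in the box never leave [lo, hi].
samples = x0 + rng.uniform(-eps, eps, size=(1000, 4))
outs = samples @ W.T + b
inside = bool(np.all(outs >= lo - 1e-12) and np.all(outs <= hi + 1e-12))
```

The sampling check does not prove the bound, but it is a cheap way to catch a sign or transpose mistake in the propagation.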
Note:
\[\max_{x \in [\underline{x}, \overline{x}]} \quad a \cdot x
\Rightarrow x =
\left \{
\begin{array}{ll}
\overline{x}, & a \ge 0 \\
\underline{x}, & a < 0
\end{array}
\right .
\Rightarrow
a \cdot x = \frac{a(\overline{x}+\underline{x})}{2} + \frac{|a|(\overline{x}-\underline{x})}{2}.
\]
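As a quick check of this identity, take the illustrative values \(a = -2\) and \(x \in [1, 3]\) (chosen here only for the example). Since \(a < 0\), the maximum is attained at \(x = \underline{x} = 1\), and both sides agree:
\[\max_{x \in [1, 3]} (-2) \cdot x = (-2) \cdot 1 = -2 = \frac{(-2)(3 + 1)}{2} + \frac{|-2|(3 - 1)}{2} = -4 + 2.
\]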
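If the activation function \(\sigma_k\) is monotonically non-decreasing, it follows that
\[\begin{array}{l}
\underline{z}_k = \sigma_k(\mu_k - r_k), \: \overline{z}_k = \sigma_k (\mu_k + r_k) \\
\underline{z}_k \le z_k \le \overline{z}_k.
\end{array}
\]
Proceeding layer by layer in the same way, we obtain: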
\[\underline{z}_K \le z_K \le \overline{z}_K.
\]
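The layer-by-layer recursion can be sketched as follows. This is a minimal NumPy illustration under stated assumptions: hidden layers use ReLU and the output layer uses the identity (both monotone), and the names `ibp_forward` and `forward` as well as the tiny random network are hypothetical.

```python
import numpy as np

def ibp_forward(weights, biases, x0, eps):
    """Element-wise lower/upper bounds on z_K over the L-infinity ball around x0."""
    lower, upper = x0 - eps, x0 + eps
    last = len(weights) - 1
    for k, (W, b) in enumerate(zip(weights, biases)):
        mu = (upper + lower) / 2.0
        r = (upper - lower) / 2.0
        mu, r = W @ mu + b, np.abs(W) @ r
        lower, upper = mu - r, mu + r
        if k < last:  # assumption: ReLU on hidden layers, identity on the output
            lower, upper = np.maximum(lower, 0.0), np.maximum(upper, 0.0)
    return lower, upper

def forward(weights, biases, x):
    """Ordinary forward pass of the same network."""
    z = x
    last = len(weights) - 1
    for k, (W, b) in enumerate(zip(weights, biases)):
        z = W @ z + b
        if k < last:
            z = np.maximum(z, 0.0)
    return z

# Hypothetical 4 -> 8 -> 3 network with random parameters.
rng = np.random.default_rng(0)
sizes = [4, 8, 3]
weights = [rng.normal(size=(m, n)) for n, m in zip(sizes[:-1], sizes[1:])]
biases = [rng.normal(size=m) for m in sizes[1:]]
x0, eps = rng.normal(size=4), 0.05

lower_K, upper_K = ibp_forward(weights, biases, x0, eps)

# Sampled perturbations should produce logits inside the computed box.
points = x0 + rng.uniform(-eps, eps, size=(500, 4))
logits = np.array([forward(weights, biases, p) for p in points])
all_inside = bool(np.all(logits >= lower_K - 1e-9) and np.all(logits <= upper_K + 1e-9))
```

The bounds are sound but generally loose: the box at each layer ignores correlations between coordinates, so the gap between `lower_K` and `upper_K` tends to grow with depth.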
It follows that:
\[[z_K]_j - [z_K]_y \le [\overline{z}_K]_j - [\underline{z}_K]_y,
\]
Clearly, if the following condition
\[[\overline{z}_K]_j - [\underline{z}_K]_y \le 0, \: \forall j\not= y
\]
holds, then the model is robust at \(x_0\) for perturbations of size \(\epsilon\).
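The sufficient condition above amounts to a one-line check on the output bounds. The sketch below assumes the bounds have already been computed; the function name `is_certified` and the numeric bounds are hypothetical values picked for illustration.

```python
import numpy as np

def is_certified(lower_K, upper_K, y):
    """True if every rival logit's upper bound is at most the true logit's lower bound."""
    margins = upper_K - lower_K[y]      # worst-case [z_K]_j - [z_K]_y for each j
    margins = np.delete(margins, y)     # the condition only concerns j != y
    return bool(np.all(margins <= 0.0))

# Hypothetical output bounds for a 3-class problem.
lower_K = np.array([2.0, -1.0, 0.5])
upper_K = np.array([3.0, 0.5, 1.5])

certified = is_certified(lower_K, upper_K, y=0)      # rivals reach at most 1.5 < 2.0
not_certified = is_certified(lower_K, upper_K, y=2)  # class 0 can reach 3.0 > 0.5
```

A `False` result does not mean an adversarial example exists; it only means these interval bounds are too loose to rule one out.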
Hence define
\[[z_{*}]_j =
\left \{
\begin{array}{ll}
{[\overline{z}_K]_j}, & j \not = y \\
{[\underline{z}_K]_j}, & j = y
\end{array}
\right.
\]
and train with the following loss:
\[\mathcal{L} = \kappa \cdot \ell (z_K, y) + (1 - \kappa) \cdot \ell(z_*, y),
\]
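where \(\ell\) is the softmax cross-entropy loss and \(\kappa \in [0, 1]\) weights the nominal term against the worst-case term.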
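The combined loss can be sketched in NumPy for a single example. This is an illustration under assumptions, not the authors' training code: the helper names (`softmax_cross_entropy`, `worst_case_logits`, `ibp_loss`), the logits, the bounds, and the value of `kappa` are all made up for the example.

```python
import numpy as np

def softmax_cross_entropy(logits, y):
    """Cross-entropy of softmax(logits) against label y, computed stably."""
    shifted = logits - np.max(logits)
    return float(np.log(np.sum(np.exp(shifted))) - shifted[y])

def worst_case_logits(lower_K, upper_K, y):
    """z_*: upper bound for every rival class, lower bound for the true class."""
    z_star = upper_K.copy()
    z_star[y] = lower_K[y]
    return z_star

def ibp_loss(z_K, lower_K, upper_K, y, kappa):
    """kappa * nominal loss + (1 - kappa) * worst-case loss."""
    z_star = worst_case_logits(lower_K, upper_K, y)
    nominal = softmax_cross_entropy(z_K, y)
    worst = softmax_cross_entropy(z_star, y)
    return kappa * nominal + (1.0 - kappa) * worst

# Hypothetical logits and bounds, with lower_K <= z_K <= upper_K.
z_K = np.array([2.5, 0.0, 1.0])
lower_K = np.array([2.0, -1.0, 0.5])
upper_K = np.array([3.0, 0.5, 1.5])
y = 0

z_star = worst_case_logits(lower_K, upper_K, y)
nominal = softmax_cross_entropy(z_K, y)
worst = softmax_cross_entropy(z_star, y)
loss = ibp_loss(z_K, lower_K, upper_K, y, kappa=0.5)
```

With `kappa = 1` the loss reduces to ordinary training, and with `kappa = 0` only the worst-case term remains.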
Note:
\[\max_{\|x - x_0\|_{\infty}<\epsilon} \ell (z_K(x), y) \le \ell(z_*, y).
\]
This is because
\[\begin{array}{ll}
\ell(z_K(x), y)
&= -\log \frac{\exp (z_y)}{\sum_j \exp(z_j)} \\
&= -\log \frac{1}{\sum_j \exp(z_j-z_y)} \\
&\le -\log \frac{1}{\sum_{j\not= y} \exp(\overline{z}_j-\underline{z}_y) + 1} \\
&= -\log \frac{\exp(\underline{z}_y)}{\sum_{j\not =y} \exp(\overline{z}_j) + \exp(\underline{z}_y)} \\
&= \ell(z_*, y).
\end{array}
\]
Hence what we optimize is an upper bound on the worst-case loss.
Code
Original code
Interval Bound Propagation (IBP)
Original post: https://www.cnblogs.com/MTandHJ/p/14698276.html