By convention a vector is a column vector, defined as:
$$\beta=(\beta_1,\dots,\beta_k)^\intercal$$
<ol>
<li>For a scalar function $f(\beta)$, the derivative with respect to the column vector $\beta$ is defined as:</li>
$$\frac{\partial f(\beta)}{\partial \beta}=
\begin{pmatrix}
\frac{\partial f(\beta)}{\partial \beta_1} \\
\vdots \\
\frac{\partial f(\beta)}{\partial \beta_k}
\end{pmatrix}$$
<li>For a scalar function $f(\beta)$, the derivative with respect to the row vector $\beta^\intercal$ is defined as:</li>
$$\frac{\partial f(\beta)}{\partial \beta^\intercal}=\left(\frac{\partial f(\beta)}{\partial \beta_1},\dots,\frac{\partial f(\beta)}{\partial \beta_k}\right)$$
so that
$$\frac{\partial f(\beta)}{\partial \beta^\intercal}= \left(\frac{\partial f(\beta)}{\partial \beta}\right)^\intercal$$
<li>For a column-vector function of $\beta$,</li>
$$g(\beta)=
\begin{pmatrix}
g_1(\beta) \\
\vdots \\
g_n(\beta)
\end{pmatrix}$$
the derivative with respect to the row vector $\beta^\intercal$ is defined as:
$$
\frac{\partial g(\beta)}{\partial \beta^\intercal}=
\begin{pmatrix}
\frac{\partial g_1(\beta)}{\partial \beta_1} & \cdots & \frac{\partial g_1(\beta)}{\partial \beta_k} \\
\vdots & \ddots & \vdots \\
\frac{\partial g_n(\beta)}{\partial \beta_1} & \cdots & \frac{\partial g_n(\beta)}{\partial \beta_k} \\
\end{pmatrix}
$$
That is, the $(i,j)$ entry is:
$$\left(\frac{\partial g(\beta)}{\partial \beta^\intercal}\right)_{i,j}=\frac{\partial g_i(\beta)}{\partial \beta_j}$$
In the same way, the derivative of the row-vector function $g(\beta)^\intercal$ with respect to the column vector $\beta$ can be defined, which gives:
$$
\frac{\partial g(\beta)^\intercal}{\partial \beta}=(\frac{\partial g(\beta)}{\partial \beta^\intercal})^\intercal
$$
<li>The Hessian matrix of a scalar function $f(\beta)$ is defined as:</li>
$$
\frac{\partial^2 f(\beta)}{\partial \beta^\intercal\partial \beta}=
\frac{\partial^2 f(\beta)}{\partial \beta\partial \beta^\intercal}=
\begin{pmatrix}
\frac{\partial^2 f(\beta)}{\partial \beta_1\partial \beta_1} & \cdots & \frac{\partial^2 f(\beta)}{\partial \beta_1\partial \beta_k}\\
\vdots & \ddots & \vdots \\
\frac{\partial^2 f(\beta)}{\partial \beta_k\partial \beta_1} & \cdots & \frac{\partial^2 f(\beta)}{\partial \beta_k\partial \beta_k}\\
\end{pmatrix}
$$
That is, the $(i,j)$ entry is:
$$\left(\frac{\partial^2 f(\beta)}{\partial \beta^\intercal\partial \beta}\right)_{i,j}=\frac{\partial^2 f(\beta)}{\partial \beta_i\partial \beta_j}
$$
<li>Suppose $c$ and $\beta$ are both $k \times 1$. From the definitions above:</li>
$$
\frac{\partial (c^\intercal\beta)}{\partial \beta}=c, \quad \frac{\partial (\beta^\intercal c)}{\partial \beta}=c
$$
<li>Suppose $A$ is $n \times k$ and $\beta$ is $k \times 1$, so that $A\beta$ is $n \times 1$. From the definitions above:</li>
$$
\frac{\partial (A\beta)}{\partial \beta^\intercal}=A, \quad \frac{\partial (\beta^\intercal A^\intercal)}{\partial \beta}=A^\intercal
$$
<li>Suppose the scalar function $f(\beta)=\beta^\intercal V \beta$ for a $k \times k$ matrix $V$. From the definitions above:</li>
$$
\frac{\partial (\beta^\intercal V \beta)}{\partial \beta}=(V+V^\intercal)\beta, \quad
\frac{\partial (\beta^\intercal V \beta)}{\partial \beta^\intercal}=\beta^\intercal (V+V^\intercal)
$$
<li>Suppose $A$ and $B$ are matrices such that $BA$ is square:</li>
$$
\frac{\partial \operatorname{tr}(BA)}{\partial A}=B^\intercal
$$
<li>Suppose $A$ is an invertible square matrix with $|A|>0$:</li>
$$
\frac{\partial \log(|A|)}{\partial A}=(A^{-1})^\intercal
$$
</ol>
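The identities above can be spot-checked numerically. The following is a minimal sketch, assuming NumPy is available; the helper names `numerical_gradient` and `numerical_jacobian`, the random test matrices, and the tolerance are illustrative choices, not part of the original notes. It compares central finite differences against each closed-form result, using the same layout conventions as the list (a gradient has the shape of its argument, and the Jacobian has entry $(i,j)$ equal to $\partial g_i/\partial \beta_j$).

```python
import numpy as np

def numerical_gradient(f, x, eps=1e-6):
    """Central-difference derivative of a scalar f; result has the shape of x."""
    x = np.asarray(x, dtype=float)
    grad = np.zeros_like(x)
    for idx in np.ndindex(x.shape):
        step = np.zeros_like(x)
        step[idx] = eps
        grad[idx] = (f(x + step) - f(x - step)) / (2 * eps)
    return grad

def numerical_jacobian(g, x, eps=1e-6):
    """Central-difference Jacobian with entry (i, j) = d g_i / d x_j."""
    x = np.asarray(x, dtype=float)
    cols = []
    for j in range(x.size):
        step = np.zeros_like(x)
        step[j] = eps
        cols.append((g(x + step) - g(x - step)) / (2 * eps))
    return np.stack(cols, axis=1)

# Arbitrary test data (hypothetical sizes n = 4, k = 3).
rng = np.random.default_rng(0)
n, k = 4, 3
beta = rng.normal(size=k)
c = rng.normal(size=k)
A = rng.normal(size=(n, k))
V = rng.normal(size=(k, k))
B = rng.normal(size=(k, k))
M = rng.normal(size=(k, k))
S = M @ M.T + k * np.eye(k)  # positive definite, so log|S| is defined

checks = {
    "d(c'b)/db = c": np.allclose(
        numerical_gradient(lambda b: c @ b, beta), c, atol=1e-5),
    "d(Ab)/db' = A": np.allclose(
        numerical_jacobian(lambda b: A @ b, beta), A, atol=1e-5),
    "d(b'Vb)/db = (V+V')b": np.allclose(
        numerical_gradient(lambda b: b @ V @ b, beta), (V + V.T) @ beta, atol=1e-5),
    "d tr(BA)/dA = B'": np.allclose(
        numerical_gradient(lambda X: np.trace(B @ X), M), B.T, atol=1e-5),
    "d log|A|/dA = (A^-1)'": np.allclose(
        numerical_gradient(lambda X: np.linalg.slogdet(X)[1], S),
        np.linalg.inv(S).T, atol=1e-5),
}
for name, ok in checks.items():
    print(name, ok)
```

The matrix derivatives treat every entry of the matrix as an independent variable, which is why the finite-difference loop perturbs one entry at a time even for the symmetric test matrix `S`.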
Original article: http://www.cnblogs.com/chihyang/p/4356732.html