Entropy: \(H(X) = -\sum_{i=1}^{n} P(x_i) \log P(x_i)\)
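A minimal sketch of the entropy formula in plain Python. The function name `entropy` and the two coin distributions are illustrative assumptions, not from the original post; the natural log is used, so results are in nats.

```python
import math

def entropy(probs):
    """Shannon entropy H(X) = -sum_i P(x_i) * log P(x_i), in nats."""
    # Terms with P(x_i) == 0 are skipped, following the convention 0 * log 0 = 0.
    return -sum(p * math.log(p) for p in probs if p > 0)

# Hypothetical example distributions (assumptions for illustration only).
fair_coin = [0.5, 0.5]
biased_coin = [0.9, 0.1]

print(entropy(fair_coin))    # equals log(2): a fair coin is maximally uncertain
print(entropy(biased_coin))  # smaller: the outcome is more predictable
```

A more predictable distribution has lower entropy, and a deterministic outcome has entropy zero.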
Information is additive for independent events: \(-\log(P_1 P_2) = -\log(P_1) - \log(P_2)\)
\(D_{KL}(p \| q) = \sum_{i=1}^{n} p(x_i) \log \frac{p(x_i)}{q(x_i)} = -\sum_{i=1}^{n} p(x_i) \log q(x_i) - H(p) = CE(p, q) - H(p)\)
\(CE(p, q) = -\sum_{i=1}^{n} p(x_i) \log q(x_i) = D_{KL}(p \| q) + H(p)\)
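where p is the true probability distribution and q is the predicted probability distribution.
Minimizing the cross-entropy is equivalent to minimizing the KL divergence: the KL divergence equals the cross-entropy minus the entropy of the true data distribution, and the latter is fixed (it does not depend on q).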
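The decomposition can be checked numerically. The sketch below assumes small discrete distributions with all \(q(x_i) > 0\); the function names and the values of `p`, `q1`, `q2` are made up for illustration.

```python
import math

def entropy(p):
    """H(p) = -sum_i p_i * log p_i (terms with p_i == 0 are skipped)."""
    return -sum(pi * math.log(pi) for pi in p if pi > 0)

def cross_entropy(p, q):
    """CE(p, q) = -sum_i p_i * log q_i; assumes q_i > 0 wherever p_i > 0."""
    return -sum(pi * math.log(qi) for pi, qi in zip(p, q) if pi > 0)

def kl_divergence(p, q):
    """D_KL(p || q) = sum_i p_i * log(p_i / q_i); same assumption on q."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

p = [0.7, 0.2, 0.1]    # hypothetical true distribution
q1 = [0.6, 0.3, 0.1]   # hypothetical prediction close to p
q2 = [0.2, 0.3, 0.5]   # hypothetical prediction far from p

for q in (q1, q2):
    ce, kl = cross_entropy(p, q), kl_divergence(p, q)
    # The gap ce - kl is H(p) for every q, so ranking predictions
    # by cross-entropy or by KL divergence gives the same order.
    print(ce, kl, ce - kl, entropy(p))
```

Because the gap between the two quantities is the constant \(H(p)\), whichever q has the lower cross-entropy also has the lower KL divergence.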
Minimizing the cross-entropy is equivalent to maximizing the likelihood function.
Let \(A_i = q(x_i)^{y_i}; \ \ B_i = (1-q(x_i))^{(1-y_i)}\)
The probability that sample \(x_i\) has label 1 is \(q(x_i)\), and \(y_i \in \{0, 1\}\) is its true label.
Cross-entropy (binary labels, with \(\hat{y}_i = q(x_i)\)): \(Loss(y, \hat{y}) = -\sum_{i=1}^{n} [y_i \log \hat{y}_i + (1-y_i) \log(1-\hat{y}_i)] = -\sum_{i=1}^{n} [y_i \log q(x_i) + (1-y_i) \log(1-q(x_i))] = -\sum_{i=1}^{n} [\log(q(x_i)^{y_i}) + \log((1-q(x_i))^{(1-y_i)})] = -\sum_{i=1}^{n} \log(A_i B_i) = -\log [\prod_{i=1}^{n} A_i B_i]\)
Likelihood function: \(\prod_{i=1}^{n} A_i B_i\)
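To make the equivalence concrete, the following sketch computes the binary cross-entropy and the Bernoulli likelihood for the same labels and predictions, then compares the loss with the negative log-likelihood. The function names and the sample values are hypothetical; predictions are assumed to lie strictly between 0 and 1.

```python
import math

def binary_cross_entropy(y, q):
    """-sum_i [y_i * log q_i + (1 - y_i) * log(1 - q_i)]; assumes 0 < q_i < 1."""
    return -sum(yi * math.log(qi) + (1 - yi) * math.log(1 - qi)
                for yi, qi in zip(y, q))

def likelihood(y, q):
    """prod_i A_i * B_i with A_i = q_i ** y_i and B_i = (1 - q_i) ** (1 - y_i)."""
    return math.prod(qi ** yi * (1 - qi) ** (1 - yi) for yi, qi in zip(y, q))

# Hypothetical labels and predicted probabilities of label 1.
y = [1, 0, 1, 1]
q_good = [0.9, 0.2, 0.7, 0.6]
q_bad = [0.4, 0.7, 0.3, 0.5]

for q in (q_good, q_bad):
    loss = binary_cross_entropy(y, q)
    lik = likelihood(y, q)
    # loss and -log(likelihood) agree up to floating-point rounding
    print(loss, -math.log(lik), lik)
```

Since \(-\log\) is strictly decreasing, the prediction with the larger likelihood is the one with the smaller cross-entropy loss.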
Original source: https://www.cnblogs.com/albertsr/p/9863835.html