Let s_j = f(x_i; W)_j denote the score assigned to the j-th class. The Multiclass SVM loss for the i-th example is then formalized as follows:
L_i = \sum_{j \neq y_i} \max(0, s_j - s_{y_i} + \Delta)
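As a concrete numerical check (a minimal sketch; the helper name svm_loss_single, the score values, and Delta = 1.0 below are made up for this illustration), the per-example loss can be evaluated directly from a score vector:

import numpy as np

def svm_loss_single(scores, y, delta=1.0):
    # sum of max(0, s_j - s_{y} + delta) over all classes j != y
    margins = np.maximum(0.0, scores - scores[y] + delta)
    margins[y] = 0.0          # the correct class never contributes to the loss
    return np.sum(margins)

scores = np.array([3.2, 5.1, -1.7])     # three class scores, correct class y = 0
print(svm_loss_single(scores, y=0))     # max(0, 5.1-3.2+1) + max(0, -1.7-3.2+1) = 2.9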
Regularization:
The most common regularization penalty is the L2 norm that discourages large weights through an elementwise quadratic penalty over all parameters:
R(W) = \sum_k \sum_l W_{k,l}^2
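In code this penalty is just an elementwise sum of squares. A minimal sketch (the helper name l2_regularization is made up here, and the 0.5 factor is a common convention that makes the gradient exactly reg * W, matching the implementation further below):

import numpy as np

def l2_regularization(W, reg):
    # R(W) summed over every entry of W, scaled by the regularization strength reg
    penalty = 0.5 * reg * np.sum(W * W)
    grad = reg * W            # gradient of the 0.5 * reg * sum(W^2) term
    return penalty, grad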
That is, the full Multiclass SVM loss becomes:
L = \frac{1}{N} \sum_i \sum_{j \neq y_i} \left[ \max(0, f(x_i; W)_j - f(x_i; W)_{y_i} + \Delta) \right] + \lambda \sum_k \sum_l W_{k,l}^2
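This full loss can be computed directly with explicit loops over examples and classes. The vectorized code below refers to svm_loss_naive in its docstring, but its body is not given in the original post, so the following loop-based version is only a sketch of what such a function might look like, assuming the same (W, X, y, reg) interface, Delta = 1, and the 0.5 * reg convention:

import numpy as np

def svm_loss_naive(W, X, y, reg):
    # X is N by D, W is D by C, y holds the index of the correct class for each example
    num_train, num_classes = X.shape[0], W.shape[1]
    loss = 0.0
    dW = np.zeros(W.shape)
    for i in range(num_train):
        scores = X[i].dot(W)
        correct_score = scores[y[i]]
        for j in range(num_classes):
            if j == y[i]:
                continue
            margin = scores[j] - correct_score + 1.0     # Delta = 1
            if margin > 0:
                loss += margin
                dW[:, j] += X[i]                         # violating class is pushed up
                dW[:, y[i]] -= X[i]                      # correct class is pulled down
    loss = loss / num_train + 0.5 * reg * np.sum(W * W)
    dW = dW / num_train + reg * W
    return loss, dW

The vectorized implementation that follows computes the same loss and gradient without the Python loops: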
import numpy as np

def svm_loss_vectorized(W, X, y, reg):
    """
    Structured SVM loss function, vectorized implementation.
    Inputs and outputs are the same as svm_loss_naive.
    """
    loss = 0.0
    dW = np.zeros(W.shape)                       # initialize the gradient as zero
    scores = X.dot(W)                            # N by C
    num_train = X.shape[0]
    num_classes = W.shape[1]
    scores_correct = scores[np.arange(num_train), y]             # shape (N,)
    scores_correct = np.reshape(scores_correct, (num_train, 1))  # N by 1
    margins = scores - scores_correct + 1.0      # N by C, Delta = 1
    margins[np.arange(num_train), y] = 0.0       # correct class contributes no loss
    margins[margins <= 0] = 0.0                  # hinge: keep only positive margins
    loss += np.sum(margins) / num_train
    loss += 0.5 * reg * np.sum(W * W)
    # compute the gradient
    margins[margins > 0] = 1.0                   # indicator of margin violations
    row_sum = np.sum(margins, axis=1)            # shape (N,): violations per example
    margins[np.arange(num_train), y] = -row_sum  # correct class pulled down once per violation
    dW += np.dot(X.T, margins) / num_train + reg * W   # D by C
    return loss, dW
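As a quick sanity check (a sketch with made-up shapes and a single-entry numeric gradient; a real check would sample many entries), the analytic gradient can be compared against a finite-difference estimate:

import numpy as np

np.random.seed(0)
W = np.random.randn(5, 3) * 0.01       # D = 5 features, C = 3 classes (arbitrary sizes)
X = np.random.randn(10, 5)             # N = 10 examples
y = np.random.randint(3, size=10)      # correct class index per example

loss, dW = svm_loss_vectorized(W, X, y, reg=0.1)

h = 1e-5
Wp = W.copy(); Wp[0, 0] += h
Wm = W.copy(); Wm[0, 0] -= h
num_grad = (svm_loss_vectorized(Wp, X, y, 0.1)[0] - svm_loss_vectorized(Wm, X, y, 0.1)[0]) / (2 * h)
print(dW[0, 0], num_grad)              # the two values should agree to several decimal places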
Original post: http://www.cnblogs.com/zhangli-ncu/p/7608659.html