Tags: batch die back model clip ati bat log class
1. initialize the weights w
2. clip gradients
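    import torch.nn.functional as F
    from torch.nn import utils

    optimizer.zero_grad()
    logit = model(feature)
    loss = F.cross_entropy(logit, target)
    loss.backward()
    # clip gradients: rescale them in place so their total norm is at most 1e-4
    utils.clip_grad_norm_(model.parameters(), 1e-4)
    optimizer.step()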
3. L2 regularization
4. batch normalization
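
The four items above can be combined in one training step. The following is only a minimal sketch under stated assumptions: the `SmallNet` model, its layer sizes, the Xavier initializer, the SGD settings, and the random data are all hypothetical and chosen for illustration, not taken from the original post. Only the clipping threshold `1e-4` comes from the snippet in item 2.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)


class SmallNet(nn.Module):
    """Hypothetical two-layer classifier, used only for illustration."""

    def __init__(self, in_dim=20, hidden=32, n_classes=3):
        super().__init__()
        self.fc1 = nn.Linear(in_dim, hidden)
        self.bn1 = nn.BatchNorm1d(hidden)  # item 4: batch normalization
        self.fc2 = nn.Linear(hidden, n_classes)
        # item 1: explicit weight initialization (Xavier is an assumed choice)
        for layer in (self.fc1, self.fc2):
            nn.init.xavier_uniform_(layer.weight)
            nn.init.zeros_(layer.bias)

    def forward(self, x):
        return self.fc2(F.relu(self.bn1(self.fc1(x))))


model = SmallNet()
# item 3: with SGD, weight_decay acts as an L2 penalty on the parameters
optimizer = torch.optim.SGD(model.parameters(), lr=0.1, weight_decay=1e-4)

# Random stand-in data: 16 samples, 20 features, 3 classes
feature = torch.randn(16, 20)
target = torch.randint(0, 3, (16,))

optimizer.zero_grad()
logit = model(feature)
loss = F.cross_entropy(logit, target)
loss.backward()
# item 2: clip the total gradient norm; the call returns the norm before clipping
total_norm = torch.nn.utils.clip_grad_norm_(model.parameters(), 1e-4)
optimizer.step()
```

Clipping is placed between `loss.backward()` and `optimizer.step()` so that the optimizer sees the rescaled gradients. The returned `total_norm` can be logged to check whether clipping is actually taking effect.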
Original article: http://www.cnblogs.com/Joyce-song94/p/7347775.html