An overview of gradient descent optimization algorithms
The gradient points in the direction of the greatest rate of change of a function, i.e., the direction in which the function increases fastest.
Example: walking down from a mountain to the valley floor by always stepping downhill, opposite to the gradient.
\(x_j^{(i+1)} = x_j^{(i)} - \eta \cdot \frac{\partial f}{\partial x_j}(x^{(i)})\), for \(i \ge 0\), where \(j\) indexes the parameter, \(i\) the iteration, and \(\eta\) is the learning rate.
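A minimal sketch of this update rule, assuming a toy objective f(x, y) = x^2 + y^2 with a hand-coded gradient (the function, learning rate, and iteration count are illustrative choices, not from the post):

# Plain gradient descent on f(x, y) = x^2 + y^2; gradient is (2x, 2y).
def grad_f(x):
    return [2 * x[0], 2 * x[1]]

eta = 0.1                # learning rate
x = [4.0, -3.0]          # starting point "on the mountain"
for i in range(100):
    g = grad_f(x)
    # Move against the gradient: x_j <- x_j - eta * df/dx_j
    x = [x[j] - eta * g[j] for j in range(len(x))]

print(x)  # approaches the minimum at (0, 0)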
Common variants include batch gradient descent (BGD), stochastic gradient descent (SGD), and mini-batch gradient descent (MBGD); the three are sketched below in that order.
# Batch gradient descent: one update per epoch, using the gradient over the whole dataset.
for i in range(nb_epochs):
    params_grad = evaluate_gradient(loss_function, data, params)
    params = params - learning_rate * params_grad
# Stochastic gradient descent: one update per training example.
import numpy as np

for i in range(nb_epochs):
    np.random.shuffle(data)
    for example in data:
        params_grad = evaluate_gradient(loss_function, example, params)
        params = params - learning_rate * params_grad
# Mini-batch gradient descent: one update per small batch (here 50 examples).
for i in range(nb_epochs):
    np.random.shuffle(data)
    for batch in get_batches(data, batch_size=50):
        params_grad = evaluate_gradient(loss_function, batch, params)
        params = params - learning_rate * params_grad
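To make the three loops above concrete and runnable, here is a minimal sketch that fills in the placeholders evaluate_gradient and get_batches for least-squares linear regression; the toy dataset and helper implementations are my own assumptions for illustration, not code from the paper or the original post:

import numpy as np

def evaluate_gradient(loss_function, data, params):
    # Gradient of the mean squared error 0.5 * mean((x . params - y)^2) w.r.t. params.
    X = np.array([x for x, y in data])
    y = np.array([y for x, y in data])
    errors = X @ params - y
    return X.T @ errors / len(data)

def get_batches(data, batch_size=50):
    # Yield successive mini-batches from the (already shuffled) data.
    for start in range(0, len(data), batch_size):
        yield data[start:start + batch_size]

# Toy dataset: y = 2*x1 - x2 + small noise (illustrative only).
rng = np.random.default_rng(0)
X_full = rng.normal(size=(500, 2))
y_full = X_full @ np.array([2.0, -1.0]) + 0.01 * rng.normal(size=500)
data = list(zip(X_full, y_full))

params = np.zeros(2)
learning_rate = 0.1
loss_function = None  # unused by this concrete gradient; kept to match the pseudocode signature
nb_epochs = 20

for i in range(nb_epochs):
    np.random.shuffle(data)
    for batch in get_batches(data, batch_size=50):
        params_grad = evaluate_gradient(loss_function, batch, params)
        params = params - learning_rate * params_grad

print(params)  # should end up close to [2, -1]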
Original post: https://www.cnblogs.com/xuwanwei/p/13197002.html