
A review of gradient descent optimization methods


Suppose we want to optimize a parameterized objective function \(J(\theta)\), where \(\theta \in \mathbb{R}^d\); for example, \(\theta\) could be the parameters of a neural network.

More specifically, we want to minimize \(J(\theta; \mathcal{D})\) over a dataset \(\mathcal{D}\), where each point in \(\mathcal{D}\) is a pair \((x_i, y_i)\).

There are different ways to apply gradient descent.

Let \(\eta\) be the learning rate.
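
To make the snippets below concrete, here is one possible toy setup (my own illustration, not part of the original note): a small linear-regression problem, with `J` and `compute_gradient` named to match the pseudocode that follows; the gradient is taken numerically so the helper stays generic.

```python
import numpy as np

# Toy dataset D: a list of (x_i, y_i) pairs drawn from a noisy linear model.
rng = np.random.default_rng(0)
true_theta = np.array([2.0, -1.0, 0.5])
X = rng.normal(size=(200, 3))
y = X @ true_theta + 0.1 * rng.normal(size=200)
D = list(zip(X, y))

def J(theta, batch):
    """Mean squared error of theta over a batch of (x_i, y_i) pairs."""
    X_b = np.array([x for x, _ in batch])
    y_b = np.array([t for _, t in batch])
    return 0.5 * np.mean((X_b @ theta - y_b) ** 2)

def compute_gradient(J, theta, batch, eps=1e-6):
    """Central-difference approximation of the gradient of J over a batch
    (a real implementation would use automatic differentiation instead)."""
    grad = np.zeros_like(theta)
    for j in range(len(theta)):
        e = np.zeros_like(theta)
        e[j] = eps
        grad[j] = (J(theta + e, batch) - J(theta - e, batch)) / (2 * eps)
    return grad
```

With this helper, the per-example update in the SGD snippet below can be read as calling it with a batch of size one, `[(x_i, y_i)]`.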

  1. Vanilla batch update
    \(\theta \gets \theta - \eta \nabla J(\theta; \mathcal{D})\)
    Note that \(\nabla J(\theta; \mathcal{D})\) is the gradient computed over the whole dataset \(\mathcal{D}\).
    ```python
    for i in range(n_epochs):
        gradient = compute_gradient(J, theta, D)  # gradient over the whole dataset
        theta = theta - eta * gradient
        eta = eta * 0.95                          # decay the learning rate each epoch
    ```
Clearly, when \(\mathcal{D}\) is very large this approach becomes infeasible, since every single update requires a pass over the entire dataset.

  2. Stochastic gradient descent
    Stochastic gradient descent, on the other hand, updates the parameters example by example:
    \(\theta \gets \theta - \eta \nabla J(\theta; x_i, y_i)\), where \((x_i, y_i) \in \mathcal{D}\).

    ```python
    for n in range(n_epochs):
        for x_i, y_i in D:
            gradient = compute_gradient(J, theta, x_i, y_i)  # gradient on a single example
            theta = theta - eta * gradient
        eta = eta * 0.95                                     # decay once per epoch
    ```
  3. Mini-batch stochastic gradient descent
    Updating \(\theta\) example by example can lead to high-variance updates; the alternative is to update \(\theta\) on mini-batches \(M\), where \(|M| \ll |\mathcal{D}|\):
    \(\theta \gets \theta - \eta \nabla J(\theta; M)\) (see the runnable sketch after this list).

    ```python
    for n in range(n_epochs):
        for M in D:                                   # assumes D is already split into mini-batches
            gradient = compute_gradient(J, theta, M)  # gradient over one mini-batch
            theta = theta - eta * gradient
        eta = eta * 0.95                              # decay once per epoch
    ```
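
Putting the pieces together, a runnable version of the mini-batch loop might look as follows, reusing the toy `D`, `J`, and `compute_gradient` sketched above (batch size, learning rate, and decay factor are arbitrary choices):

```python
theta = np.zeros(3)          # initial parameters
eta = 0.5                    # initial learning rate
batch_size = 32

for epoch in range(20):
    perm = rng.permutation(len(D))                          # visit examples in a fresh order
    for start in range(0, len(D), batch_size):
        M = [D[i] for i in perm[start:start + batch_size]]  # one mini-batch
        gradient = compute_gradient(J, theta, M)
        theta = theta - eta * gradient
    eta = eta * 0.95                                        # decay once per epoch
    print(f"epoch {epoch:2d}  loss {J(theta, D):.5f}")
```

Setting `batch_size = len(D)` recovers the vanilla batch update, and `batch_size = 1` recovers plain SGD.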

Question: why does decaying the learning rate lead to convergence?
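
One way to build intuition (a small experiment of my own, not from the original note): with noisy gradients and a constant step size, the iterate keeps bouncing around the minimum with fluctuations roughly proportional to \(\eta\), whereas a decaying step size (classically one satisfying the Robbins–Monro conditions \(\sum_t \eta_t = \infty\) and \(\sum_t \eta_t^2 < \infty\)) shrinks those fluctuations and lets the iterate settle. A quick 1-D simulation:

```python
import numpy as np

rng = np.random.default_rng(1)
theta_star = 3.0   # minimizer of E[(theta - x)^2 / 2] with x ~ N(theta_star, 1)

def run(decay, n_steps=2000):
    theta, eta0 = 0.0, 0.5
    for t in range(n_steps):
        x = theta_star + rng.normal()                    # one noisy observation
        gradient = theta - x                             # stochastic gradient of (theta - x)^2 / 2
        eta = eta0 / (1 + 0.01 * t) if decay else eta0   # decaying vs. constant step size
        theta -= eta * gradient
    return abs(theta - theta_star)

for decay in (False, True):
    errors = [run(decay) for _ in range(100)]
    print(f"decay={decay}:  mean final error {np.mean(errors):.3f}")
```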

Original source: https://www.cnblogs.com/gaoqichao/p/9153675.html
