Suppose we are going to optimize a parameterized function \(J(\theta)\), where \(\theta \in \mathbb{R}^d\); for example, \(\theta\) could be the weights of a neural network.
More specifically, we want to minimize \(J(\theta; \mathcal{D})\) on a dataset \(\mathcal{D}\), where each point in \(\mathcal{D}\) is a pair \((x_i, y_i)\).
There are different ways to apply gradient descent. Let \(\eta\) be the learning rate.
Batch Gradient Descent
Batch gradient descent computes the gradient over the entire dataset for every single update: \(\theta \gets \theta - \eta \nabla_\theta J(\theta; \mathcal{D})\).
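A minimal sketch of this update, assuming a helper compute_gradient that, as in the snippets below, returns the gradient of \(J\) at the current theta (here computed on the whole dataset \(\mathcal{D}\)):
```
# Batch gradient descent: a single parameter update per epoch, computed on all of D.
for n in range(n_epochs):
    gradient = compute_gradient(J, theta, D)   # gradient of J over the entire dataset
    theta = theta - eta * gradient
    eta = eta * 0.95                           # decay the learning rate each epoch
```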
It is obvious that when \(\mathcal{D}\) is too large, this approach is infeasible: every single update requires a full pass over the data.
Stochastic Gradient Descent
Stochastic gradient descent (SGD), on the other hand, updates the parameters example by example:
\(\theta \gets \theta - \eta \nabla_\theta J(\theta; x_i, y_i)\), where \((x_i, y_i) \in \mathcal{D}\).
```
for n in range(n_epochs):
    for x_i, y_i in D:                                   # one update per training example
        gradient = compute_gradient(J, theta, x_i, y_i)  # gradient of J at a single example
        theta = theta - eta * gradient
    eta = eta * 0.95                                     # decay the learning rate once per epoch
```
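The snippets in this post treat compute_gradient as a given helper. As a purely illustrative assumption (not something the post defines), here is one way it could look for a linear model with squared loss, written so that it accepts either a single example, as in the SGD loop, or a whole mini-batch of pairs, as in the mini-batch loop of the next section:
```
import numpy as np

def compute_gradient(J, theta, *example):
    """Illustrative gradient of a squared loss for a linear model,
    J(theta; x, y) = 0.5 * (theta . x - y)**2.
    Accepts either a single example as two arguments (x_i, y_i) or one
    mini-batch / dataset of (x_i, y_i) pairs as a single argument.
    The J argument is kept only to match the call signatures in this post."""
    pairs = [example] if len(example) == 2 else list(example[0])
    grad = np.zeros_like(theta, dtype=float)
    for x, y in pairs:
        residual = np.dot(theta, x) - y                 # prediction error on one example
        grad += residual * np.asarray(x, dtype=float)   # d/dtheta of 0.5 * residual**2
    return grad / len(pairs)                            # average gradient over the batch
```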
Mini-batch Stochastic Gradient Descent
Updating \(\theta\) example by example can lead to high-variance updates; an alternative is to update \(\theta\) using mini-batches \(M\), where \(|M| \ll |\mathcal{D}|\): \(\theta \gets \theta - \eta \nabla_\theta J(\theta; M)\).
```
for n in range(n_epochs):
    for M in iterate_minibatches(D, batch_size):   # M: a small batch of (x_i, y_i) pairs; helper sketched below
        gradient = compute_gradient(J, theta, M)   # gradient of J averaged over the mini-batch
        theta = theta - eta * gradient
    eta = eta * 0.95                               # decay the learning rate once per epoch
```
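The loop above assumes a helper iterate_minibatches (a name introduced here for illustration, not defined in the post) that splits \(\mathcal{D}\) into chunks of batch_size examples. A minimal sketch:
```
import random

def iterate_minibatches(D, batch_size):
    """Yield successive mini-batches of (x_i, y_i) pairs from D.
    Shuffling once per epoch is common practice, but optional."""
    data = list(D)
    random.shuffle(data)
    for start in range(0, len(data), batch_size):
        yield data[start:start + batch_size]
```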
Question: why does decaying the learning rate lead to convergence?
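One classical answer from stochastic approximation theory: each stochastic gradient is only a noisy estimate of the true gradient, so the step size should shrink enough for the noise to average out, but not so fast that the iterates stop making progress. The Robbins-Monro conditions state this trade-off:
\[
\sum_{t=1}^{\infty} \eta_t = \infty, \qquad \sum_{t=1}^{\infty} \eta_t^2 < \infty .
\]
For example, \(\eta_t = \eta_0 / t\) satisfies both conditions.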