Machine Learning Notes (Washington University) - Classification Specialization - Week 6 & Week 7

1. Precision and recall

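Precision is how precise I am at showing good stuff on my website: of the reviews I predict as positive, the fraction that really are positive.

Recall is how good I am at finding all the positive reviews: of the reviews that really are positive, the fraction I predict as positive.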


                   Predicted y = +1    Predicted y = -1
True label = +1    true positive       false negative
True label = -1    false positive      true negative






precision = number of true positives / (number of true positives + number of false positives)

recall      = number of true positives / (number of true positives + number of false negatives)
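
The two formulas above can be sketched in a few lines of Python. This is only a minimal illustration: the function name `precision_recall` and the toy labels are made up for the example, and labels are assumed to be +1 / -1 as in the table.

```python
def precision_recall(y_true, y_pred):
    """Return (precision, recall) for labels in {+1, -1}."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)    # true positives
    fp = sum(1 for t, p in pairs if t == -1 and p == 1)   # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == -1)   # false negatives
    # Convention assumed here: return 0.0 when a denominator is zero.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Toy data (hypothetical): 4 truly positive reviews, 2 truly negative.
y_true = [1, 1, 1, 1, -1, -1]
y_pred = [1, 1, -1, -1, 1, -1]
precision, recall = precision_recall(y_true, y_pred)
print(precision, recall)
```

Here 2 of the 3 predicted positives are correct (precision 2/3), while only 2 of the 4 real positives are found (recall 1/2).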

Pessimistic model (predicts positive only when very sure): high precision, low recall

Optimistic model (predicts positive almost all the time): low precision, high recall
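
One way to see the two extremes is to vary the probability threshold above which a review is predicted positive. The sketch below assumes the classifier outputs a probability for y = +1; the threshold values, the helper name `precision_recall_at`, and the toy numbers are all made up for illustration.

```python
def precision_recall_at(probs, y_true, threshold):
    """Predict +1 when P(y=+1) >= threshold, then return (precision, recall)."""
    y_pred = [1 if p >= threshold else -1 for p in probs]
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == -1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == -1)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Hypothetical predicted probabilities and true labels.
probs = [0.95, 0.80, 0.60, 0.40, 0.30, 0.10]
y_true = [1, 1, -1, 1, -1, -1]

pessimistic = precision_recall_at(probs, y_true, threshold=0.9)  # rarely says +1
optimistic = precision_recall_at(probs, y_true, threshold=0.2)   # often says +1
print(pessimistic, optimistic)
```

With the high threshold, the single predicted positive is correct but two real positives are missed; with the low threshold, every real positive is found at the cost of two false positives.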


2. Stochastic gradient ascent

Gradient ascent is slow because every update requires a full pass over data.

Stochastic gradient ascent uses only a small subset of the data (as little as one data point) for each update.

Stochastic gradient ascent usually converges faster than gradient ascent; however, it is very sensitive to parameters such as the step size.

The gradient is the direction of steepest ascent, but any direction that goes up is useful for ascent.

Stochastic gradient ascent works because, for most data points, the gradient computed from that single point already points in an upward direction.
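
A small check of this idea, assuming the log-likelihood of logistic regression as the objective (the notes do not fix the model here, so this is an assumption): a single-point gradient "goes up" to first order when its dot product with the full gradient is positive. The function names and the toy data are hypothetical.

```python
import math


def point_gradient(w, x, label):
    """Gradient of the logistic log-likelihood for one data point."""
    score = sum(wj * xj for wj, xj in zip(w, x))
    prob = 1.0 / (1.0 + math.exp(-score))        # P(y = +1 | x, w)
    indicator = 1.0 if label == 1 else 0.0
    return [(indicator - prob) * xj for xj in x]


def full_gradient(w, X, y):
    """Full gradient: the sum of all single-point gradients."""
    total = [0.0] * len(w)
    for x, label in zip(X, y):
        g = point_gradient(w, x, label)
        total = [a + b for a, b in zip(total, g)]
    return total


def fraction_pointing_uphill(w, X, y):
    """Share of points whose own gradient has a positive dot product with the full gradient."""
    full = full_gradient(w, X, y)
    uphill = 0
    for x, label in zip(X, y):
        g = point_gradient(w, x, label)
        if sum(a * b for a, b in zip(g, full)) > 0:
            uphill += 1
    return uphill / len(X)


# Toy data: first feature is a constant 1 (intercept); the last point is an outlier.
X = [[1.0, 2.0], [1.0, 1.0], [1.0, -1.0], [1.0, -2.0], [1.0, 0.5]]
y = [1, 1, -1, -1, -1]
fraction = fraction_pointing_uphill([0.0, 0.0], X, y)
print(fraction)
```

On this toy set, four of the five single-point gradients agree with the full gradient; only the outlier would push the coefficients the wrong way.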

In the end, stochastic gradient ascent oscillates a bit (it is noisy) around the optimum.


1. Systematic ordering in the data can introduce significant bias

  • shuffle the data before running stochastic ascent

2. If the step size is too small, convergence takes a long time; if it is too large, the updates oscillate a lot and behave erratically

  • a step size that decreases with the iterations is very important (e.g. the initial step size divided by the iteration number)

3. Stochastic gradient ascent never fully converges, so do not trust the last coefficients

  •  output the average weights vector, 1/T (W1 + ... + WT)
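
The three points above can be combined in one rough sketch. It assumes logistic regression with labels +1 / -1, one data point per update, and a step size of `step0 / t`; the function name, the default values, and the toy data are all hypothetical, not taken from the course.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def stochastic_gradient_ascent(X, y, step0=0.5, passes=20, seed=0):
    """Return (last weights, averaged weights) for logistic regression."""
    rng = random.Random(seed)
    data = list(zip(X, y))
    w = [0.0] * len(X[0])
    w_sum = [0.0] * len(w)
    t = 0
    for _ in range(passes):
        rng.shuffle(data)                 # point 1: shuffle before each pass
        for x, label in data:
            t += 1
            step = step0 / t              # point 2: step size divided by iteration
            score = sum(wj * xj for wj, xj in zip(w, x))
            indicator = 1.0 if label == 1 else 0.0
            error = indicator - sigmoid(score)
            w = [wj + step * error * xj for wj, xj in zip(w, x)]
            w_sum = [s + wj for s, wj in zip(w_sum, w)]
    w_avg = [s / t for s in w_sum]        # point 3: 1/T (W1 + ... + WT)
    return w, w_avg


# Toy data: intercept feature plus one feature whose sign decides the label.
values = [-2.0, -1.5, -1.0, -0.5, 0.5, 1.0, 1.5, 2.0]
X = [[1.0, v] for v in values]
y = [1 if v > 0 else -1 for v in values]
w_last, w_avg = stochastic_gradient_ascent(X, y)
print(w_last, w_avg)
```

A `step0 / t` schedule shrinks quickly; other decreasing schedules are possible, and which one works best depends on the data.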

