CCJ PRML Study Note - Chapter 1 Summary: MLE (Maximum-Likelihood Estimation) and Bayesian Approach

Date: 2016-06-21 06:27:19


Christopher M. Bishop, PRML, Chapter 1: Introduction

1. Notations and Logical Relation

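  • Training data: $N$ input values $\mathbf{x} = (x_1, \dots, x_N)^T$ and their corresponding target values $\mathbf{t} = (t_1, \dots, t_N)^T$. For simplicity, written as $\{\mathbf{x}, \mathbf{t}\}$.
  • Goal of making predictions: to be able to predict the target variable $t$ given some new value of the input variable $x$.
  • Assumption on the predictive distribution over $t$: we assume that, given the value of $x$, the corresponding value of $t$ has a Gaussian distribution with mean equal to the value $y(x, \mathbf{w}) = \sum_{j=0}^{M} w_j x^j$ of the polynomial curve given by (1.1) and with precision $\beta$. Thus we have
      $p(t \mid x, \mathbf{w}, \beta) = \mathcal{N}(t \mid y(x, \mathbf{w}), \beta^{-1})$   (1.60)
  • Likelihood function of the i.i.d. training data $\{\mathbf{x}, \mathbf{t}\}$ and its logarithm:
      $p(\mathbf{t} \mid \mathbf{x}, \mathbf{w}, \beta) = \prod_{n=1}^{N} \mathcal{N}(t_n \mid y(x_n, \mathbf{w}), \beta^{-1})$   (1.61)
      $\ln p(\mathbf{t} \mid \mathbf{x}, \mathbf{w}, \beta) = -\frac{\beta}{2} \sum_{n=1}^{N} \{y(x_n, \mathbf{w}) - t_n\}^2 + \frac{N}{2} \ln \beta - \frac{N}{2} \ln(2\pi)$   (1.62)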
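  • MLE of parameters $\mathbf{w}$ and $\beta$:
    • $\mathbf{w}_{ML}$ for linear regression: maximizing (1.62) with respect to $\mathbf{w}$ is equivalent to minimizing the sum-of-squares error
      $E(\mathbf{w}) = \frac{1}{2} \sum_{n=1}^{N} \{y(x_n, \mathbf{w}) - t_n\}^2$
    • $\beta_{ML}$: $\frac{1}{\beta_{ML}} = \frac{1}{N} \sum_{n=1}^{N} \{y(x_n, \mathbf{w}_{ML}) - t_n\}^2$   (1.63)
  • ML plug-in prediction for new values of $x$: substituting the maximum likelihood parameters into (1.60) gives

$p(t \mid x, \mathbf{w}_{ML}, \beta_{ML}) = \mathcal{N}(t \mid y(x, \mathbf{w}_{ML}), \beta_{ML}^{-1})$   (1.64)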
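
As a rough illustration of the maximum likelihood steps above, here is a minimal NumPy sketch. It is not code from the original note: the helper names (`design_matrix`, `fit_ml`, `predict_ml`), the polynomial order, and the noisy sine toy data are all assumptions made for this example. The sketch solves the least-squares problem for $\mathbf{w}_{ML}$, sets $1/\beta_{ML}$ to the mean squared residual, and returns the mean and variance of the plug-in Gaussian predictive.

```python
import numpy as np

def design_matrix(x, M):
    """Rows are (x_n^0, x_n^1, ..., x_n^M) for each input x_n (hypothetical helper)."""
    return np.vander(np.asarray(x, dtype=float), M + 1, increasing=True)

def fit_ml(x, t, M):
    """Return (w_ML, beta_ML) for an order-M polynomial."""
    Phi = design_matrix(x, M)
    # Maximizing the log likelihood over w is ordinary least squares.
    w_ml, *_ = np.linalg.lstsq(Phi, t, rcond=None)
    # 1 / beta_ML is the mean squared residual at w_ML.
    residuals = Phi @ w_ml - t
    beta_ml = 1.0 / np.mean(residuals ** 2)
    return w_ml, beta_ml

def predict_ml(x_new, w_ml, beta_ml):
    """Mean and variance of the plug-in predictive N(t | y(x, w_ML), 1/beta_ML)."""
    M = len(w_ml) - 1
    mean = design_matrix(x_new, M) @ w_ml
    var = np.full_like(mean, 1.0 / beta_ml)  # same variance for every x
    return mean, var

# Assumed toy data: noisy samples of sin(2*pi*x).
rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 10)
t = np.sin(2 * np.pi * x) + rng.normal(scale=0.3, size=x.shape)

w_ml, beta_ml = fit_ml(x, t, M=3)
mean, var = predict_ml(np.array([0.25, 0.5]), w_ml, beta_ml)
```

Note that the plug-in predictive variance does not depend on $x$: it reflects only the estimated noise level, not any uncertainty in $\mathbf{w}$.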

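  • Prior distribution over $\mathbf{w}$: for simplicity, consider a Gaussian distribution of the form
      $p(\mathbf{w} \mid \alpha) = \mathcal{N}(\mathbf{w} \mid \mathbf{0}, \alpha^{-1}\mathbf{I}) = \left(\frac{\alpha}{2\pi}\right)^{(M+1)/2} \exp\left\{-\frac{\alpha}{2} \mathbf{w}^T \mathbf{w}\right\}$   (1.65)
    where
    • the hyperparameter $\alpha$ is the precision of the distribution,
    • $M + 1$ is the total number of elements in the vector $\mathbf{w}$ for an $M$-th order polynomial.
  • Posterior distribution for $\mathbf{w}$: using Bayes' theorem,
      $p(\mathbf{w} \mid \mathbf{x}, \mathbf{t}, \alpha, \beta) \propto p(\mathbf{t} \mid \mathbf{x}, \mathbf{w}, \beta)\, p(\mathbf{w} \mid \alpha)$   (1.66)
  • MAP (maximum a posteriori): a step towards a more Bayesian approach, although MAP is still a point estimate. Taking the negative logarithm of (1.66), we find that the maximum of the posterior is given by the minimum of
      $\frac{\beta}{2} \sum_{n=1}^{N} \{y(x_n, \mathbf{w}) - t_n\}^2 + \frac{\alpha}{2} \mathbf{w}^T \mathbf{w}$   (1.67)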

Although we have included a prior distribution $p(\mathbf{w} \mid \alpha)$, we are so far still making a point estimate of $\mathbf{w}$, and so this does not yet amount to a Bayesian treatment. In a fully Bayesian approach, we should consistently apply the sum and product rules of probability, which requires, as we shall see shortly, that we integrate over all values of $\mathbf{w}$. Such marginalizations lie at the heart of Bayesian methods for pattern recognition.
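
To make the MAP point estimate concrete, the sketch below minimizes the regularized objective (1.67) in closed form. It relies on the standard observation that (1.67) is a regularized sum-of-squares error with coefficient $\lambda = \alpha / \beta$, so the minimizer solves $(\lambda \mathbf{I} + \Phi^T \Phi)\,\mathbf{w} = \Phi^T \mathbf{t}$, where $\Phi$ is the matrix whose rows are $(x_n^0, \dots, x_n^M)$. The function name `fit_map`, the toy data, and the values of $\alpha$ and $\beta$ are illustrative assumptions, not part of the original note.

```python
import numpy as np

def fit_map(x, t, M, alpha, beta):
    """MAP weights for an order-M polynomial under the Gaussian prior N(w | 0, I/alpha)."""
    Phi = np.vander(np.asarray(x, dtype=float), M + 1, increasing=True)
    lam = alpha / beta  # effective regularization coefficient
    A = lam * np.eye(M + 1) + Phi.T @ Phi
    # Solve the regularized normal equations rather than forming an inverse.
    return np.linalg.solve(A, Phi.T @ t)

# Assumed toy data and hyperparameters, chosen only for illustration.
rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 10)
t = np.sin(2 * np.pi * x) + rng.normal(scale=0.3, size=x.shape)
alpha, beta = 5e-3, 11.1

w_map = fit_map(x, t, M=9, alpha=alpha, beta=beta)
```

The result is still a single weight vector: the prior shrinks the weights towards zero, but no uncertainty about $\mathbf{w}$ is carried into the prediction.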

  • Fully Bayesian approach:
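    • Here we shall assume that the parameters $\alpha$ and $\beta$ are fixed and known in advance (in later chapters we shall discuss how such parameters can be inferred from data in a Bayesian setting).
    • A Bayesian treatment simply corresponds to a consistent application of the sum and product rules of probability, which allow the predictive distribution to be written in the form
      $p(t \mid x, \mathbf{x}, \mathbf{t}) = \int p(t \mid x, \mathbf{w})\, p(\mathbf{w} \mid \mathbf{x}, \mathbf{t})\, d\mathbf{w}$   (1.68)
  • Result of the integration in (1.68):
    • (1.66): this posterior distribution is a Gaussian and can be evaluated analytically.
    • (1.68) can also be performed analytically, with the result that the predictive distribution is given by a Gaussian of the form
      $p(t \mid x, \mathbf{x}, \mathbf{t}) = \mathcal{N}(t \mid m(x), s^2(x))$   (1.69)
      where the mean and variance are given by
      $m(x) = \beta\, \boldsymbol{\phi}(x)^T \mathbf{S} \sum_{n=1}^{N} \boldsymbol{\phi}(x_n)\, t_n$   (1.70)
      $s^2(x) = \beta^{-1} + \boldsymbol{\phi}(x)^T \mathbf{S}\, \boldsymbol{\phi}(x)$   (1.71)
      Here the matrix $\mathbf{S}$ is given by
      $\mathbf{S}^{-1} = \alpha \mathbf{I} + \beta \sum_{n=1}^{N} \boldsymbol{\phi}(x_n)\, \boldsymbol{\phi}(x_n)^T$   (1.72)
      where $\mathbf{I}$ is the unit matrix, and we have defined the vector $\boldsymbol{\phi}(x)$ with elements $\phi_i(x) = x^i$ for $i = 0, \dots, M$.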
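
The closed-form result of the integration can be sketched in a few lines of NumPy. This is only an illustrative implementation under stated assumptions: the helper names `phi` and `bayes_predictive`, the toy data, and the hyperparameter values are made up for the example. Stacking the vectors $\boldsymbol{\phi}(x_n)^T$ as rows of a matrix `Phi`, the sums over $n$ in the mean and in $\mathbf{S}^{-1}$ become `Phi.T @ t` and `Phi.T @ Phi`.

```python
import numpy as np

def phi(x, M):
    """Rows are phi(x)^T = (x^0, ..., x^M) for each input (hypothetical helper)."""
    x = np.atleast_1d(np.asarray(x, dtype=float))
    return np.vander(x, M + 1, increasing=True)

def bayes_predictive(x_new, x, t, M, alpha, beta):
    """Mean m(x) and variance s^2(x) of the Gaussian predictive distribution."""
    Phi = phi(x, M)                                      # shape (N, M+1)
    S_inv = alpha * np.eye(M + 1) + beta * Phi.T @ Phi   # S^{-1}
    S = np.linalg.inv(S_inv)
    Phi_new = phi(x_new, M)                              # shape (K, M+1)
    mean = beta * Phi_new @ S @ (Phi.T @ t)              # m(x)
    # Row-wise quadratic form phi(x)^T S phi(x), plus the noise variance 1/beta.
    var = 1.0 / beta + np.sum((Phi_new @ S) * Phi_new, axis=1)
    return mean, var

# Assumed toy data and hyperparameters, chosen only for illustration.
rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 10)
t = np.sin(2 * np.pi * x) + rng.normal(scale=0.3, size=x.shape)
alpha, beta = 5e-3, 11.1

x_new = np.array([0.1, 0.5, 0.9])
mean, var = bayes_predictive(x_new, x, t, M=9, alpha=alpha, beta=beta)
```

Unlike the plug-in prediction, the variance here depends on $x$: the term $\boldsymbol{\phi}(x)^T \mathbf{S}\, \boldsymbol{\phi}(x)$ adds the uncertainty in $\mathbf{w}$ on top of the noise variance $\beta^{-1}$. The predictive mean coincides with the polynomial evaluated at the MAP weights, since both reduce to $\boldsymbol{\phi}(x)^T (\lambda \mathbf{I} + \Phi^T \Phi)^{-1} \Phi^T \mathbf{t}$ with $\lambda = \alpha / \beta$.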

2. Flowchart

The relationships among the equations and notions above are summarized in a flowchart:

[Figure: flowchart relating the equations above; image not preserved]

 

Original source: http://www.cnblogs.com/glory-of-family/p/5602315.html
