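In machine learning, one cause of overfitting is noise, and this noise comes in two kinds. Stochastic noise comes from the data generation process, for example measurement error. Deterministic noise comes from added complexity: the part of the target that is too complex for the model to capture. The two kinds of noise have different sources, but their effect on learning is similar: the more noise there is, the more likely overfitting becomes.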
This is a very subtle question!
The most important thing to realize is that in learning,
i) If there is stochastic noise with ‘magnitude’
ii) If there is deterministic noise then you are in trouble.
The stochastic noise can be viewed as one part of the data generation process (e.g. measurement errors). The deterministic noise can similarly be viewed as another part of the data generation process, namely f. Both the deterministic and the stochastic noise are fixed. In your analogy, you can increase the stochastic noise by increasing the noise variance, and you get into deeper trouble. Similarly, you can increase the deterministic noise by making f more complex, and you get into deeper trouble.
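To make "making f more complex" concrete, here is a minimal sketch that measures deterministic noise as the mean squared gap between a noiseless target and its best fit within a fixed hypothesis set. The two sine targets, the degree-2 polynomial hypothesis set, the uniform grid on [-1, 1], and the helper name `deterministic_noise_energy` are all assumptions chosen for illustration; none of them come from the quoted text.

```python
import numpy as np
from numpy.polynomial import legendre


def deterministic_noise_energy(target, degree, n_grid=2001):
    """Mean squared gap between the target and its best degree-`degree` fit on [-1, 1]."""
    x = np.linspace(-1.0, 1.0, n_grid)
    y = target(x)  # noiseless target values: no stochastic noise is added
    coef = legendre.legfit(x, y, degree)  # least-squares fit = best hypothesis on this grid
    best_h = legendre.legval(x, coef)
    return float(np.mean((y - best_h) ** 2))


def simple_target(x):
    return np.sin(np.pi * x)


def complex_target(x):
    return np.sin(4 * np.pi * x)  # same family, more oscillations


# Same hypothesis set (degree-2 polynomials), two targets of different complexity.
noise_simple = deterministic_noise_energy(simple_target, degree=2)
noise_complex = deterministic_noise_energy(complex_target, degree=2)
print("simple target :", noise_simple)
print("complex target:", noise_complex)
```

With the hypothesis set held fixed, the more oscillatory target leaves a larger part of f that no hypothesis can capture, which is the sense in which a more complex f means more deterministic noise.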
I just need to tell you what ‘trouble’ means. Well, we actually use another word instead of ‘trouble’ - overfitting.
This means you are likely to make an inferior choice over the superior choice because the inferior choice has lower in-sample error. Doing things that look good in-sample but lead to disaster out-of-sample is the essence of overfitting. An example of this is trying to choose the regularization parameter. If you pick a lower regularization parameter, then you have lower in-sample error, but it can lead to higher out-of-sample error - you picked the
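The regularization example can be sketched in a few lines. This is an illustrative setup, not the book's experiment: the sine target, the noise level, the sample sizes, the degree-10 Legendre features, the three values of lambda, and the helper names `make_data`, `ridge_fit` and `mse` are all assumptions.

```python
import numpy as np
from numpy.polynomial import legendre

rng = np.random.default_rng(0)


def make_data(n, sigma):
    """Sample n points from an assumed target sin(pi * x) plus Gaussian noise."""
    x = rng.uniform(-1.0, 1.0, n)
    y = np.sin(np.pi * x) + sigma * rng.normal(size=n)
    return x, y


def ridge_fit(x, y, degree, lam):
    """Weight-decay (ridge) least squares on Legendre polynomial features."""
    z = legendre.legvander(x, degree)
    a = z.T @ z + lam * np.eye(degree + 1)
    return np.linalg.solve(a, z.T @ y)


def mse(w, x, y):
    """Mean squared error of the fitted weights w on the points (x, y)."""
    z = legendre.legvander(x, len(w) - 1)
    return float(np.mean((z @ w - y) ** 2))


x_train, y_train = make_data(15, sigma=0.5)    # small, noisy training set
x_test, y_test = make_data(2000, sigma=0.5)    # large set standing in for out-of-sample

results = []
for lam in [1e-4, 1e-2, 1.0]:
    w = ridge_fit(x_train, y_train, degree=10, lam=lam)
    results.append((lam, mse(w, x_train, y_train), mse(w, x_test, y_test)))

for lam, e_in, e_out in results:
    print(f"lambda={lam:g}  E_in={e_in:.4f}  E_out={e_out:.4f}")
```

The in-sample column can only grow as lambda grows, so selecting lambda by in-sample error always favors the weakest regularization. Whether that choice is also the worst out-of-sample depends on the data drawn, which is why the out-of-sample column is estimated on separate points.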
Now let’s get back to the subtle part of your question. There is actually another way to decrease the deterministic noise - increase the complexity of
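The passage above is taken mainly from the book "Learning from Data". It explains what overfitting means and how noise contributes to overfitting.
Below is a good summary of overfitting:
Large VC dimension => high model complexity => small in-sample error => model not smooth enough => weak generalization => large out-of-sample error => overfitting => the model is of no practical use.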
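The first links of this chain can be checked numerically. The sketch below fits a low-degree and a high-degree polynomial to the same small noisy sample and reports both errors. The sine target, the noise level, the sample sizes, the two degrees, and the helper name `fit_and_score` are assumptions made for illustration.

```python
import numpy as np
from numpy.polynomial import legendre

rng = np.random.default_rng(1)

# Assumed setup: 15 noisy training points, 2000 fresh points to estimate E_out.
x_train = rng.uniform(-1.0, 1.0, 15)
y_train = np.sin(np.pi * x_train) + 0.5 * rng.normal(size=15)
x_test = rng.uniform(-1.0, 1.0, 2000)
y_test = np.sin(np.pi * x_test) + 0.5 * rng.normal(size=2000)


def fit_and_score(degree):
    """Least-squares polynomial fit of the given degree; returns (E_in, E_out)."""
    coef = legendre.legfit(x_train, y_train, degree)
    e_in = float(np.mean((legendre.legval(x_train, coef) - y_train) ** 2))
    e_out = float(np.mean((legendre.legval(x_test, coef) - y_test) ** 2))
    return e_in, e_out


e_in_2, e_out_2 = fit_and_score(2)      # simpler model
e_in_10, e_out_10 = fit_and_score(10)   # more complex model

print(f"degree 2 : E_in={e_in_2:.4f}  E_out={e_out_2:.4f}")
print(f"degree 10: E_in={e_in_10:.4f}  E_out={e_out_10:.4f}")
```

The degree-10 fit cannot have a higher in-sample error than the degree-2 fit, because degree-2 polynomials are a subset of degree-10 polynomials. The gap between E_in and E_out is what the rest of the chain describes, and its size varies with the sample.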
In summary, deterministic noise is due to your choice of
A deterministic function can be used to generate pseudo-random numbers (a pseudo-random generator).
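As a small illustration of a deterministic function producing numbers that look random, here is a sketch of a linear congruential generator. The function names `lcg` and `take` are hypothetical, and the constants are the widely published Numerical Recipes parameters, used here only as an example.

```python
def lcg(seed, a=1664525, c=1013904223, m=2**32):
    """Yield pseudo-random floats in [0, 1) from the recurrence s -> (a*s + c) mod m."""
    state = seed % m
    while True:
        state = (a * state + c) % m  # fully deterministic update
        yield state / m


def take(gen, n):
    """Collect the next n values from a generator."""
    return [next(gen) for _ in range(n)]


first = take(lcg(42), 5)
again = take(lcg(42), 5)    # same seed, so exactly the same sequence
other = take(lcg(43), 5)    # different seed, different sequence

print(first)
print(first == again)
```

Nothing in the sequence is random: the same seed always reproduces the same values. The output only looks like noise to an observer who cannot model the rule that generated it.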
For a detailed discussion, see "Learning from Data".
2015-8-27
艺少
Copyright notice: this is an original article by the blogger and may not be reproduced without the blogger's permission.
stochastic noise and deterministic noise
Original URL: http://blog.csdn.net/lg1259156776/article/details/48028333