
Machine Learning: Data Leakage

Date: 2018-11-03 16:31:36


Many datasets contain features of different types, say text, floats, and dates, where each type of feature requires separate preprocessing or feature extraction steps. Often it is easiest to preprocess data before applying scikit-learn methods, for example using pandas. Processing your data before passing it to scikit-learn might be problematic for one of the following reasons:

Incorporating statistics from test data into the preprocessors makes cross-validation scores unreliable (known as data leakage), for example when a scaler or a missing-value imputer is fitted on the full dataset before it is split.
You may want to include the parameters of the preprocessors in a parameter search.
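
Both points can be illustrated with a minimal sketch, assuming a scikit-learn version that provides SimpleImputer (0.20 or later). The synthetic dataset, the step names ("impute", "scale", "clf"), and the parameter grid are illustrative assumptions, not part of the original text. Wrapping the preprocessors in a Pipeline means they are refit on the training portion of each cross-validation fold, so no test-fold statistics reach them, and their parameters become searchable through the step__parameter naming convention.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, cross_val_score
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic data (an assumption for illustration), with roughly 10% of the
# entries replaced by NaN so that the imputer has something to do.
X, y = make_classification(n_samples=200, n_features=5, random_state=0)
rng = np.random.RandomState(0)
X[rng.rand(*X.shape) < 0.1] = np.nan

# The imputer and scaler live inside the pipeline, so each cross-validation
# split fits them on the training fold only.
pipe = Pipeline([
    ("impute", SimpleImputer(strategy="mean")),
    ("scale", StandardScaler()),
    ("clf", LogisticRegression(max_iter=1000)),
])

# Cross-validation scores without leakage from the held-out folds.
scores = cross_val_score(pipe, X, y, cv=5)

# Preprocessor parameters take part in the search alongside model parameters.
param_grid = {
    "impute__strategy": ["mean", "median"],
    "clf__C": [0.1, 1.0, 10.0],
}
search = GridSearchCV(pipe, param_grid, cv=5)
search.fit(X, y)
best_params = search.best_params_
```

By contrast, calling StandardScaler().fit_transform(X) on the whole dataset and only then cross-validating would let the mean and variance of every held-out fold influence the features the model is trained on.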


Original article: https://www.cnblogs.com/wdmx/p/9900963.html
