The sklearn package

Date: 2019-12-16 17:46:37

6.3. Preprocessing data
https://scikit-learn.org/stable/modules/preprocessing.html#standardization-or-mean-removal-and-variance-scaling
Differences between normalization, regularization, and standardization:
https://blog.csdn.net/tianguiyuyu/article/details/80694669
6.3.1. Standardization, or mean removal and variance scaling (zero mean, unit variance)
preprocessing.scale
preprocessing.StandardScaler: once fitted on the training samples, the same transformation can be applied to the test samples
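The fit-on-train, apply-to-test pattern can be sketched as follows. The toy arrays are made up for illustration; only `StandardScaler`, `fit`, and `transform` come from the library.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Illustrative toy data (not from the original notes).
X_train = np.array([[1.0, -1.0, 2.0],
                    [2.0, 0.0, 0.0],
                    [0.0, 1.0, -1.0]])
X_test = np.array([[-1.0, 1.0, 0.0]])

# Learn the per-feature mean and standard deviation from the training set only.
scaler = StandardScaler().fit(X_train)

# Reuse those training statistics for both sets.
X_train_scaled = scaler.transform(X_train)
X_test_scaled = scaler.transform(X_test)

print(X_train_scaled.mean(axis=0))  # approximately 0 for every feature
print(X_train_scaled.std(axis=0))   # approximately 1 for every feature
print(X_test_scaled)
```

`preprocessing.scale` performs the same computation in one call but keeps no fitted state, so it cannot replay the training statistics on new data.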
6.3.1.1. Scaling features to a range
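preprocessing.MinMaxScaler: scales each feature to lie between a given minimum and maximum value
preprocessing.MaxAbsScaler: divides each feature by its maximum absolute value, so the data falls within [-1, 1]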
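A minimal sketch comparing the two range scalers on the same made-up array. `feature_range=(0, 1)` is the library default and is spelled out only for clarity.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler, MaxAbsScaler

# Illustrative toy data (not from the original notes).
X = np.array([[1.0, -1.0, 2.0],
              [2.0, 0.0, 0.0],
              [0.0, 1.0, -1.0]])

# Each feature is mapped linearly so that its min becomes 0 and its max becomes 1.
X_minmax = MinMaxScaler(feature_range=(0, 1)).fit_transform(X)

# Each feature is divided by its largest absolute value; no shifting is done.
X_maxabs = MaxAbsScaler().fit_transform(X)

print(X_minmax)
print(X_maxabs)
```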
6.3.1.2. Scaling sparse data
preprocessing.MaxAbsScaler (used through the Transformer API, i.e. fit/transform)
preprocessing.maxabs_scale
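Because max-abs scaling only divides and never shifts, zeros stay zeros and a sparse matrix stays sparse. A small sketch, assuming SciPy's CSR format and made-up values:

```python
import numpy as np
from scipy import sparse
from sklearn.preprocessing import MaxAbsScaler, maxabs_scale

# Illustrative sparse input (values are made up).
X_sparse = sparse.csr_matrix(np.array([[0.0, 4.0, 0.0],
                                       [2.0, 0.0, 0.0],
                                       [0.0, -8.0, 1.0]]))

# Transformer API: fit once, then transform (reusable on new data).
scaler = MaxAbsScaler().fit(X_sparse)
X_scaled = scaler.transform(X_sparse)

# Function form: one-shot scaling with no fitted state.
X_scaled_fn = maxabs_scale(X_sparse)

print(sparse.issparse(X_scaled))  # the result is still a sparse matrix
print(X_scaled.toarray())
```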
6.3.1.3. Scaling data with outliers
robust_scale
RobustScaler (used through the Transformer API, i.e. fit/transform)
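A short sketch of why the robust variants help with outliers: they center on the median and scale by the interquartile range, so a single extreme value barely affects how the remaining points are scaled. The data is made up for illustration.

```python
import numpy as np
from sklearn.preprocessing import RobustScaler, robust_scale

# One feature with an obvious outlier (illustrative values).
X = np.array([[1.0], [2.0], [3.0], [4.0], [100.0]])

# Transformer API: the median and IQR are learned in fit and reused in transform.
robust = RobustScaler().fit(X)
X_robust = robust.transform(X)

# Function form: same computation, no fitted state.
X_robust_fn = robust_scale(X)

print(X_robust.ravel())
```

Here the median is 3 and the interquartile range is 2, so the four ordinary points land in [-1, 0.5] while the outlier stays far away instead of compressing the rest.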
6.3.1.4. Centering kernel matrices
KernelCenterer
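Centering a kernel matrix corresponds to centering the data in the feature space implied by the kernel. With a linear kernel that feature space is the input space itself, which gives a simple way to check the result. A sketch with made-up data:

```python
import numpy as np
from sklearn.metrics.pairwise import linear_kernel
from sklearn.preprocessing import KernelCenterer

# Illustrative toy data (not from the original notes).
X = np.array([[1.0, -2.0, 2.0],
              [-2.0, 1.0, 3.0],
              [4.0, 1.0, -2.0]])

# Gram matrix of the raw (uncentered) data.
K = linear_kernel(X)

# Center the kernel matrix directly, without touching X.
K_centered = KernelCenterer().fit_transform(K)

# For a linear kernel this should match the Gram matrix of mean-centered X.
X_centered = X - X.mean(axis=0)
print(np.allclose(K_centered, X_centered @ X_centered.T))
```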
6.3.2. Non-linear transformation
6.3.2.1. Mapping to a Uniform distribution
QuantileTransformer
quantile_transform
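A minimal sketch of mapping a skewed feature onto a uniform distribution. The log-normal sample, `n_quantiles=100`, and the seeds are illustrative choices, not values from the original notes.

```python
import numpy as np
from sklearn.preprocessing import QuantileTransformer, quantile_transform

# Skewed, strictly positive sample (illustrative).
rng = np.random.RandomState(0)
X = rng.lognormal(size=(1000, 1))

# Transformer API: quantiles are estimated in fit and reused in transform.
qt = QuantileTransformer(n_quantiles=100, output_distribution="uniform",
                         random_state=0)
X_uniform = qt.fit_transform(X)

# Function form: one-shot mapping with no fitted state.
X_uniform_fn = quantile_transform(X, n_quantiles=100, random_state=0, copy=True)

print(X_uniform.min(), X_uniform.max(), X_uniform.mean())
```

Passing `output_distribution="normal"` instead maps the same quantiles onto a Gaussian.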
6.3.2.2. Mapping to a Gaussian distribution
PowerTransformer
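A sketch of making a skewed feature more Gaussian-like with a power transform. The sample is made up; `method="yeo-johnson"` is the library default, and `"box-cox"` is the alternative that requires strictly positive data.

```python
import numpy as np
from sklearn.preprocessing import PowerTransformer

# Skewed sample (illustrative).
rng = np.random.RandomState(0)
X = rng.lognormal(size=(1000, 1))

# The exponent (lambda) is estimated per feature by maximum likelihood in fit;
# standardize=True then rescales the output to zero mean and unit variance.
pt = PowerTransformer(method="yeo-johnson", standardize=True)
X_gauss = pt.fit_transform(X)

print(pt.lambdas_)                    # one fitted lambda per feature
print(X_gauss.mean(), X_gauss.std())  # approximately 0 and 1
```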
6.3.3. Normalization
Normalization is the process of scaling individual samples to have unit norm.
normalize
Normalizer (used through the Transformer API, i.e. fit/transform)
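Unlike the scalers above, normalization works row by row: each sample is divided by its own norm. A small sketch with made-up data and the L2 norm:

```python
import numpy as np
from sklearn.preprocessing import Normalizer, normalize

# Illustrative toy data (not from the original notes).
X = np.array([[1.0, -1.0, 2.0],
              [2.0, 0.0, 0.0],
              [0.0, 1.0, -1.0]])

# Function form.
X_l2 = normalize(X, norm="l2")

# Transformer form; fit learns nothing, it exists so the step fits in a Pipeline.
X_l2_tf = Normalizer(norm="l2").fit(X).transform(X)

print(np.linalg.norm(X_l2, axis=1))  # every row now has unit length
```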
6.3.4. Encoding categorical features
OrdinalEncoder (ordinal encoding)
OneHotEncoder
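A sketch contrasting the two encoders on the same made-up categorical rows. `handle_unknown="ignore"` is an optional setting chosen here so that unseen categories encode as all zeros instead of raising an error.

```python
from sklearn.preprocessing import OneHotEncoder, OrdinalEncoder

# Illustrative categorical data: one row per sample, one column per feature.
X = [["male", "from US", "uses Safari"],
     ["female", "from Europe", "uses Firefox"]]
sample = [["female", "from US", "uses Safari"]]

# Ordinal encoding: each category becomes an integer code (sorted order per column).
ordinal = OrdinalEncoder().fit(X)
ordinal_codes = ordinal.transform(sample)

# One-hot encoding: each category becomes its own 0/1 column.
onehot = OneHotEncoder(handle_unknown="ignore").fit(X)
onehot_codes = onehot.transform(sample).toarray()

print(ordinal.categories_)
print(ordinal_codes)
print(onehot_codes)
```

Ordinal codes imply an order between categories, which many estimators will treat as meaningful; one-hot columns avoid that at the cost of more features.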
6.3.5. Discretization
For instance, pre-processing with a discretizer can introduce nonlinearity to linear models.
6.3.5.1. K-bins discretization
The ‘uniform’ strategy uses constant-width bins. The ‘quantile’ strategy uses quantile values so that each feature has equally populated bins. The ‘kmeans’ strategy defines bins based on a k-means clustering procedure performed on each feature independently.
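The three strategies can be compared on a single skewed feature. This is a sketch with made-up values; `n_bins=3` and `encode="ordinal"` are illustrative choices, and some scikit-learn versions may emit warnings about changing defaults.

```python
import numpy as np
from sklearn.preprocessing import KBinsDiscretizer

# One feature with a long right tail (illustrative values).
X = np.array([[0.0], [1.0], [2.0], [3.0], [4.0], [10.0]])

binned = {}
for strategy in ("uniform", "quantile", "kmeans"):
    est = KBinsDiscretizer(n_bins=3, encode="ordinal", strategy=strategy)
    # encode="ordinal" returns the bin index of each sample as a float.
    binned[strategy] = est.fit_transform(X).ravel()
    print(strategy, binned[strategy], est.bin_edges_[0])
```

With constant-width bins the edges fall at 0, 10/3, 20/3 and 10, so most points crowd into the first bin; the other two strategies adapt their edges to where the data actually lies.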
6.3.5.2. Feature binarization
preprocessing.Binarizer(threshold=1.1)
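A minimal sketch of thresholding with the `threshold=1.1` setting noted above, on a made-up array: values strictly greater than the threshold become 1, everything else becomes 0.

```python
import numpy as np
from sklearn.preprocessing import Binarizer

# Illustrative toy data (not from the original notes).
X = np.array([[1.0, -1.0, 2.0],
              [2.0, 0.0, 0.0],
              [0.0, 1.0, -1.0]])

# fit learns nothing here; fit_transform is used so the step behaves like any other transformer.
binarizer = Binarizer(threshold=1.1)
X_bin = binarizer.fit_transform(X)

print(X_bin)
```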
6.3.6. Imputation of missing values
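The notes list no class for this section. One common option, assumed here rather than taken from the original, is `SimpleImputer` from `sklearn.impute`, which fills each missing entry with a per-feature statistic. A sketch with made-up data:

```python
import numpy as np
from sklearn.impute import SimpleImputer

# Illustrative data with one missing value.
X = np.array([[1.0, 2.0],
              [np.nan, 3.0],
              [7.0, 6.0]])

# strategy="mean" replaces each NaN with the mean of its column,
# computed from the non-missing entries seen in fit.
imputer = SimpleImputer(missing_values=np.nan, strategy="mean")
X_imputed = imputer.fit_transform(X)

print(X_imputed)
```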
6.3.7. Generating polynomial features
from sklearn.preprocessing import PolynomialFeatures
PolynomialFeatures(degree=3, interaction_only=True)
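To see what `interaction_only=True` does with `degree=3`, here is a sketch on a made-up 3-feature array. Only products of distinct features are kept, so no squared or cubed terms appear.

```python
import numpy as np
from sklearn.preprocessing import PolynomialFeatures

# Illustrative input: 3 samples, 3 features.
X = np.arange(9).reshape(3, 3)

poly = PolynomialFeatures(degree=3, interaction_only=True)
X_poly = poly.fit_transform(X)

# Columns: 1, x0, x1, x2, x0*x1, x0*x2, x1*x2, x0*x1*x2
print(X_poly.shape)
print(X_poly)
```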
6.3.8. Custom transformers
FunctionTransformer: converts an existing Python function into a transformer to assist in data cleaning or processing.
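A small sketch of wrapping an existing function as a transformer using `sklearn.preprocessing.FunctionTransformer`. The choice of `np.log1p` and the input values are illustrative.

```python
import numpy as np
from sklearn.preprocessing import FunctionTransformer

# Illustrative toy data (not from the original notes).
X = np.array([[0.0, 1.0],
              [2.0, 3.0]])

# Wrap a plain function; validate=True checks that the input is a 2D numeric array.
log_transformer = FunctionTransformer(np.log1p, validate=True)
X_log = log_transformer.fit_transform(X)

print(X_log)
```

The wrapped function can then sit in a `Pipeline` next to the built-in transformers.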

Original article: https://www.cnblogs.com/ironan-liu/p/12050019.html
