
Linear regression with sklearn

Posted: 2020-05-05 23:28:54


1. Polynomial Regression

"Univariate polynomial regression" has a single independent variable; "multivariate polynomial regression" has several.

Univariate polynomial of degree n: $\hat{y}=w_{0}+w_{1} x+ w_{2} x^{2}+\cdots+w_{n} x^{n}$

Multivariate polynomial of higher degree (taking the bivariate quadratic as an example): $\hat{y}=w_{0}+w_{1} x_{1}+ w_{2} x_{2}+w_{3}x_{1}^{2}+w_{4}x_{2}^{2}+w_{5} x_{1}x_{2}$
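The bivariate quadratic is nonlinear in $x_{1}$ and $x_{2}$ but linear in the weights, which is why a linear regression model can fit it. A minimal sketch of that idea follows; the helper name `quadratic_features` and the weight values are made up purely for illustration.

```python
import numpy as np

def quadratic_features(x1, x2):
    """Hypothetical helper: expand (x1, x2) into [1, x1, x2, x1^2, x2^2, x1*x2]."""
    return np.array([1.0, x1, x2, x1 ** 2, x2 ** 2, x1 * x2])

# Illustrative weights w0..w5 (assumed values, not fitted from any data)
w = np.array([0.5, 1.0, -2.0, 0.3, 0.1, -0.7])

# The prediction is a plain dot product between weights and expanded features
y_hat = w @ quadratic_features(2.0, 3.0)

# Same value written out term by term
direct = 0.5 + 1.0 * 2.0 - 2.0 * 3.0 + 0.3 * 2.0 ** 2 + 0.1 * 3.0 ** 2 - 0.7 * 2.0 * 3.0
print(y_hat, direct)
```

Once the inputs are expanded this way, fitting the polynomial reduces to ordinary linear regression on the expanded features.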

2. Fitting a first-degree polynomial in n variables with sklearn

$\hat{y}=w_{0}+\sum_{i=1}^{n}w_{i} x_{i}$

from sklearn.linear_model import LinearRegression

datasets_X = []  # e.g. [[1,2,3],[4,5,6],[7,8,9]]
datasets_Y = []  # e.g. [6,15,24]
# Each line of train.dat holds tab-separated features followed by the target value
with open('train.dat', 'r') as fr:
    for line in fr:
        items = line.strip().split('\t')
        datasets_X.append([float(ele) for ele in items[:-1]])
        datasets_Y.append(float(items[-1]))
model = LinearRegression()
model.fit(datasets_X, datasets_Y)
# Load the test data; here the training data is simply reused
X_test = datasets_X
y_test = datasets_Y
predictions = model.predict(X_test)
for i, prediction in enumerate(predictions):
    print('Predict_value: %s, True_value: %s' % (prediction, y_test[i]))
print('R2-squared: %.2f' % model.score(X_test, y_test))
print(model.coef_)  # weights
print(model.intercept_)  # bias

model.coef_ is the array of weights, corresponding to $w_{1}$, $w_{2}$, $\cdots$, $w_{n}$; model.intercept_ is the bias $w_{0}$.
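One way to check this correspondence is to rebuild the predictions by hand from `coef_` and `intercept_`. The sketch below uses synthetic, noise-free data generated from assumed weights (the values in `true_w` and `true_b` are invented for the example), so the fitted parameters should recover them up to floating-point error.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Synthetic data (an assumption for this sketch): y = 2 + 1.5*x1 - 0.5*x2 + 3*x3
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
true_w = np.array([1.5, -0.5, 3.0])
true_b = 2.0
y = true_b + X @ true_w

model = LinearRegression().fit(X, y)

# Rebuild the predictions by hand: y_hat = w_0 + sum_i w_i * x_i
manual = model.intercept_ + X @ model.coef_

print(np.allclose(manual, model.predict(X)))
print(model.coef_, model.intercept_)
```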

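3. Fitting a degree-n polynomial in n variables with sklearn

Take a cubic polynomial in three variables as an example.

It has 19 feature terms in total, plus one constant bias term.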

$\textbf{f}=[x_{1},\ x_{2},\ x_{3},\ x_{1}^{2},\ x_{2}^{2},\ x_{3}^{2},\ x_{1}x_{2},\ x_{2}x_{3},\ x_{1}x_{3},\ x_{1}^{2}x_{2},\ x_{1}^{2}x_{3},\ x_{2}^{2}x_{1},\ x_{2}^{2}x_{3},\ x_{3}^{2}x_{1},\ x_{3}^{2}x_{2},\ x_{1}^{3},\ x_{2}^{3},\ x_{3}^{3},\ x_{1}x_{2}x_{3}]$

$\hat{y}=w_{0}+\sum_{i=1}^{19}w_{i} f_{i}$
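The count of 19 terms plus a constant can be checked against what `PolynomialFeatures` generates. The sketch below is one possible way to list the terms; `term_name` is a hypothetical helper written for this example. It reads the exponent table `powers_` rather than relying on a feature-name method, and the order sklearn uses differs from the order of $\textbf{f}$ above.

```python
from math import comb

import numpy as np
from sklearn.preprocessing import PolynomialFeatures

n_vars, degree = 3, 3
poly = PolynomialFeatures(degree=degree)
poly.fit(np.zeros((1, n_vars)))  # only the number of columns matters for fitting

def term_name(powers):
    """Hypothetical helper: turn an exponent row such as [2, 0, 1] into 'x1^2*x3'."""
    parts = []
    for i, p in enumerate(powers, start=1):
        if p == 1:
            parts.append(f"x{i}")
        elif p > 1:
            parts.append(f"x{i}^{p}")
    return "*".join(parts) if parts else "1"  # the all-zero row is the constant term

names = [term_name(row) for row in poly.powers_]

# Number of monomials of total degree <= 3 in 3 variables is C(3 + 3, 3)
print(poly.n_output_features_, comb(n_vars + degree, degree))
print(names)
```

The first generated column is the constant term, so the remaining columns are the 19 features.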

 

from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import PolynomialFeatures
datasets_X = [[1,2,3],[4,5,6],[7,8,9]]
datasets_Y = [6,15,24]
# Expand each sample into all monomials up to degree 3 (including a constant column)
poly_feat = PolynomialFeatures(degree=3)
datasets_X_poly = poly_feat.fit_transform(datasets_X)
model = LinearRegression()
model.fit(datasets_X_poly, datasets_Y)
print(model.coef_)  # weights
print(model.intercept_)  # bias

# Output. coef_ has 20 entries because PolynomialFeatures adds a constant column
# by default; the first entry (about 0) belongs to that column.
# model.coef_ = [-5.86336535e-16  4.23053648e-03  4.23053648e-03  4.23053648e-03
#   9.72135088e-03  1.39518874e-02  1.81824238e-02  1.81824238e-02
#   2.24129603e-02  2.66434968e-02 -4.83347121e-02 -3.86133612e-02
#  -2.88920103e-02 -2.46614738e-02 -1.07095864e-02  7.47283740e-03
#  -6.47904996e-03  1.17033739e-02  3.41163342e-02  6.07598310e-02]
# model.intercept_ = 3.400113258456912
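With only three samples and 20 expanded columns, the fit above is underdetermined, so the printed weights are just one of many solutions that reproduce the training data. The sketch below is a variation, not the original post's code: it assumes a made-up cubic `target` function, draws 200 synthetic samples, and chains the expansion and the regression with `make_pipeline` so that new inputs are expanded the same way at prediction time. Setting `include_bias=False` drops the constant column, leaving `intercept_` as the only bias term.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

def target(X):
    """Assumed ground truth for this sketch: a cubic in three variables."""
    x1, x2, x3 = X[:, 0], X[:, 1], X[:, 2]
    return 1.0 + 2.0 * x1 - x2 * x3 + 0.5 * x3 ** 3

rng = np.random.default_rng(0)
X_train = rng.uniform(-1, 1, size=(200, 3))
X_new = rng.uniform(-1, 1, size=(5, 3))

# The pipeline applies the polynomial expansion before every fit and predict call
pipe = make_pipeline(
    PolynomialFeatures(degree=3, include_bias=False),
    LinearRegression(),
)
pipe.fit(X_train, target(X_train))

reg = pipe.named_steps["linearregression"]
print(reg.coef_.shape)  # one weight per non-constant polynomial term
print(np.allclose(pipe.predict(X_new), target(X_new)))
```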

 


Original article: https://www.cnblogs.com/sunupo/p/12833508.html
