Least Squares & Nearest Neighbors

Date: 2016-03-05 13:00:22

1. Linear models and least squares

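·Linear regression can also be used for simple classification: code the two classes as 0 and 1, fit by least squares, and predict class 1 where the fitted value exceeds 0.5. The resulting decision boundary is linear, so it is simple, but the model can be inaccurate when the true boundary is not linear.

·Problem:

ESL, p. 13: two scenarios for how the data were generated. In scenario 1 each class comes from a single bivariate Gaussian, and a linear boundary is close to the best one can do. In scenario 2 each class comes from a mixture of 10 low-variance Gaussians, and a linear boundary is unlikely to be optimal.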
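The classification-by-regression idea above can be sketched as follows. This is a minimal illustration, not ESL's own code: the synthetic data (one Gaussian per class, loosely in the spirit of scenario 1), the class means, and the sample size are all assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic two-class data (an assumption for illustration):
# one Gaussian per class, with different means.
n = 100
X0 = rng.normal(loc=[0.0, 0.0], scale=1.0, size=(n, 2))  # class 0
X1 = rng.normal(loc=[3.0, 3.0], scale=1.0, size=(n, 2))  # class 1
X = np.vstack([X0, X1])
y = np.concatenate([np.zeros(n), np.ones(n)])

# Prepend a column of ones so the fit includes an intercept.
A = np.column_stack([np.ones(len(X)), X])

# Ordinary least squares: minimise ||A @ beta - y||^2.
beta, *_ = np.linalg.lstsq(A, y, rcond=None)

# Predict class 1 where the fitted value exceeds 0.5. The boundary
# beta[0] + beta[1]*x1 + beta[2]*x2 = 0.5 is a straight line.
y_hat = (A @ beta > 0.5).astype(int)
train_error = float(np.mean(y_hat != y))

print("coefficients:", beta)
print("training error:", train_error)
```

With well-separated Gaussians like these the linear boundary should misclassify only a few points; on mixture-style data (scenario 2) the same code would likely do noticeably worse.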


·scikit-learn:

sklearn.linear_model.LinearRegression()

class LinearRegression(LinearModel, RegressorMixin):
    """
    Ordinary least squares Linear Regression.

    Parameters
    ----------
    fit_intercept : (fit the intercept) boolean, optional
        whether to calculate the intercept for this model. If set
        to false, no intercept will be used in calculations
        (e.g. data is expected to be already centered).

    normalize : boolean, optional, default False
        If True, the regressors X will be normalized before regression.

    copy_X : boolean, optional, default True
        If True, X will be copied; else, it may be overwritten.

    n_jobs : int, optional, default 1
        The number of jobs to use for the computation.
        If -1 all CPUs are used. This will only provide speedup for
        n_targets > 1 and sufficiently large problems.

    Attributes
    ----------
    coef_ : (coefficients, i.e. slopes) array, shape (n_features, ) or (n_targets, n_features)
        Estimated coefficients for the linear regression problem.
        If multiple targets are passed during the fit (y 2D), this
        is a 2D array of shape (n_targets, n_features), while if only
        one target is passed, this is a 1D array of length n_features.

    intercept_ : (intercept) array
        Independent term in the linear model.
    """
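
As a usage sketch for the class documented above, the following fits `LinearRegression` on a small toy data set and reads back `coef_` and `intercept_`. The data are made up for illustration (generated exactly from y = 1 + 2*x1 + 3*x2, with no noise), so the fitted slopes and intercept should come out close to those values.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Toy data (an assumption for illustration): y = 1 + 2*x1 + 3*x2, no noise.
X = np.array([[0.0, 0.0],
              [1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0],
              [2.0, 1.0]])
y = 1.0 + 2.0 * X[:, 0] + 3.0 * X[:, 1]

# fit_intercept=True asks the model to estimate the independent term too.
model = LinearRegression(fit_intercept=True)
model.fit(X, y)

print("coef_:", model.coef_)            # slopes, one per feature
print("intercept_:", model.intercept_)  # independent term

# Prediction for a new point (x1, x2) = (3, 2).
prediction = model.predict(np.array([[3.0, 2.0]]))
print("prediction:", prediction)
```

Because only one target is passed, `coef_` is a 1D array of length `n_features`, as the docstring describes.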

2. Nearest-neighbor models

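·Closeness is measured by the Euclidean distance between samples.

·k-nearest-neighbors: a smaller k lowers the training error (k = 1 classifies every training point correctly), but it can overfit, so the test error may rise; a larger k gives a smoother fit with a higher training error. The effective number of parameters is N/k rather than k. The figure below shows the relationship among N/k, k, and the error.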

[Figure: error as a function of k and N/k]
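
To make the role of k concrete, here is a minimal sketch of a k-nearest-neighbor classifier using Euclidean distance and a majority vote. The helper `knn_predict`, the synthetic data, and the chosen values of k are assumptions for illustration only; odd values of k are used so that the vote cannot tie.

```python
import numpy as np

def knn_predict(X_train, y_train, X_query, k):
    """Majority-vote k-NN with Euclidean distance (labels must be 0 or 1)."""
    # Pairwise squared Euclidean distances, shape (n_query, n_train).
    # The square root is skipped because it does not change the ranking.
    diff = X_query[:, None, :] - X_train[None, :, :]
    dist2 = np.sum(diff ** 2, axis=2)
    # Indices of the k closest training points for each query point.
    nearest = np.argsort(dist2, axis=1)[:, :k]
    # Average the 0/1 labels of the neighbours and threshold at 0.5.
    return (y_train[nearest].mean(axis=1) > 0.5).astype(int)

def make_data(rng, n_per_class):
    # Two overlapping Gaussian classes (an assumption for illustration).
    X = np.vstack([rng.normal(0.0, 1.0, size=(n_per_class, 2)),
                   rng.normal(1.5, 1.0, size=(n_per_class, 2))])
    y = np.concatenate([np.zeros(n_per_class, dtype=int),
                        np.ones(n_per_class, dtype=int)])
    return X, y

rng = np.random.default_rng(0)
X_train, y_train = make_data(rng, 100)   # N = 200 training points
X_test, y_test = make_data(rng, 500)     # held-out points
N = len(X_train)

errors = {}
for k in (1, 15, 101):
    train_err = float(np.mean(knn_predict(X_train, y_train, X_train, k) != y_train))
    test_err = float(np.mean(knn_predict(X_train, y_train, X_test, k) != y_test))
    errors[k] = (train_err, test_err)
    print(f"k={k:3d}  N/k={N / k:6.1f}  train error={train_err:.3f}  test error={test_err:.3f}")
```

With k = 1 every training point is its own nearest neighbor, so the training error is zero even though the test error is not; comparing the train and test columns across k gives a rough picture of the trade-off described above.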