scikit-learn (an introduction to the models used most often in engineering): 1.4. Support Vector Machines

Posted: 2015-08-04 08:13:35

Tags: machine learning, scikit-learn, support vector machi, svm, engineering applications

Reference: http://scikit-learn.org/stable/modules/svm.html

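In real projects we rarely use the simplest models such as LR, kNN, and NB; they are classics, but in engineering they are not very practical.

Today we look at the SVM, which is used relatively often in engineering.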

SVMs cover a lot of ground: support vector machines (SVMs) are a set of supervised learning methods used for classification, regression, and outlier detection.

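Advantages: they are effective in high-dimensional spaces; they remain effective when the number of dimensions exceeds the number of samples; they are memory efficient, because the decision function uses only a subset of the training points (the support vectors); and different kernel functions can be chosen.

Disadvantages: although SVMs still work when the number of dimensions exceeds the number of samples, performance is likely to be very poor if the number of dimensions is far larger than the number of samples; and SVMs do not directly provide probability estimates, which have to be computed with an expensive five-fold cross-validation (see "Scores and probabilities" in the reference above).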

(SVMs accept both dense and sparse sample vectors, but if prediction will use sparse data, the model must also be trained on sparse data. For best performance, use a C-ordered numpy.ndarray (dense) or a scipy.sparse.csr_matrix (sparse) with dtype=float64.)
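As a rough illustration of the input note above, the sketch below fits one model on a C-ordered float64 array and another on a CSR copy of the same data, and predicts with the matching representation in each case. The toy data and the variable names are made up for this example and are not from the original post.

```python
import numpy as np
from scipy import sparse
from sklearn import svm

# Toy data (hypothetical): two well-separated classes.
X_dense = np.array([[0., 0.], [0., 1.], [3., 3.], [4., 4.]],
                   dtype=np.float64, order="C")
y = np.array([0, 0, 1, 1])

# CSR copy of the same samples, for the sparse code path.
X_sparse = sparse.csr_matrix(X_dense)

dense_clf = svm.SVC(kernel="linear").fit(X_dense, y)
sparse_clf = svm.SVC(kernel="linear").fit(X_sparse, y)

# Predict with the same representation each model was trained on.
dense_pred = dense_clf.predict(X_dense)
sparse_pred = sparse_clf.predict(X_sparse)
print(dense_pred, sparse_pred)
```

Keeping training and prediction data in the same representation avoids the dense/sparse mismatch the note warns about.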

1. Classification

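SVC, NuSVC, and LinearSVC are three classes that can perform multi-class classification. The essential difference among them is that they have different mathematical formulations; see the formulas at the end of this article.

Like other classifiers, SVC, NuSVC, and LinearSVC use the fit and predict methods: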

>>> from sklearn import svm
>>> X = [[0, 0], [1, 1]]
>>> y = [0, 1]
>>> clf = svm.SVC()
>>> clf.fit(X, y)  
SVC(C=1.0, cache_size=200, class_weight=None, coef0=0.0, degree=3,
gamma=0.0, kernel='rbf', max_iter=-1, probability=False, random_state=None,
shrinking=True, tol=0.001, verbose=False)

After being fitted, the model can then be used to predict new values:

>>> clf.predict([[2., 2.]])
array([1])

The support-vector properties of a fitted SVM are available through the attributes support_vectors_, support_, and n_support_:

>>> # get support vectors
>>> clf.support_vectors_
array([[ 0.,  0.],
       [ 1.,  1.]])
>>> # get indices of support vectors
>>> clf.support_ 
array([0, 1]...)
>>> # get number of support vectors for each class
>>> clf.n_support_ 
array([1, 1]...)

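For multi-class classification:

SVC and NuSVC use the "one-against-one" approach (training n_class * (n_class - 1) / 2 models), whereas LinearSVC uses the "one-vs-the-rest" strategy (training n_class models). In practice, one-vs-rest is the more common and usually preferred choice, because the results are much the same while training takes far less time.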


>>> X = [[0], [1], [2], [3]]
>>> Y = [0, 1, 2, 3]
>>> clf = svm.SVC()
>>> clf.fit(X, Y)
SVC(C=1.0, cache_size=200, class_weight=None, coef0=0.0, degree=3,
gamma=0.0, kernel='rbf', max_iter=-1, probability=False, random_state=None,
shrinking=True, tol=0.001, verbose=False)
>>> dec = clf.decision_function([[1]])
>>> dec.shape[1] # 4 classes: 4*3/2 = 6
6
>>> lin_clf = svm.LinearSVC()
>>> lin_clf.fit(X, Y)
LinearSVC(C=1.0, class_weight=None, dual=True, fit_intercept=True,
     intercept_scaling=1, loss='squared_hinge', max_iter=1000,
     multi_class='ovr', penalty='l2', random_state=None, tol=0.0001,
     verbose=0)
>>> dec = lin_clf.decision_function([[1]])
>>> dec.shape[1]
4
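The session above reflects the scikit-learn release that was current when this post was written. In more recent releases, SVC still trains one-against-one models internally, but decision_function returns one-vs-rest shaped scores unless decision_function_shape='ovo' is requested. A minimal sketch of the same comparison, assuming a recent scikit-learn (the variable names are illustrative):

```python
from sklearn import svm

X = [[0], [1], [2], [3]]
Y = [0, 1, 2, 3]

# Ask explicitly for pairwise (one-against-one) scores: 4 * 3 / 2 = 6 columns.
ovo_clf = svm.SVC(decision_function_shape="ovo").fit(X, Y)
ovo_shape = ovo_clf.decision_function([[1]]).shape

# One-vs-rest shaped scores: one column per class.
ovr_clf = svm.SVC(decision_function_shape="ovr").fit(X, Y)
ovr_shape = ovr_clf.decision_function([[1]]).shape

# LinearSVC trains one-vs-the-rest models directly.
lin_clf = svm.LinearSVC().fit(X, Y)
lin_shape = lin_clf.decision_function([[1]]).shape

print(ovo_shape, ovr_shape, lin_shape)
```

Setting decision_function_shape explicitly keeps the score layout independent of the installed version's default.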

On confidence in a sample's class membership: the SVC method decision_function gives per-class scores for each sample. There is also the probability option; however, if confidence scores are required but do not have to be probabilities, it is advisable to set probability=False and use decision_function instead of predict_proba (mainly because the theory behind the probability estimates has known flaws).
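To make the two options concrete, here is a small sketch on synthetic two-class data. The data, the query points, and the variable names are assumptions made for illustration; only decision_function, predict_proba, and the probability option come from the text above.

```python
import numpy as np
from sklearn import svm

# Synthetic data (hypothetical): two Gaussian blobs around (-2, -2) and (2, 2).
rng = np.random.RandomState(0)
X = np.vstack([rng.randn(20, 2) - 2.0, rng.randn(20, 2) + 2.0])
y = np.array([0] * 20 + [1] * 20)
X_new = np.array([[-2.0, -2.0], [2.0, 2.0], [0.0, 0.0]])

# Confidence scores without probabilities: cheap to obtain.
# In the binary case the sign of the score picks the class.
score_clf = svm.SVC(probability=False).fit(X, y)
scores = score_clf.decision_function(X_new)

# Probability estimates: fitting is slower because of the internal
# cross-validation used to calibrate them.
proba_clf = svm.SVC(probability=True, random_state=0).fit(X, y)
proba = proba_clf.predict_proba(X_new)

print(scores)
print(proba)
```

When only a ranking or a threshold on confidence is needed, the scores from decision_function are usually enough.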

When classes or individual samples should carry different weights, use the keywords class_weight and sample_weight:

Class weights: SVC (but not NuSVC) implements a keyword class_weight, which is set on the estimator and applied in fit. It is a dictionary of the form {class_label : value}, where value is a floating point number > 0 that sets the parameter C of class class_label to C * value.

Sample weights: SVC, NuSVC, SVR, NuSVR, and OneClassSVM also implement weights for individual samples in the fit method through the keyword sample_weight. Similar to class_weight, these set the parameter C for the i-th example to C * sample_weight[i].
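The two weighting mechanisms described above can be sketched as follows. The toy data and the specific weight values are arbitrary choices for illustration, not recommendations from the original post.

```python
import numpy as np
from sklearn import svm

# Toy data (hypothetical): three samples of class 0, two of class 1.
X = np.array([[0., 0.], [0., 1.], [1., 0.], [3., 3.], [4., 4.]])
y = np.array([0, 0, 0, 1, 1])

# Class weights: C for class 1 becomes C * 10.
cw_clf = svm.SVC(kernel="linear", class_weight={0: 1.0, 1: 10.0})
cw_clf.fit(X, y)

# Sample weights: C for sample i becomes C * sample_weight[i].
sample_weight = np.array([1.0, 1.0, 1.0, 5.0, 5.0])
sw_clf = svm.SVC(kernel="linear")
sw_clf.fit(X, y, sample_weight=sample_weight)

print(cw_clf.class_weight_)           # per-class multipliers of C
print(sw_clf.predict([[3.5, 3.5]]))
```

Class weights are given when the estimator is constructed, while sample weights are passed to fit together with the training data.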

Finally, a few examples can be found in the scikit-learn documentation referenced above.

2. Regression

Support Vector Regression.

See whether this sentence makes sense: analogously (to support vector classification), the model produced by Support Vector Regression depends only on a subset of the training data, because the cost function for building the model ignores any training data close to the model prediction.

Again there are three models: SVR, NuSVR, and LinearSVR.

>>> from sklearn import svm
>>> X = [[0, 0], [2, 2]]
>>> y = [0.5, 2.5]
>>> clf = svm.SVR()
>>> clf.fit(X, y) 
SVR(C=1.0, cache_size=200, coef0=0.0, degree=3, epsilon=0.1, gamma=0.0,
    kernel='rbf', max_iter=-1, shrinking=True, tol=0.001, verbose=False)
>>> clf.predict([[1, 1]])
array([ 1.5])
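The remark that SVR ignores training points close to the model prediction can be checked with a small experiment: points whose residual stays inside the epsilon tube do not become support vectors, so a wider tube should leave fewer of them. The sketch below uses made-up noisy linear data and arbitrary epsilon values.

```python
import numpy as np
from sklearn import svm

# Synthetic data (hypothetical): y = 2x plus small Gaussian noise.
rng = np.random.RandomState(0)
X = np.linspace(0.0, 5.0, 40).reshape(-1, 1)
y = 2.0 * X.ravel() + 0.1 * rng.randn(40)

# Narrow tube: most residuals exceed epsilon, so most points matter.
narrow = svm.SVR(kernel="linear", epsilon=0.01).fit(X, y)

# Wide tube: most points lie inside it and are ignored by the cost function.
wide = svm.SVR(kernel="linear", epsilon=0.5).fit(X, y)

n_narrow = len(narrow.support_)
n_wide = len(wide.support_)
print(n_narrow, n_wide)
```

The epsilon parameter therefore trades the tolerated error against the number of training points the model depends on.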

An example:

Support Vector Regression (SVR) using linear and non-linear kernels

To be continued...

Original post: http://blog.csdn.net/mmc2015/article/details/47271039
