scikit-learn: 4.7. Pairwise metrics, Affinities and Kernels

Posted: 2015-07-26 17:24:43

Reference: http://scikit-learn.org/stable/modules/metrics.html


The sklearn.metrics.pairwise submodule implements utilities to evaluate pairwise distances or affinities of sets of samples.

Distance metrics are functions d(a, b) such that d(a, b) < d(a, c) if objects a and b are considered “more similar” than objects a and c.

Kernels are measures of similarity, i.e. s(a, b) > s(a, c) if objects a and b are considered “more similar” than objects a and c.
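
The opposite orderings of distances and kernels can be seen on a small example. The sketch below is only an illustration: the three points are made-up values, and euclidean_distances and rbf_kernel are chosen here as one representative distance and one representative kernel from sklearn.metrics.pairwise.

```python
import numpy as np
from sklearn.metrics.pairwise import euclidean_distances, rbf_kernel

# Illustrative samples (not from the original article):
# a and b lie close together, c lies far from both.
a = np.array([[0.0, 0.0]])
b = np.array([[0.1, 0.0]])
c = np.array([[3.0, 4.0]])

# Distance metric: the more similar pair gets the SMALLER value.
d_ab = euclidean_distances(a, b)[0, 0]
d_ac = euclidean_distances(a, c)[0, 0]

# Kernel: the more similar pair gets the LARGER value.
s_ab = rbf_kernel(a, b, gamma=1.0)[0, 0]
s_ac = rbf_kernel(a, c, gamma=1.0)[0, 0]

print("distances:", d_ab, d_ac)
print("kernel values:", s_ab, s_ac)
```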


1、Cosine similarity

cosine_similarity computes the L2-normalized dot product of vectors.

If x and y are row vectors, their cosine similarity k is defined as:

$$k(x, y) = \frac{x y^\top}{\|x\| \|y\|}$$

This kernel is a popular choice for computing the similarity of documents represented as tf-idf vectors.
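
As a quick check of the definition, the sketch below compares cosine_similarity with a manual computation that L2-normalizes each row and then takes dot products. The matrix X is an illustrative assumption, not real tf-idf data.

```python
import numpy as np
from sklearn.metrics.pairwise import cosine_similarity

# Illustrative row vectors (assumed values): rows 0 and 2 are parallel,
# row 1 is orthogonal to both.
X = np.array([[1.0, 0.0, 2.0],
              [0.0, 3.0, 0.0],
              [2.0, 0.0, 4.0]])

# Library result: matrix of pairwise cosine similarities between rows.
K = cosine_similarity(X)

# Manual result: scale every row to unit L2 norm, then take dot products.
X_unit = X / np.linalg.norm(X, axis=1, keepdims=True)
K_manual = X_unit @ X_unit.T

print(np.allclose(K, K_manual))
```

Parallel rows get similarity 1 and orthogonal rows get 0, regardless of their lengths, which is why this measure suits documents of very different sizes.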


2、Linear kernel

If x and y are column vectors, their linear kernel is:

$$k(x, y) = x^\top y$$
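
A minimal sketch of the linear kernel using linear_kernel from sklearn.metrics.pairwise. Note that scikit-learn stores samples as rows, so for data matrices X and Y the kernel matrix is simply X times the transpose of Y. The arrays are illustrative values.

```python
import numpy as np
from sklearn.metrics.pairwise import linear_kernel

# Illustrative data (assumed values): two samples in X, one sample in Y.
X = np.array([[1.0, 2.0],
              [3.0, 4.0]])
Y = np.array([[5.0, 6.0]])

# K[i, j] is the dot product of X[i] and Y[j]; shape is (n_X, n_Y).
K = linear_kernel(X, Y)

# The same values computed directly with NumPy.
K_manual = X @ Y.T

print(K.shape, np.allclose(K, K_manual))
```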


3、Polynomial kernel

Conceptually, the polynomial kernel considers not only the similarity between vectors under the same dimension, but also across dimensions. When used in machine learning algorithms, this makes it possible to account for feature interactions.

The polynomial kernel is defined as:

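$$k(x, y) = (\gamma x^\top y + c_0)^d$$

where x and y are the input vectors, d is the kernel degree, and c_0 is a constant offset. If c_0 = 0 the kernel is said to be homogeneous.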
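
The sketch below checks polynomial_kernel against a direct NumPy evaluation of (gamma * <x, y> + coef0) ** degree, which is how the keyword arguments gamma, coef0 and degree are understood here. The data and parameter values are illustrative assumptions.

```python
import numpy as np
from sklearn.metrics.pairwise import polynomial_kernel

# Illustrative data and parameters (assumed values).
X = np.array([[1.0, 2.0],
              [0.5, -1.0]])
gamma, coef0, degree = 0.5, 1.0, 2

# Library result.
K = polynomial_kernel(X, degree=degree, gamma=gamma, coef0=coef0)

# Manual result: scale the dot products, add the offset, raise to the degree.
K_manual = (gamma * (X @ X.T) + coef0) ** degree

print(np.allclose(K, K_manual))
```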


4、Sigmoid kernel

The sigmoid kernel is defined as:

$$k(x, y) = \tanh(\gamma x^\top y + c_0)$$
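
A small sketch comparing sigmoid_kernel with a direct evaluation of tanh(gamma * <x, y> + coef0). The data and the values of gamma and coef0 are illustrative assumptions.

```python
import numpy as np
from sklearn.metrics.pairwise import sigmoid_kernel

# Illustrative data and parameters (assumed values).
X = np.array([[1.0, 2.0],
              [0.5, -1.0]])
gamma, coef0 = 0.5, 1.0

# Library result.
K = sigmoid_kernel(X, gamma=gamma, coef0=coef0)

# Manual result: hyperbolic tangent of the scaled, shifted dot products.
K_manual = np.tanh(gamma * (X @ X.T) + coef0)

print(np.allclose(K, K_manual))
```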




5、RBF kernel

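The RBF (radial basis function) kernel is defined as:

$$k(x, y) = \exp(-\gamma \|x - y\|^2)$$


If $\gamma = \sigma^{-2}$ the kernel is known as the Gaussian kernel of variance $\sigma^2$.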
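
The following sketch compares rbf_kernel with a manual evaluation of exp(-gamma * ||x - y||^2), using NumPy broadcasting for the squared Euclidean distances. The data and the value of gamma are illustrative assumptions.

```python
import numpy as np
from sklearn.metrics.pairwise import rbf_kernel

# Illustrative data and parameter (assumed values).
X = np.array([[0.0, 0.0],
              [1.0, 0.0],
              [3.0, 4.0]])
gamma = 0.5

# Library result.
K = rbf_kernel(X, gamma=gamma)

# Manual result: squared Euclidean distance between every pair of rows,
# then the negative exponential scaled by gamma.
diff = X[:, None, :] - X[None, :, :]
sq_dist = (diff ** 2).sum(axis=-1)
K_manual = np.exp(-gamma * sq_dist)

print(np.allclose(K, K_manual))
```

Each sample has kernel value 1 with itself, and the value decays toward 0 as the distance between two samples grows.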



6、Chi-squared kernel

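The chi-squared kernel is defined as:

$$k(x, y) = \exp\left(-\gamma \sum_i \frac{(x[i] - y[i])^2}{x[i] + y[i]}\right)$$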
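
As a sanity check, the sketch below evaluates exp(-gamma * sum_i (x[i] - y[i])^2 / (x[i] + y[i])) by hand and compares it with chi2_kernel from sklearn.metrics.pairwise. It assumes non-negative data, and it treats terms where x[i] + y[i] = 0 as contributing nothing, which is an assumption of this sketch needed to avoid dividing by zero.

```python
import numpy as np
from sklearn.metrics.pairwise import chi2_kernel

# Non-negative data, e.g. normalized histograms.
X = np.array([[0.0, 1.0],
              [1.0, 0.0],
              [0.2, 0.8],
              [0.7, 0.3]])
gamma = 0.5

# Library result.
K = chi2_kernel(X, gamma=gamma)

# Manual result, computed for every pair of rows via broadcasting.
diff = X[:, None, :] - X[None, :, :]
total = X[:, None, :] + X[None, :, :]
# Skip terms whose denominator is zero (both entries are 0 there).
terms = np.divide(diff ** 2, total, out=np.zeros_like(total), where=total != 0)
K_manual = np.exp(-gamma * terms.sum(axis=-1))

print(np.allclose(K, K_manual))
```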

The chi-squared kernel is a very popular choice for training non-linear SVMs in computer vision applications. It can be computed using chi2_kernel and then passed to an sklearn.svm.SVC with kernel="precomputed":

>>> from sklearn.svm import SVC
>>> from sklearn.metrics.pairwise import chi2_kernel
>>> X = [[0, 1], [1, 0], [.2, .8], [.7, .3]]
>>> y = [0, 1, 0, 1]
>>> K = chi2_kernel(X, gamma=.5)
>>> K
array([[ 1.        ,  0.36...,  0.89...,  0.58...],
       [ 0.36...,  1.        ,  0.51...,  0.83...],
       [ 0.89...,  0.51...,  1.        ,  0.77...],
       [ 0.58...,  0.83...,  0.77...,  1.        ]])

>>> svm = SVC(kernel='precomputed').fit(K, y)
>>> svm.predict(K)
array([0, 1, 0, 1])

It can also be directly used as the kernel argument:

>>> svm = SVC(kernel=chi2_kernel).fit(X, y)
>>> svm.predict(X)
array([0, 1, 0, 1])
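
One practical point with kernel='precomputed' is predicting on samples that were not in the training set: the matrix passed to predict must hold the kernel values between the new samples (rows) and the training samples (columns). The sketch below illustrates this; X_new is a hypothetical pair of unseen samples, and the names X_train, y_train and K_new are introduced here only for illustration.

```python
from sklearn.svm import SVC
from sklearn.metrics.pairwise import chi2_kernel

# Training data as in the example above.
X_train = [[0, 1], [1, 0], [.2, .8], [.7, .3]]
y_train = [0, 1, 0, 1]

# Hypothetical unseen samples (assumed values, not from the original).
X_new = [[.1, .9], [.9, .1]]

# Fit on the square train-vs-train kernel matrix.
K_train = chi2_kernel(X_train, gamma=.5)
svm = SVC(kernel='precomputed').fit(K_train, y_train)

# Predict with the new-vs-train kernel matrix, shape (n_new, n_train).
K_new = chi2_kernel(X_new, X_train, gamma=.5)
pred = svm.predict(K_new)

print(K_new.shape, pred)
```

Passing the callable directly, as in SVC(kernel=chi2_kernel), avoids this bookkeeping, but it uses the function's default gamma unless the callable is wrapped to set another value.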


Copyright notice: this is an original article by the blog author and may not be reproduced without the author's permission.

Original article: http://blog.csdn.net/mmc2015/article/details/47068895
