This paper proposes a new clustering idea: cluster centers are characterized by a higher density than their neighbors and by a relatively large distance from points with higher density. (In other words: 1. within a cluster, the center is a point of relatively high density; 2. different cluster centers are relatively far apart.)
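Based on this assumption, for each data point $i$ we compute two quantities: its local density $\rho_i$ and its distance $\delta_i$ from points of higher density. Both quantities depend only on the distances $d_{ij}$ between data points. The local density is defined as
$$\rho_i = \sum_j \chi(d_{ij} - d_c),$$
where $\chi(x) = 1$ if $x < 0$ and $\chi(x) = 0$ otherwise, and $d_c$ is a cutoff distance. (The variables that affect $\rho_i$ are $d_{ij}$ and $d_c$.)
$\delta_i$ is measured by computing the minimum distance between point $i$ and any other point with a higher density:
$$\delta_i = \min_{j:\,\rho_j > \rho_i} d_{ij}$$
(that is, among the data points denser than $i$, find the one at the smallest distance). For the point with the highest density, the paper conventionally takes $\delta_i = \max_j d_{ij}$.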
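As a rough illustration of the two quantities defined above, here is a minimal NumPy sketch. The function name `density_and_delta` is hypothetical, and two choices are assumptions rather than anything stated in this post: the point itself is excluded from its own density count, and every point tied for the highest density receives the maximum-distance convention.

```python
import numpy as np

def density_and_delta(dist, d_c):
    """Compute local density rho and distance delta for every point.

    dist: (n, n) symmetric matrix of pairwise distances, zero diagonal.
    d_c:  positive cutoff distance.
    """
    n = dist.shape[0]
    # chi(d_ij - d_c) = 1 exactly when d_ij < d_c; subtract 1 so that a
    # point does not count itself (assumption: d_ii = 0 < d_c).
    rho = (dist < d_c).sum(axis=1) - 1
    delta = np.empty(n)
    for i in range(n):
        higher = rho > rho[i]  # points strictly denser than i
        if higher.any():
            delta[i] = dist[i, higher].min()
        else:
            # No denser point exists: use the largest distance from i.
            delta[i] = dist[i].max()
    return rho, delta
```

Because this step-function density is an integer count, ties are common on small data sets; how to break them is left open here.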
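In actual clustering, one first plots the so-called decision graph, with the local density $\rho$ on the horizontal axis and the distance $\delta$ on the vertical axis; the cluster centers are the points for which both values are relatively large. Then an attachment-like rule is applied: each remaining data point A is assigned to the same cluster as the point that is nearest to A among those with a density higher than A's (similar to following a gradient).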
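The center selection and attachment rule described above might be sketched as follows. This is only an illustration under stated assumptions: the function name `assign_clusters` and the parameter `n_clusters` are hypothetical, and picking the points with the largest product of density and distance is used here as an automatic stand-in for reading the centers off the decision graph by eye. Ties in density are broken by processing order, which is also an assumption.

```python
import numpy as np

def assign_clusters(dist, rho, delta, n_clusters):
    """Pick centers, then attach every other point to a denser neighbor.

    dist: (n, n) distance matrix; rho, delta: length-n arrays.
    n_clusters: number of centers to pick (hypothetical parameter).
    """
    n = len(rho)
    # Stand-in for inspecting the decision graph: take the points whose
    # rho * delta is largest (both values large at the same time).
    centers = np.argsort(-(rho * delta), kind="stable")[:n_clusters]
    labels = np.full(n, -1)
    labels[centers] = np.arange(n_clusters)
    # Visit points in order of decreasing density, so that every point
    # seen earlier is already labeled.
    order = np.argsort(-rho, kind="stable")
    for pos, i in enumerate(order):
        if labels[i] != -1:
            continue  # i is a center
        earlier = order[:pos]  # points with density >= rho[i]
        if earlier.size == 0:
            # Fallback: densest point was not chosen as a center.
            nearest = centers[np.argmin(dist[i, centers])]
        else:
            nearest = earlier[np.argmin(dist[i, earlier])]
        labels[i] = labels[nearest]
    return centers, labels
```

Each point is labeled in a single pass, since its nearest denser neighbor has always been labeled before it is visited.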
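The above is the basic model; a noise model can be added as well. A border region is defined for each cluster: the set of points assigned to that cluster but lying within distance $d_c$ of points belonging to other clusters. Let $\rho_b$ be the highest density found within the border region; points of the cluster whose density is higher than $\rho_b$ form the cluster core, and the points excluded from the core are called halo points.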
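A minimal sketch of this halo rule, assuming cluster labels, densities, and the distance matrix are already available. The name `halo_mask` is hypothetical, and leaving a cluster without any border points entirely halo-free is an assumption of this sketch.

```python
import numpy as np

def halo_mask(dist, rho, labels, d_c):
    """Return a boolean array that is True for halo (noise) points.

    dist: (n, n) distance matrix; rho: densities; labels: cluster ids;
    d_c: cutoff distance.
    """
    n = len(rho)
    # other[i, j] is True when i and j belong to different clusters.
    other = labels[:, None] != labels[None, :]
    # Border points lie within d_c of at least one point of another cluster.
    border = ((dist < d_c) & other).any(axis=1)
    halo = np.zeros(n, dtype=bool)
    for c in np.unique(labels):
        members = labels == c
        in_border = members & border
        if in_border.any():
            rho_b = rho[in_border].max()  # highest density in the border
            # Members not denser than rho_b are excluded from the core.
            halo |= members & (rho <= rho_b)
    return halo
```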
The next steps are mainly to implement this algorithm and then to study the mathematics behind it.
Article link:
http://www.sciencemag.org/content/344/6191/1492.full
Clustering by fast search and find of density peaks
Original post: http://www.cnblogs.com/taokongcn/p/4034348.html