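k-nearest neighbors (k-NN): for a new input instance, find the k instances in the training set that are closest to it. The input is then assigned to the class that the majority of those k instances belong to (the "majority vote" rule).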
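The rule above can be sketched in a few lines of plain Python. This is only an illustrative sketch, not the listing from this post: the function name `knn_predict`, the toy points, and the labels `x`/`y` are made up for the example, Euclidean distance is assumed, and `math.dist` requires Python 3.8 or later. Tie-breaking between equally frequent labels is left unspecified here.

```python
import math
from collections import Counter

def knn_predict(train_points, train_labels, query, k):
    """Return the majority label among the k training points nearest to query."""
    # Pair each Euclidean distance with its label, then sort by distance.
    by_distance = sorted(
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    )
    # Count the labels of the k nearest points and take the most frequent one.
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]

# Hypothetical toy data: two points near the origin, two far away.
points = [[0.0, 0.0], [0.0, 1.0], [5.0, 5.0], [6.0, 5.0]]
labels = ["x", "x", "y", "y"]

near_origin = knn_predict(points, labels, [0.5, 0.5], 3)
near_far_cluster = knn_predict(points, labels, [5.5, 5.0], 3)
print(near_origin, near_far_cluster)
```

With k = 3, each query has two neighbors from its own cluster and one from the other, so the vote follows the nearby cluster.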
Implementation
# -*- coding: utf-8 -*-
import numpy as np

# Training set and class labels
def creatDataSet():
    group = np.array([[1.0, 2.0], [1.2, 0.1], [0.1, 1.4], [0.3, 3.5]])
    label = ['A', 'A', 'B', 'B']
    return group, label

# Input instance to classify and the value of k
def inputData():
    test_data = np.array([[1.1, 0.3]])
    K = 3
    return test_data, K

# Min-max normalization. The test point is scaled with the training set's
# per-feature minimum and range so that both live in the same feature space.
# New arrays are returned; the caller's data is not modified in place.
def Normalization(group, test):
    min_val = np.amin(group, axis=0)
    span = np.amax(group, axis=0) - min_val
    return (group - min_val) / span, (test - min_val) / span

# Euclidean distance from the input instance to every point in the training set
def Distance(group):
    test, K = inputData()
    norm_group, norm_test = Normalization(group, test)
    dis = np.sqrt(np.sum((norm_group - norm_test[0]) ** 2, axis=1))
    return dis, K, test

# Determine the class of the input instance
def Classify_By_KNN(group, label):
    dis, k, test = Distance(group)
    sortedDistIndex = np.argsort(dis)  # indices of training points, nearest first
    countLabel = {}
    for i in range(k):
        l = label[sortedDistIndex[i]]
        countLabel[l] = countLabel.get(l, 0) + 1
    countMax = max(countLabel.values())
    # Print every label that reaches the highest vote count
    for key, value in countLabel.items():
        if value == countMax:
            print("Test data:", test, " Classification result:", key)

g, l = creatDataSet()
Classify_By_KNN(g, l)
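Output: with K = 3, the test point [1.1, 0.3] is assigned to class A (two of its three nearest neighbors are labeled A).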
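One detail of the normalization step is worth spelling out: the query point should be scaled with the same per-feature minimum and range as the training set, otherwise distances mix two different scales. The following is a minimal stdlib-only sketch of that idea, using the training data from this post. The helper name `min_max_scale` is hypothetical, and the sketch assumes every feature has a non-zero range.

```python
def min_max_scale(train_rows, query):
    """Scale each feature to [0, 1] using the training set's min and range."""
    columns = list(zip(*train_rows))
    mins = [min(col) for col in columns]
    # Assumption: no feature is constant, so every span is greater than zero.
    spans = [max(col) - min(col) for col in columns]

    def scale(row):
        return [(v - lo) / s for v, lo, s in zip(row, mins, spans)]

    # The query reuses the training statistics instead of computing its own.
    return [scale(row) for row in train_rows], scale(query)

train = [[1.0, 2.0], [1.2, 0.1], [0.1, 1.4], [0.3, 3.5]]
scaled_train, scaled_query = min_max_scale(train, [1.1, 0.3])
print(scaled_train)
print(scaled_query)
```

After scaling, every training feature spans exactly [0, 1], so the second feature (raw range 3.4) no longer outweighs the first (raw range 1.1) in the Euclidean distance.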
Original article: https://www.cnblogs.com/bobomain/p/10730486.html