[Python-OpenCV] KNN English Letter Recognition

Posted: 2015-04-20 15:01:11

Feature Set Analysis

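The dataset is letter-recognition.data: 20,000 comma-separated records. A sample record is shown below; the first column is the letter label and the remaining 16 columns are features.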
T,2,8,3,5,1,8,13,0,6,6,10,8,0,8,0,8
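
To make the record layout concrete, here is a minimal sketch in plain Python that splits the sample line above into a numeric label and a feature list. The helper name `parse_record` is hypothetical and is not part of the original script; the label mapping ('A' to 0, 'B' to 1, and so on) mirrors the converter used in the code further down.

```python
def parse_record(line):
    """Split one comma-separated record into (label, features)."""
    fields = line.strip().split(',')
    # Map the letter in column 0 to a number: 'A' -> 0, ..., 'Z' -> 25.
    label = ord(fields[0]) - ord('A')
    # The remaining columns are the numeric features.
    features = [float(value) for value in fields[1:]]
    return label, features


label, features = parse_record("T,2,8,3,5,1,8,13,0,6,6,10,8,0,8,0,8")
print(label, len(features))
```

Mapping letters to numbers lets the whole table be stored as a single numeric array, which is the form the classifier is trained on.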

Approach

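1. Read the data and split each record on the comma delimiter.

2. Split the records into a training half and a test half; in each half, take the first column as the labels and the remaining columns as the features.

3. Initialize the classifier and train it on the training data.

4. Measure the accuracy on the test data.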
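
Steps 3 and 4 rest on the k-nearest-neighbours rule: a test row receives the label that is most common among the k training rows closest to it. The sketch below illustrates that rule in plain Python on toy 2-D data. It is only an illustration under stated assumptions, not how OpenCV implements the classifier: the function name `knn_predict`, the toy points, and the tie-breaking behaviour (the label seen first among the nearest rows wins) are all hypothetical.

```python
from collections import Counter


def knn_predict(train_features, train_labels, query, k=5):
    """Predict the label of `query` by majority vote among its k nearest rows."""
    # Squared Euclidean distance from the query to every training row.
    distances = [
        (sum((a - b) ** 2 for a, b in zip(row, query)), label)
        for row, label in zip(train_features, train_labels)
    ]
    # Keep the labels of the k closest rows.
    distances.sort(key=lambda pair: pair[0])
    nearest_labels = [label for _, label in distances[:k]]
    # Majority vote over the neighbours' labels.
    return Counter(nearest_labels).most_common(1)[0][0]


# Toy 2-D data for illustration only (the real rows have 16 features).
train_features = [[0, 0], [0, 1], [1, 0], [5, 5], [5, 6], [6, 5]]
train_labels = [0, 0, 0, 1, 1, 1]

print(knn_predict(train_features, train_labels, [0.5, 0.5], k=3))
print(knn_predict(train_features, train_labels, [5.5, 5.5], k=3))
```

"Training" a k-nearest-neighbours classifier amounts to storing the training rows; the distance computation happens at prediction time.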

Code

# Written for Python 2 and OpenCV 2.4.x. In OpenCV 3 and later the classifier
# lives in cv2.ml (cv2.ml.KNearest_create(), train(samples, cv2.ml.ROW_SAMPLE,
# responses), findNearest(samples, k)).
import cv2
import numpy as np

print 'load data'
# Column 0 holds the letter; map 'A'..'Z' to 0..25 so the whole table is numeric.
data = np.loadtxt('letter-recognition.data', dtype='float32', delimiter=',',
                  converters={0: lambda ch: ord(ch) - ord('A')})

print 'split as train,test'
# First 10000 rows for training, last 10000 rows for testing.
train, test = np.vsplit(data, 2)

print 'train.shape:\t', train.shape
print 'test.shape:\t', test.shape

print 'split train as the response,trainData'
# Column 0 is the label (response); columns 1..16 are the features.
response, trainData = np.hsplit(train, [1])
print 'response.shape:\t', response.shape
print 'trainData.shape:\t', trainData.shape

print 'split the test as response,trainData'
restest, testData = np.hsplit(test, [1])

print 'Init the knn'
knn = cv2.KNearest()
knn.train(trainData, response)

print 'test the knn'
# Classify every test row by majority vote among its 5 nearest training rows.
ret, result, neighbours, dist = knn.find_nearest(testData, 5)

print 'the rate:'
# Count predictions that match the true labels and convert to a percentage.
correct = np.count_nonzero(result == restest)
accuracy = correct * 100.0 / restest.shape[0]
print 'accuracy is', accuracy, '%'

Results

load data
split as train,test
train.shape:	(10000, 17)
test.shape:	(10000, 17)
split train as the response,trainData
response.shape:	(10000, 1)
trainData.shape:	(10000, 16)
split the test as response,trainData
Init the knn
test the knn
the rate:
accuracy is 93.22 %
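
The accuracy figure is the share of test rows whose predicted label equals the true label. A minimal sketch of that calculation in plain Python follows; the helper name `accuracy_percent` and the toy label lists are hypothetical and are not taken from the original script.

```python
def accuracy_percent(predicted, actual):
    """Return the percentage of positions where predicted == actual."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not actual:
        raise ValueError("cannot compute accuracy of an empty test set")
    correct = sum(1 for p, a in zip(predicted, actual) if p == a)
    return correct * 100.0 / len(actual)


# Toy labels for illustration only (0 = 'A', 1 = 'B', ...).
predicted = [19, 0, 2, 2, 7]
actual = [19, 0, 2, 3, 7]
print(accuracy_percent(predicted, actual))
```

By the same formula, the reported 93.22 % corresponds to 9322 correct predictions out of the 10000 test rows.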

Dataset

http://download.csdn.net/detail/licong_carp/8612383


Original article: http://blog.csdn.net/licong_carp/article/details/45149197
