
How the n_neighbors setting affects the prediction accuracy and generalization of a k-nearest neighbors classifier

Date: 2018-07-12 13:08:06


Code:

# -*- coding: utf-8 -*-
"""
Created on Thu Jul 12 09:36:49 2018

@author: zhen

Analyze how the size of n_neighbors affects the prediction accuracy and
generalization ability of the k-nearest neighbors algorithm.
"""
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
import matplotlib.pyplot as plt

cancer = load_breast_cancer()

# Stratified split so both sets keep the class proportions of the full data
x_train, x_test, y_train, y_test = train_test_split(
        cancer.data, cancer.target, stratify=cancer.target, random_state=66)

training_accuracy = []
test_accuracy = []

# Try n_neighbors from 1 to 10
neighbors_settings = range(1, 11)

for n_neighbors in neighbors_settings:
    # Build the model
    clf = KNeighborsClassifier(n_neighbors=n_neighbors)
    clf.fit(x_train, y_train)
    # Record the accuracy on the training set
    training_accuracy.append(clf.score(x_train, y_train))
    # Record the accuracy on the test set (generalization)
    test_accuracy.append(clf.score(x_test, y_test))

plt.plot(neighbors_settings, training_accuracy, label="training accuracy")
plt.plot(neighbors_settings, test_accuracy, label="test accuracy")

plt.xlabel("n_neighbors")
plt.ylabel("Accuracy")

plt.legend()
# Display the figure when the file is run as a plain script
plt.show()

Result:

[Figure: training accuracy and test accuracy plotted against n_neighbors]

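Summary: With a single neighbor, the predictions on the training set are essentially perfect (close to 100%), since each training point is its own nearest neighbor. As the number of neighbors grows, the model becomes simpler and the training accuracy falls. A simpler model is not automatically a better one: a model that is too complex (very small n_neighbors) tends to overfit, and one that is too simple (very large n_neighbors) tends to underfit, so test accuracy is usually best somewhere in between. For this data and split, the best balance between prediction accuracy and generalization is at around n_neighbors = 6.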
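The best setting can also be read from the recorded scores directly instead of from the plot. The following is a minimal sketch under the same assumptions as the listing above (same dataset, same split, same range of n_neighbors); the helper name accuracy_by_neighbors and the variables best_index and best_n_neighbors are introduced here for illustration and are not part of the original post.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier


def accuracy_by_neighbors(neighbors_settings, random_state=66):
    """Hypothetical helper: return (training_accuracy, test_accuracy) lists."""
    cancer = load_breast_cancer()
    x_train, x_test, y_train, y_test = train_test_split(
        cancer.data, cancer.target, stratify=cancer.target,
        random_state=random_state)
    training_accuracy, test_accuracy = [], []
    for n_neighbors in neighbors_settings:
        clf = KNeighborsClassifier(n_neighbors=n_neighbors)
        clf.fit(x_train, y_train)
        training_accuracy.append(clf.score(x_train, y_train))
        test_accuracy.append(clf.score(x_test, y_test))
    return training_accuracy, test_accuracy


neighbors_settings = list(range(1, 11))
training_accuracy, test_accuracy = accuracy_by_neighbors(neighbors_settings)

# Index of the highest test accuracy; on a tie, max() keeps the first
# (smallest) n_neighbors.
best_index = max(range(len(test_accuracy)), key=test_accuracy.__getitem__)
best_n_neighbors = neighbors_settings[best_index]

for k, train_acc, test_acc in zip(
        neighbors_settings, training_accuracy, test_accuracy):
    print(f"n_neighbors={k:2d}  train={train_acc:.3f}  test={test_acc:.3f}")
print("best n_neighbors by test accuracy:", best_n_neighbors)
```

Note that choosing n_neighbors by its test-set score makes that score an optimistic estimate; a separate validation split or cross-validation on the training data would give a cleaner comparison.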

 


Original source: https://www.cnblogs.com/yszd/p/9298214.html
