
Machine Learning: Decision Trees

Date: 2018-10-01 01:10:11



Decision Tree

Pre:

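As shown in the figure below, a decision tree is made up of decision blocks and terminating blocks. A terminating block means that a conclusion has been reached.

Compared with kNN, the advantage of a decision tree is that the form of its data is easy for people to understand.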

Background

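  1. Occam's razor: do not spend more to do what can be done equally well with less.
  2. Heuristics: techniques for finding a solution in a short time based on limited knowledge (incomplete information).
  3. ID3 algorithm (Iterative Dichotomiser 3): this algorithm builds on Occam's razor: a smaller decision tree is preferred over a larger one (the principle of simplicity).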

Tree construction

General approach to decision trees

  1. Collect: Any method.
  2. Prepare: This tree-building algorithm works only on nominal values, so any continuous values will need to be quantized (discretized).
  3. Analyze: Any method. You should visually inspect the tree after it is built.
  4. Train: Construct a tree data structure.
  5. Test: Calculate the error rate with the learned tree.
  6. Use: This can be used in any supervised learning task. Often, trees are used to better understand the data.

    (from Machine Learning in Action)
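The Prepare step above asks for continuous values to be quantized before the tree is built. A minimal sketch of one way to do that, using fixed bin thresholds, is shown here. The function name `quantize`, the age feature, the thresholds, and the labels are all hypothetical and chosen only for illustration; the source does not prescribe a particular binning scheme.

```python
from bisect import bisect_right


def quantize(value, thresholds, labels):
    """Map a continuous value to a nominal label using sorted bin thresholds."""
    if len(labels) != len(thresholds) + 1:
        raise ValueError("need exactly one more label than thresholds")
    # bisect_right gives the index of the bin the value falls into;
    # a value equal to a threshold is placed in the upper bin.
    return labels[bisect_right(thresholds, value)]


# Hypothetical continuous feature: age in years, cut at 18 and 65.
ages = [3.5, 18.0, 42.0, 64.9, 80.0]
age_groups = [quantize(a, [18, 65], ["minor", "adult", "senior"]) for a in ages]
print(age_groups)
```

After this step every value of the feature is one of a small set of labels, which is the nominal form the tree-building algorithm expects.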

Information Gain

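Information gain: the change in information between the state before and the state after a dataset is split.
The guiding principle for splitting a dataset is to make unorganized data more organized ("We chose to split our dataset in a way that makes our unorganized data more organized").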

1. Calculating information gain

Claude Shannon
Claude Shannon is considered one of the smartest people of the twentieth century. In William Poundstone's 2005 book Fortune's Formula, he wrote this of Claude Shannon: "There were many at Bell Labs and MIT who compared Shannon's insight to Einstein's. Others found that comparison unfair, unfair to Shannon."

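  1. Information: for a class x_i that occurs with probability p(x_i), the information is l(x_i) = -log2 p(x_i).
  2. Entropy: the expected value of the information over all classes, H = -sum over i of p(x_i) * log2 p(x_i).
  3. Information gain: the reduction in entropy obtained by splitting the dataset on a feature.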
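The entropy and information gain listed above can be sketched in a few lines of Python. This is an illustrative sketch, not the original author's code: the function names `shannon_entropy` and `information_gain`, the row layout (nominal feature values with the class label in the last position), and the toy dataset are assumptions made for the example.

```python
from collections import Counter
from math import log2


def shannon_entropy(labels):
    """H = -sum(p * log2(p)) over the relative frequencies of the labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((n / total) * log2(n / total) for n in counts.values())


def information_gain(rows, feature_index):
    """Entropy before the split minus the weighted entropy after it.

    Each row holds nominal feature values followed by the class label.
    """
    base = shannon_entropy([row[-1] for row in rows])
    # Group the class labels by the value of the chosen feature.
    subsets = {}
    for row in rows:
        subsets.setdefault(row[feature_index], []).append(row[-1])
    remainder = sum(
        len(subset) / len(rows) * shannon_entropy(subset)
        for subset in subsets.values()
    )
    return base - remainder


# Toy dataset (assumed for illustration): two binary features, then the label.
dataset = [
    [1, 1, "yes"],
    [1, 1, "yes"],
    [1, 0, "no"],
    [0, 1, "no"],
    [0, 1, "no"],
]

print(shannon_entropy([row[-1] for row in dataset]))  # about 0.971
print(information_gain(dataset, 0))                   # about 0.420
print(information_gain(dataset, 1))                   # about 0.171
```

On this toy data the first feature yields the larger gain, so it would be chosen as the first split: splitting on it leaves the data more organized than splitting on the second feature.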

Measuring consistency in a dataset

Using recursion to construct a decision tree
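A recursive, ID3-style construction might look like the sketch below. It assumes nominal features, rows stored as lists with the class label last, and a nested-dict representation of the tree; the helper names (`entropy`, `split`, `best_feature`, `build_tree`), the feature names, and the toy data are hypothetical and are not taken from the original post.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy of a list of class labels."""
    total = len(labels)
    return -sum(n / total * log2(n / total) for n in Counter(labels).values())


def split(rows, index, value):
    """Rows whose feature `index` equals `value`, with that column removed."""
    return [row[:index] + row[index + 1:] for row in rows if row[index] == value]


def best_feature(rows):
    """Index of the feature with the largest information gain."""
    base = entropy([row[-1] for row in rows])

    def gain(index):
        after = 0.0
        for value in {row[index] for row in rows}:
            subset = split(rows, index, value)
            after += len(subset) / len(rows) * entropy([r[-1] for r in subset])
        return base - after

    return max(range(len(rows[0]) - 1), key=gain)


def build_tree(rows, names):
    """Return a class label (leaf) or a nested dict {feature: {value: subtree}}."""
    labels = [row[-1] for row in rows]
    if len(set(labels)) == 1:   # all rows agree: terminating block
        return labels[0]
    if len(rows[0]) == 1:       # no features left: fall back to a majority vote
        return Counter(labels).most_common(1)[0][0]
    index = best_feature(rows)
    remaining = names[:index] + names[index + 1:]
    branches = {}
    for value in {row[index] for row in rows}:
        branches[value] = build_tree(split(rows, index, value), remaining)
    return {names[index]: branches}


# Toy dataset and feature names, assumed for illustration only.
dataset = [
    [1, 1, "yes"],
    [1, 1, "yes"],
    [1, 0, "no"],
    [0, 1, "no"],
    [0, 1, "no"],
]
tree = build_tree(dataset, ["no_surfacing", "flippers"])
print(tree)
```

The recursion stops in two cases: every row in a branch has the same class, or the features are exhausted, in which case the majority class is used as the leaf.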

Plotting trees in Matplotlib


Original article: https://www.cnblogs.com/Mr0wang/p/9733835.html
