Machine Learning: Decision Trees

Date: 2018-10-01 01:10:11


Decision Tree


Pre:

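As shown in the figure below, a decision tree consists of decision blocks and terminating blocks. A terminating block indicates that a conclusion has been reached.

Compared with kNN, the advantage of a decision tree is that its representation of the data is easy for people to understand.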

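Background

Occam's razor: do not spend more on something that can be done equally well with less.
Heuristics: techniques for finding a solution in a short time based on limited knowledge (incomplete information).
ID3 algorithm (Iterative Dichotomiser 3): built on Occam's razor, it prefers smaller decision trees over larger ones (the simpler theory wins).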

Tree construction

General approach to decision trees

Collect: Any method.
Prepare: This tree-building algorithm works only on nominal values, so any continuous values will need to be quantized (discretized).
Analyze: Any method. You should visually inspect the tree after it is built.
Train: Construct a tree data structure.
Test: Calculate the error rate with the learned tree.
Use: This can be used in any supervised learning task. Often, trees are used to better understand the data.

(Source: Machine Learning in Action)
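
The Prepare step above says continuous values must be quantized before the tree can use them. A minimal sketch of one way to do that, assuming fixed cut points chosen by hand; the function name `quantize`, the thresholds, and the labels are hypothetical and only for illustration.

```python
from bisect import bisect_right


def quantize(value, thresholds, labels):
    """Map a continuous value to a nominal label.

    thresholds must be sorted ascending; labels needs one more entry
    than thresholds (one label per interval). Both are assumptions
    made for this sketch, not part of the original post.
    """
    if len(labels) != len(thresholds) + 1:
        raise ValueError("need exactly one more label than thresholds")
    # bisect_right returns the index of the interval the value falls in;
    # a value equal to a threshold goes to the upper interval.
    return labels[bisect_right(thresholds, value)]


# Hypothetical example: turn ages into three nominal categories.
ages = [3.5, 17.0, 42.1, 68.9]
binned = [quantize(a, [18, 65], ["minor", "adult", "senior"]) for a in ages]
```

Hand-picked thresholds are the simplest choice; equal-width or equal-frequency binning are common alternatives when no natural cut points exist.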

Information Gain

Information gain: the change in information before and after splitting the dataset.
The guiding principle for splitting a dataset: we choose to split it in a way that makes our unorganized data more organized.

1. Computing information gain

Claude Shannon
Claude Shannon is considered one of the smartest people of the twentieth century. In William Poundstone’s 2005 book Fortune’s Formula, he wrote this of Claude Shannon: “There were many at Bell Labs and MIT who compared Shannon’s insight to Einstein’s. Others found that comparison unfair—unfair to Shannon.”

Information:

Entropy:

Information gain:
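
For reference, the three quantities named above are usually defined as follows. These are the standard textbook definitions used with ID3, written here as a sketch; the symbols $D$ (the dataset), $A$ (the feature being split on), and $D_v$ (the subset of $D$ where $A$ takes value $v$) are notation chosen for this note.

```latex
% Information carried by a class label x_i that occurs with probability p(x_i)
l(x_i) = -\log_2 p(x_i)

% Entropy: the expected information over all n class labels
H(D) = -\sum_{i=1}^{n} p(x_i) \log_2 p(x_i)

% Information gain of splitting D on feature A:
% entropy before the split minus the weighted entropy after it
\mathrm{Gain}(D, A) = H(D) - \sum_{v \in \mathrm{values}(A)} \frac{|D_v|}{|D|} \, H(D_v)
```

Entropy is 0 when every row has the same class and grows as the classes become more evenly mixed, so a larger gain means the split left the data more organized.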

Measuring consistency in a dataset
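
One common way to measure consistency is the Shannon entropy of the class labels, and the information gain of a feature is the drop in entropy after splitting on it. The following is a minimal sketch, assuming each row is a list whose last element is the class label; the function names and the five-row toy dataset are illustrative assumptions.

```python
from collections import Counter
from math import log2


def shannon_entropy(dataset):
    """Entropy (in bits) of the class labels stored in each row's last column."""
    total = len(dataset)
    counts = Counter(row[-1] for row in dataset)
    return -sum((c / total) * log2(c / total) for c in counts.values())


def split_dataset(dataset, axis, value):
    """Rows whose feature `axis` equals `value`, with that feature removed."""
    return [row[:axis] + row[axis + 1:] for row in dataset if row[axis] == value]


def information_gain(dataset, axis):
    """Entropy before the split minus the weighted entropy after it."""
    total = len(dataset)
    remainder = 0.0
    for value in set(row[axis] for row in dataset):
        subset = split_dataset(dataset, axis, value)
        remainder += len(subset) / total * shannon_entropy(subset)
    return shannon_entropy(dataset) - remainder


# Toy dataset (assumed for illustration): two binary features, then the label.
toy = [
    [1, 1, "yes"],
    [1, 1, "yes"],
    [1, 0, "no"],
    [0, 1, "no"],
    [0, 1, "no"],
]
base = shannon_entropy(toy)        # about 0.971 bits
gain0 = information_gain(toy, 0)   # about 0.420
gain1 = information_gain(toy, 1)   # about 0.171
```

On this toy data feature 0 has the larger gain, so it is the better first split.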

Using recursion to construct a decision tree
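
The recursive construction can be sketched as follows: pick the feature with the largest information gain, split on it, and recurse on each subset until all rows share one class or no features remain. This is a minimal ID3-style sketch, not the original author's code; it represents the tree as nested dicts, and all function names, feature names, and the toy data are assumptions for illustration.

```python
from collections import Counter
from math import log2


def entropy(rows):
    """Shannon entropy of the class labels (last column of each row)."""
    total = len(rows)
    counts = Counter(row[-1] for row in rows)
    return -sum(c / total * log2(c / total) for c in counts.values())


def split(rows, axis, value):
    """Rows where feature `axis` equals `value`, with that feature removed."""
    return [row[:axis] + row[axis + 1:] for row in rows if row[axis] == value]


def best_feature(rows):
    """Index of the feature with the largest information gain."""
    def gain(axis):
        values = set(row[axis] for row in rows)
        remainder = sum(
            len(sub) / len(rows) * entropy(sub)
            for sub in (split(rows, axis, v) for v in values)
        )
        return entropy(rows) - remainder
    return max(range(len(rows[0]) - 1), key=gain)


def build_tree(rows, feature_names):
    """Return a class label (leaf) or {feature: {value: subtree}}."""
    labels = [row[-1] for row in rows]
    # Stop condition 1: every row has the same class.
    if len(set(labels)) == 1:
        return labels[0]
    # Stop condition 2: no features left, so fall back to a majority vote.
    if len(rows[0]) == 1:
        return Counter(labels).most_common(1)[0][0]
    axis = best_feature(rows)
    remaining = feature_names[:axis] + feature_names[axis + 1:]
    return {
        feature_names[axis]: {
            value: build_tree(split(rows, axis, value), remaining)
            for value in set(row[axis] for row in rows)
        }
    }


# Toy data (assumed): two binary features followed by the class label.
toy = [[1, 1, "yes"], [1, 1, "yes"], [1, 0, "no"], [0, 1, "no"], [0, 1, "no"]]
tree = build_tree(toy, ["no surfacing", "flippers"])
```

Nested dicts keep the sketch short and easy to print; a dedicated node class would be a reasonable alternative for larger trees.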

Plotting trees in Matplotlib


Original post: https://www.cnblogs.com/Mr0wang/p/9733835.html
