Decision Trees

Category: Machine Learning. Tags: machine learning, pattern recognition.


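This article describes how a decision tree classifier is designed, including how to pick the root node and the subsequent branches. The focus is on using information gain to select the best attribute at each step, so that the data set is split effectively and the resulting decision tree is as small and as accurate as possible.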

Design Decision Tree Classifier

-Picking the root node

-Recursively branching

Picking the root node

-The goal is to make the resulting decision tree as small as possible.

-The main decision in the algorithm is the selection of the next attribute to condition on (start from the root node).

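-We want attributes that split the examples into sets that are relatively pure in one label; this way we are closer to a leaf node. In other words, each child node should contain, as far as possible, samples of a single class. A node whose samples all belong to the same class is a leaf node and is not split any further.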

-The most popular heuristic is based on information gain, which originated with Quinlan's ID3 system. At each node, the attribute with the largest information gain is chosen for the split.

Entropy

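-Entropy measures the impurity of a set of examples S: it is 0 when every example in S has the same label and largest when the labels are evenly mixed.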
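The formula itself did not survive in this page. Assuming the usual Shannon definition, Entropy(S) = -Σ_i p_i log2 p_i, where p_i is the fraction of examples in S that belong to class i, a minimal sketch could look like the following. The function name `entropy` and the toy label lists are illustrative, not taken from the original.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy, in bits, of a sequence of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0  # assumption: treat an empty set as having zero entropy
    counts = Counter(labels)
    # p * log2(1/p) summed over the classes that actually occur;
    # writing it this way avoids a negative zero for pure sets.
    return sum((c / total) * math.log2(total / c) for c in counts.values())

pure = ["yes", "yes", "yes", "yes"]   # one label only
mixed = ["yes", "yes", "no", "no"]    # two labels, evenly mixed

print(entropy(pure))   # 0.0
print(entropy(mixed))  # 1.0
```

With two classes the value ranges from 0 (pure) to 1 bit (a 50/50 mix), which is why a split that produces low-entropy children moves the tree closer to leaf nodes.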

Information Gain

-Gain (S, A) = expected reduction in entropy due to sorting on A

-Values(A) is the set of all possible values of attribute A, Sv is the subset of S for which attribute A has value v, and |S| and |Sv| are the numbers of samples in S and Sv respectively.

-Gain(S,A) is the expected reduction in entropy caused by knowing the value of attribute A.
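The definitions above match the standard ID3 formula, which is assumed here because the equation is missing from this page: Gain(S, A) = Entropy(S) - Σ_{v in Values(A)} (|Sv| / |S|) · Entropy(Sv). The sketch below computes it for rows stored as dictionaries; `information_gain`, the attribute names, and the four-row toy data set are all hypothetical.

```python
import math
from collections import Counter, defaultdict

def entropy(labels):
    """Shannon entropy, in bits, of a non-empty sequence of labels."""
    total = len(labels)
    return sum((c / total) * math.log2(total / c) for c in Counter(labels).values())

def information_gain(rows, attribute, target):
    """Entropy(S) minus the size-weighted entropy of each subset Sv."""
    subsets = defaultdict(list)            # value v -> labels of Sv
    for row in rows:
        subsets[row[attribute]].append(row[target])
    all_labels = [row[target] for row in rows]
    remainder = sum(len(s) / len(rows) * entropy(s) for s in subsets.values())
    return entropy(all_labels) - remainder

# Toy data: "windy" separates the labels perfectly, "humid" not at all.
rows = [
    {"windy": "no",  "humid": "high", "play": "yes"},
    {"windy": "no",  "humid": "low",  "play": "yes"},
    {"windy": "yes", "humid": "high", "play": "no"},
    {"windy": "yes", "humid": "low",  "play": "no"},
]

print(information_gain(rows, "windy", "play"))  # 1.0
print(information_gain(rows, "humid", "play"))  # 0.0
```

Splitting on `windy` removes all uncertainty about the label, so its gain equals the full entropy of the set; `humid` leaves both subsets as mixed as before, so its gain is zero.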

Example

Play Tennis Example
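The table for this example is not preserved in this page. The sketch below assumes it is the widely used 14-day PlayTennis data set (9 "Yes" and 5 "No" days) that usually accompanies ID3. It picks the root by information gain and then branches recursively until every node is pure. The helper names `entropy`, `gain`, and `id3` are illustrative.

```python
import math
from collections import Counter

# Assumed data: the classic 14-day PlayTennis table, one day per line.
COLUMNS = ("Outlook", "Temperature", "Humidity", "Wind", "PlayTennis")
RAW = """
Sunny Hot High Weak No
Sunny Hot High Strong No
Overcast Hot High Weak Yes
Rain Mild High Weak Yes
Rain Cool Normal Weak Yes
Rain Cool Normal Strong No
Overcast Cool Normal Strong Yes
Sunny Mild High Weak No
Sunny Cool Normal Weak Yes
Rain Mild Normal Weak Yes
Sunny Mild Normal Strong Yes
Overcast Mild High Strong Yes
Overcast Hot Normal Weak Yes
Rain Mild High Strong No
"""
DATA = [dict(zip(COLUMNS, line.split())) for line in RAW.strip().splitlines()]
ATTRS = ["Outlook", "Temperature", "Humidity", "Wind"]

def entropy(labels):
    total = len(labels)
    return sum(c / total * math.log2(total / c) for c in Counter(labels).values())

def gain(rows, attr, target="PlayTennis"):
    remainder = 0.0
    for value in {r[attr] for r in rows}:
        subset = [r[target] for r in rows if r[attr] == value]
        remainder += len(subset) / len(rows) * entropy(subset)
    return entropy([r[target] for r in rows]) - remainder

def id3(rows, attrs, target="PlayTennis"):
    labels = [r[target] for r in rows]
    if len(set(labels)) == 1:          # pure node: make it a leaf
        return labels[0]
    if not attrs:                      # no attribute left: majority label
        return Counter(labels).most_common(1)[0][0]
    best = max(attrs, key=lambda a: gain(rows, a, target))
    rest = [a for a in attrs if a != best]
    return {best: {v: id3([r for r in rows if r[best] == v], rest, target)
                   for v in sorted({r[best] for r in rows})}}

for a in ATTRS:
    print(f"{a:12s} gain = {gain(DATA, a):.3f}")
tree = id3(DATA, ATTRS)
print(tree)
```

On this data Outlook has the largest gain (about 0.247), so it becomes the root. The Overcast branch is already pure, while the Sunny and Rain branches are split once more, on Humidity and Wind respectively.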
