Hands-On Machine Learning with Scikit-Learn & TensorFlow Exercise Q&A Chapter06

This post is a Q&A collection on Decision Trees from Hands-On Machine Learning, covering: the relationship between a Decision Tree's depth and the number of training instances, comparing a node's Gini impurity with its parent's, strategies for dealing with overfitting and underfitting, estimating training time on large datasets, and the effect of presorting on training speed. It closes with training and fine-tuning a Decision Tree on the moons dataset and growing a forest.

Q1. What is the approximate depth of a Decision Tree trained (without restrictions) on a training set with 1 million instances?

A1: The depth of a well-balanced binary tree containing m leaves is log_{2}(m). An unrestricted tree grows until every leaf is pure (often one instance per leaf), so with 1 million instances the approximate depth is log_{2}(10^{6})\approx20. In practice the tree is rarely perfectly balanced, so the actual depth is usually a bit larger.
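This is easy to sanity-check empirically. Below is a minimal sketch (the synthetic dataset and its size are illustrative assumptions, not from the book) that trains an unrestricted DecisionTreeClassifier and compares its actual depth against log_{2}(m):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

# log2(10^6) ~= 19.93, hence the "approximately 20" answer
print(np.log2(1e6))  # 19.93...

# Train an unrestricted tree on a smaller synthetic set for speed
m = 10_000
X, y = make_classification(n_samples=m, n_features=20, random_state=42)
tree = DecisionTreeClassifier(random_state=42)  # no max_depth restriction
tree.fit(X, y)

# A perfectly balanced tree would have depth ~log2(m) ~= 13.3;
# real trees are rarely balanced, so the actual depth is usually larger.
print(tree.get_depth(), np.log2(m))
```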

 

Q2. Is a node's Gini impurity generally lower or greater than its parent's? Is it generally lower/greater, or always lower/greater?

A2: Generally lower, but not always. CART minimizes the *weighted* sum of the children's impurities, so one child can end up with a higher Gini impurity than its parent, as long as the other child more than compensates. Consider a parent node with instances A, B, A, A, A: its Gini impurity is 1 - (4/5)^{2} - (1/5)^{2} = 0.32. Split it into one child containing A, B (Gini = 1 - (1/2)^{2} - (1/2)^{2} = 0.5, higher than the parent) and another containing A, A, A (Gini = 0, pure). The weighted sum is (2/5) \times 0.5 + (3/5) \times 0 = 0.2, which is lower than 0.32, so CART happily makes this split.
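The arithmetic is easy to verify with a few lines of Python. This sketch recomputes the impurities above (the `gini` helper is written here for illustration; it is not a scikit-learn function):

```python
from collections import Counter

def gini(labels):
    """Gini impurity: 1 minus the sum of squared class proportions."""
    m = len(labels)
    return 1.0 - sum((count / m) ** 2 for count in Counter(labels).values())

parent = ["A", "B", "A", "A", "A"]
left, right = ["A", "B"], ["A", "A", "A"]

print(gini(parent))  # 0.32
print(gini(left))    # 0.50 -- higher than the parent's impurity
print(gini(right))   # 0.00 -- pure

# CART minimizes the size-weighted sum, which does decrease:
weighted = (len(left) / len(parent)) * gini(left) \
         + (len(right) / len(parent)) * gini(right)
print(weighted)      # 0.20 < 0.32
```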
