What are the advantages of logistic regression over decision trees?FAQ

本文探讨了决策树和逻辑回归两种机器学习算法在不同场景下的优劣,重点关注它们在处理决策边界、复杂函数拟合、过拟合风险以及简单性方面的区别。决策树适合数据特征边界平行于坐标轴的情况,而逻辑回归则能更灵活地捕捉非平行边界,避免过拟合,且在简单性上表现优异。

What are the advantages of logistic regression over decision trees?FAQ

The answer to "Should I ever use learning algorithm (a) over learning algorithm (b)" will pretty much always be yes. Different learning algorithms make different assumptions about the data and have different rates of convergence. The one which works best, i.e. minimizes some cost function of interest (cross validation for example) will be the one that makes assumptions that are consistent with the data and has sufficiently converged to its error rate.

Put in the context of decision trees vs. logistic regression, what are the assumptions made?

Decision trees assume that our decision boundaries are parallel to the axes, for example if we have two features (x1, x2) then it can only create rules such as x1>=4.5, x2>=6.5 etc. which we can visualize as lines parallel to the axis. We see this in practice in the diagram below.

So decision trees chop up the feature space into rectangles (or in higher dimensions, hyper-rectangles). There can be many partitions made and so decision trees naturally scale up to creating more complex (say, higher VC) functions - which can be a problem with over-fitting.

What assumptions does logistic regression make? Despite the probabilistic framework of logistic regression, all that logistic regression assumes is that there is one smooth linear decision boundary. It finds that linear decision boundary by making assumptions that the P(Y|X) of some form, like the inverse logit function applied to a weighted sum of our features. Then it finds the weights by a maximum likelihood approach. 

However people get too caught up on that... The decision boundary it creates is a linear* decision boundary that can be of any direction. So if you have data where the decision boundary is not parallel to the axes,

then logistic regression picks it out pretty well, whereas a decision tree will have problems.

So in conclusion,

  • Both algorithms are really fast. There isn't much to distinguish them in terms of run-time.
  • Logistic regression will work better if there's a single decision boundary, not necessarily parallel to the axis.
  • Decision trees can be applied to situations where there's not just one underlying decision boundary, but many, and will work best if the class labels roughly lie in hyper-rectangular regions.
  • Logistic regression is intrinsically simple, it has low variance and so is less prone to over-fitting. Decision trees can be scaled up to be very complex, are are more liable to over-fit. Pruning is applied to avoid this.


Maybe you'll be left thinking, "I wish decision trees didn't have to create rules that are parallel to the axis." This motivates support vector machines.

Footnotes:
* linear in your covariates. If you include non-linear transformations or interactions then it will be non-linear in the space of those original covariates.

转载于:https://www.cnblogs.com/yymn/p/4662396.html

评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值