Andrew Ng Machine Learning Lecture Notes, Chapter 8: Regularization

This chapter discusses the problem of overfitting in machine learning and introduces ways to address it: reducing the number of features, model selection algorithms, and regularization. It then covers how regularization is applied to linear regression and logistic regression, including the cost function, gradient descent, and the normal equation.

The problem of overfitting

If we have too many features, the learned hypothesis may fit the training set very well, but fail to generalize to new examples.

  • Underfitting => high bias
  • Overfitting => high variance

Methods for addressing overfitting

  • Reduce number of features
    Manually choose which features to keep, which means dropping some information along the way.
    Use a model selection algorithm.
  • Regularization
    Keep all the features, but reduce the magnitude/values of the parameters $\theta_j$.
    Works well when we have lots of features, each of which contributes a bit to predicting y.

Cost function

Intuition

Small values for the parameters $\theta_0, \theta_1, \dots, \theta_n$

  • “Simpler” hypothesis
  • Less prone to overfitting
  • We cannot know in advance which parameters to shrink, so we ask for every parameter to be small.

Cost function

$$J(\theta) = \frac{1}{2m}\left[\sum_{i=1}^{m}\left(h_\theta(x^{(i)}) - y^{(i)}\right)^2 + \lambda\sum_{j=1}^{n}\theta_j^2\right]$$
NOTE:
We do not penalize $\theta_0$; the regularization sum starts at $j = 1$.
$\lambda$ is the regularization parameter. It controls the trade-off between fitting the training data well and keeping the parameters small, and needs to be chosen well.
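
As a concrete illustration, here is a minimal NumPy sketch of this regularized cost function; the function and variable names are my own, not from the lecture:

```python
import numpy as np

def regularized_cost(theta, X, y, lam):
    """Regularized linear-regression cost J(theta).

    X     : (m, n+1) design matrix, first column all ones
    y     : (m,) target vector
    theta : (n+1,) parameter vector, theta[0] is the intercept
    lam   : regularization parameter lambda
    """
    m = len(y)
    errors = X @ theta - y                               # h_theta(x^(i)) - y^(i)
    fit_term = (errors @ errors) / (2 * m)               # squared-error term
    reg_term = lam * (theta[1:] @ theta[1:]) / (2 * m)   # excludes theta_0
    return fit_term + reg_term
```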

Regularized linear regression

Gradient descent

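The original slide image did not survive extraction. For reference, the standard gradient descent update for regularized linear regression (repeat until convergence) is:

$$\theta_0 := \theta_0 - \alpha\,\frac{1}{m}\sum_{i=1}^{m}\left(h_\theta(x^{(i)}) - y^{(i)}\right)x_0^{(i)}$$

$$\theta_j := \theta_j - \alpha\left[\frac{1}{m}\sum_{i=1}^{m}\left(h_\theta(x^{(i)}) - y^{(i)}\right)x_j^{(i)} + \frac{\lambda}{m}\theta_j\right] \qquad (j = 1, \dots, n)$$

The second update can be rewritten as $\theta_j := \theta_j\left(1 - \alpha\frac{\lambda}{m}\right) - \alpha\frac{1}{m}\sum_{i=1}^{m}\left(h_\theta(x^{(i)}) - y^{(i)}\right)x_j^{(i)}$: each iteration first shrinks $\theta_j$ by a factor slightly less than 1, then applies the usual gradient step.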

Normal equation

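That slide is also missing; the regularized normal equation has the form:

$$\theta = \left(X^TX + \lambda\begin{bmatrix}0 & & & \\ & 1 & & \\ & & \ddots & \\ & & & 1\end{bmatrix}\right)^{-1}X^Ty$$

where the matrix multiplying $\lambda$ is the $(n+1)\times(n+1)$ identity with its top-left entry set to 0, so that $\theta_0$ is not regularized. As long as $\lambda > 0$, the matrix being inverted is invertible even when $X^TX$ is singular (e.g. when $m \le n$).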

Regularized logistic regression

Cost function

$$J(\theta) = -\frac{1}{m}\sum_{i=1}^{m}\left[y^{(i)}\log h_\theta(x^{(i)}) + \left(1 - y^{(i)}\right)\log\left(1 - h_\theta(x^{(i)})\right)\right] + \frac{\lambda}{2m}\sum_{j=1}^{n}\theta_j^2$$

Gradient descent

The update rule has the same form as for regularized linear regression; the difference is that $h_\theta(x)$ is now the sigmoid hypothesis $h_\theta(x) = \frac{1}{1 + e^{-\theta^Tx}}$.
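
For completeness, a minimal NumPy sketch of the regularized logistic-regression cost and gradient; again, the names are my own:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def logistic_cost_grad(theta, X, y, lam):
    """Regularized logistic-regression cost and gradient.

    X : (m, n+1) design matrix with a leading column of ones
    y : (m,) labels in {0, 1}
    """
    m = len(y)
    h = sigmoid(X @ theta)
    cost = -(y @ np.log(h) + (1 - y) @ np.log(1 - h)) / m
    cost += lam * (theta[1:] @ theta[1:]) / (2 * m)   # do not penalize theta_0
    grad = X.T @ (h - y) / m
    grad[1:] += (lam / m) * theta[1:]                 # regularize all but theta_0
    return cost, grad
```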
