Machine Learning Notes (8): Evaluating a Learning Algorithm (Fixing High Bias / High Variance)

This post is about diagnosing learning algorithms. It covers how to split the dataset and compute errors when choosing a hypothesis model, how to diagnose high variance and high bias (including reading learning curves and deciding whether adding training data will help), the steps for automatically selecting the regularization parameter lambda, and finally a summary of which fix addresses which problem, plus how to choose the parameters and number of layers of a neural network.


Choosing the degree of polynomial for the hypothesis model:

In order to choose the model of your hypothesis, you can test each degree of polynomial and look at the error result:

One way to break down our dataset into the three sets is:

  • Training set: 60%
  • Cross validation set: 20%
  • Test set: 20%

We can now calculate three separate error values for the three different sets using the following method:

  1. Optimize the parameters in Θ using the training set for each polynomial degree.
  2. Find the polynomial degree d with the least error using the cross validation set.
  3. Estimate the generalization error using the test set with Jtest(Θ(d)), where d is the degree of the polynomial with the lowest cross-validation error.

This way, the degree of the polynomial d has not been trained using the test set.

Note: Jtrain, Jcv, and Jtest are all computed with lambda = 0, i.e. without the regularization term.
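
A minimal sketch of this three-step procedure in Python; the synthetic data, the NumPy polynomial fit, and the degree range 1 to 10 are assumptions for illustration, not part of the original notes:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic 1-D regression data (assumption: any (x, y) dataset works here).
x = rng.uniform(-3, 3, 200)
y = np.sin(x) + rng.normal(scale=0.2, size=x.shape)

# 60% training / 20% cross-validation / 20% test split.
idx = rng.permutation(len(x))
tr, cv, te = idx[:120], idx[120:160], idx[160:]

def mse(theta, xs, ys):
    # Squared error of the fitted polynomial, with no regularization term.
    return np.mean((np.polyval(theta, xs) - ys) ** 2)

# 1. Optimize Θ on the training set for each polynomial degree d.
thetas = {d: np.polyfit(x[tr], y[tr], d) for d in range(1, 11)}

# 2. Pick the degree d with the lowest cross-validation error.
best_d = min(thetas, key=lambda d: mse(thetas[d], x[cv], y[cv]))

# 3. Estimate the generalization error on the test set.
print("best degree:", best_d, "Jtest:", mse(thetas[best_d], x[te], y[te]))
```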

 

Diagnosing high variance and high bias:

High bias is underfitting and high variance is overfitting. Ideally, we need to find a golden mean between these two.

High bias (underfitting): both Jtrain(Θ) and JCV(Θ) will be high. Also, JCV(Θ)≈Jtrain(Θ).

High variance (overfitting): Jtrain(Θ) will be low and JCV(Θ) will be much greater than Jtrain(Θ).

This is summarized in the figure below (not reproduced here): Jtrain(Θ) falls steadily as model complexity grows, while JCV(Θ) first falls and then rises again; high bias sits on the left of that curve and high variance on the right.
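
As a rough mechanical check, here is a tiny helper that applies these two rules; the thresholds are invented assumptions for illustration, and in practice you would read the diagnosis off the plotted curves:

```python
def diagnose(j_train, j_cv, target_error):
    # High bias: both errors are high and close to each other.
    if j_train > target_error and abs(j_cv - j_train) < 0.1 * j_train:
        return "high bias (underfitting)"
    # High variance: low training error, large gap up to the CV error.
    if j_train <= target_error and j_cv > 2 * j_train:
        return "high variance (overfitting)"
    return "roughly balanced"
```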

 

Diagnosing high variance and high bias with learning curves

Experiencing high bias:

Low training set size: causes Jtrain(Θ) to be low and JCV(Θ) to be high.

Large training set size: causes both Jtrain(Θ) and JCV(Θ) to be high with Jtrain(Θ)≈JCV(Θ).

If a learning algorithm is suffering from high bias, getting more training data will not (by itself) help much.

Experiencing high variance:

Low training set size: Jtrain(Θ) will be low and JCV(Θ) will be high.

Large training set size: Jtrain(Θ) increases with training set size and JCV(Θ) continues to decrease without leveling off. Also, Jtrain(Θ) < JCV(Θ) but the difference between them remains significant.

If a learning algorithm is suffering from high variance, getting more training data is likely to help.

The learning curves also show whether it is worth collecting more training examples.
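
A sketch of how such learning curves might be computed, assuming the same kind of synthetic polynomial-regression setup as above (the fixed degree of 2 is deliberately simple, so the curves take the high-bias shape):

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.uniform(-3, 3, 300)
y = np.sin(x) + rng.normal(scale=0.2, size=x.shape)

# Hold out a fixed cross-validation set; train on growing prefixes.
x_tr, y_tr, x_cv, y_cv = x[:200], y[:200], x[200:], y[200:]
degree = 2

def mse(theta, xs, ys):
    return np.mean((np.polyval(theta, xs) - ys) ** 2)

for m in range(10, 201, 10):
    theta = np.polyfit(x_tr[:m], y_tr[:m], degree)
    # Jtrain is measured on the m examples actually used for fitting;
    # Jcv is always measured on the full cross-validation set.
    print(m, mse(theta, x_tr[:m], y_tr[:m]), mse(theta, x_cv, y_cv))
```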

 

Automatically selecting the regularization parameter lambda:

As λ increases, our fit becomes more rigid. On the other hand, as λ approaches 0, we tend to overfit the data. So how do we choose the parameter λ to get it 'just right'? In order to choose the model and the regularization term λ, we need to walk through the steps below (sketched in code after the list):

 

  1. Create a list of lambdas (i.e. λ∈{0,0.01,0.02,0.04,0.08,0.16,0.32,0.64,1.28,2.56,5.12,10.24});
  2. Create a set of models with different degrees or any other variants.
  3. Iterate through the λs, and for each λ go through all the models to learn some Θ.
  4. Compute the cross-validation error JCV(Θ) using the learned Θ (computed with λ), but without regularization, i.e. with λ = 0.
  5. Select the best combo that produces the lowest error on the cross validation set.
  6. Using the best combo Θ and λ, apply it on Jtest(Θ) to see if it has a good generalization of the problem.
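
Here is a minimal sketch of all six steps, assuming ridge-regularized polynomial regression with a closed-form solve; the synthetic data and the degree grid are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(2)
x = rng.uniform(-3, 3, 200)
y = np.sin(x) + rng.normal(scale=0.2, size=x.shape)
idx = rng.permutation(len(x))
tr, cv, te = idx[:120], idx[120:160], idx[160:]

def design(xs, d):
    # Polynomial feature matrix [1, x, x^2, ..., x^d].
    return np.vander(xs, d + 1, increasing=True)

def fit(xs, ys, d, lam):
    # Closed-form ridge solution; the bias column is not regularized.
    X = design(xs, d)
    R = lam * np.eye(X.shape[1])
    R[0, 0] = 0
    return np.linalg.solve(X.T @ X + R, X.T @ ys)

def j(theta, xs, ys, d):
    # Unregularized squared error, i.e. evaluated with λ = 0.
    return np.mean((design(xs, d) @ theta - ys) ** 2)

# Steps 1-2: the list of lambdas and the set of candidate models.
lambdas = [0, 0.01, 0.02, 0.04, 0.08, 0.16, 0.32, 0.64, 1.28, 2.56, 5.12, 10.24]
degrees = range(1, 11)

# Steps 3-5: train every (d, λ) combo, score each on the CV set with λ = 0.
best = min(((d, lam) for d in degrees for lam in lambdas),
           key=lambda c: j(fit(x[tr], y[tr], *c), x[cv], y[cv], c[0]))

# Step 6: check generalization of the winning combo on the test set.
d, lam = best
print("best combo:", best, "Jtest:", j(fit(x[tr], y[tr], d, lam), x[te], y[te], d))
```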

Summary:

Our decision process can be broken down as follows:

  • Getting more training examples: Fixes high variance
  • Trying smaller sets of features: Fixes high variance
  • Adding features: Fixes high bias
  • Adding polynomial features: Fixes high bias
  • Decreasing λ: Fixes high bias
  • Increasing λ: Fixes high variance

 

Diagnosing Neural Networks

  • A neural network with fewer parameters is prone to underfitting. It is also computationally cheaper.
  • A large neural network with more parameters is prone to overfitting. It is also computationally expensive. In this case you can use regularization (increase λ) to address the overfitting.

Using a single hidden layer is a good starting default. You can train your neural network on a number of hidden layers using your cross validation set. You can then select the one that performs best.
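As an illustration of that selection process, here is a sketch that compares one, two, and three hidden layers by cross-validation score, using scikit-learn's MLPClassifier; the digits dataset, the layer width of 32, and the alpha value (scikit-learn's regularization strength, playing the role of λ) are assumptions:

```python
import numpy as np
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

X, y = load_digits(return_X_y=True)
X_tr, X_cv, y_tr, y_cv = train_test_split(X, y, test_size=0.25, random_state=0)

# Candidate architectures: one, two, or three hidden layers of 32 units each.
for layers in [(32,), (32, 32), (32, 32, 32)]:
    net = MLPClassifier(hidden_layer_sizes=layers, alpha=1e-4,
                        max_iter=500, random_state=0).fit(X_tr, y_tr)
    print(layers, "CV accuracy:", net.score(X_cv, y_cv))
```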
