Topics in this lecture
1. Bias/variance
2. Empirical risk minimization (ERM)
3. The union bound lemma / Hoeffding's inequality
4. Uniform convergence
Opening remarks from Andrew Ng [paraphrased]: "To me, what really separates the people who really understand and really get machine learning from the people who maybe read the textbook and work through the math is what you do next. When you apply a support vector machine and it doesn't quite do what you wanted, do you understand enough about SVMs to know what to do next and how to modify the algorithm? To me, that's often what separates the great people in machine learning from the people who read the textbook, work through the math, and have just understood that." I hope to measure myself against this standard from now on!
Bias/variance tradeoff
Consider the linear regression example again. Given a set of samples, we fit different models using different feature dimensions, as shown in the figure below:
The leftmost plot fits a first-degree (linear) function, which, as introduced earlier, underfits the data. The rightmost fits a fifth-degree polynomial, which overfits. Both models predict poorly on points outside the training sample; their generalization error is large. Here, generalization error refers to how often a hypothesis predicts incorrectly on samples, including both the training samples and samples outside the training set. In the underfitting case, we say the algorithm has high bias (and low variance), because it clearly fails to capture the structure present in the data. In the overfitting case, we say the algorithm has high variance (and low bias), because it fits spurious patterns specific to the training set. A sensible approach is to trade off between the two extremes, i.e., the middle plot, fitting a second-degree (quadratic) function. (This is not a formal definition of bias and variance; it is only meant to build rough intuition.)
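The underfitting/overfitting picture above can be reproduced numerically. The sketch below is an assumption-laden illustration (the quadratic ground-truth function, noise level, sample sizes, and random seed are all made up for demonstration): it fits polynomials of degree 1, 2, and 5 to a small noisy training set and compares training error against error on held-out test points.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical ground truth: a quadratic function (an assumption for illustration).
def true_fn(x):
    return 1.0 + 2.0 * x - 1.5 * x ** 2

# Small noisy training set; larger held-out test set from the same distribution.
x_train = rng.uniform(0, 2, size=8)
y_train = true_fn(x_train) + rng.normal(0, 0.3, size=8)
x_test = rng.uniform(0, 2, size=200)
y_test = true_fn(x_test) + rng.normal(0, 0.3, size=200)

results = {}
for degree in (1, 2, 5):
    # Least-squares polynomial fit of the given degree.
    coeffs = np.polyfit(x_train, y_train, degree)
    train_mse = float(np.mean((np.polyval(coeffs, x_train) - y_train) ** 2))
    test_mse = float(np.mean((np.polyval(coeffs, x_test) - y_test) ** 2))
    results[degree] = (train_mse, test_mse)
    print(f"degree {degree}: train MSE {train_mse:.3f}, test MSE {test_mse:.3f}")
```

Typically, the degree-1 fit (high bias) has large error on both sets, the degree-5 fit (high variance) drives training error down while test error stays high, and the degree-2 fit strikes the best balance. Because higher-degree polynomial classes contain lower-degree ones, training error can only decrease as the degree grows, which is exactly why training error alone cannot detect overfitting.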