1.1 This week's topic: Diagnostics
Diagnostic: A test that you run to gain insight into what is/isn’t working with a learning algorithm, to gain guidance into improving its performance.
Diagnostics can take time to implement but doing so can be a very good use of your time.
1.2 Evaluating a model
For linear regression:
Split the dataset into two parts: a training set and a test set;
Train the parameters by minimizing the regularized cost function over the training set:
$$
\min_{\vec w, b} J(\vec w, b) = \min_{\vec w, b} \left[ \frac{1}{2 m_{train}} \sum_{i=1}^{m_{train}} \left( f_{\vec w, b}\left( \vec x^{(i)} \right) - y^{(i)} \right)^2 + \frac{\lambda}{2 m_{train}} \sum_{j=1}^{n} w_j^2 \right]
$$
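As a minimal sketch of the cost above, assuming a linear model $f_{\vec w,b}(\vec x) = \vec w \cdot \vec x + b$ and NumPy arrays as inputs, the function below evaluates $J(\vec w, b)$ for given parameters. The name `regularized_cost` and the toy numbers are illustrative assumptions, not part of the course material.

```python
import numpy as np

def regularized_cost(X, y, w, b, lam):
    """Regularized squared-error cost J(w, b) for linear regression.

    X: (m, n) feature matrix, y: (m,) targets, w: (n,) weights,
    b: scalar bias, lam: regularization strength (lambda).
    """
    m = X.shape[0]
    residuals = X @ w + b - y                    # f_{w,b}(x^(i)) - y^(i)
    fit_term = np.sum(residuals ** 2) / (2 * m)  # squared-error term
    reg_term = lam * np.sum(w ** 2) / (2 * m)    # penalty on w only, not on b
    return fit_term + reg_term

# Tiny hand-checkable example with made-up numbers: y = 2x fits exactly.
X = np.array([[1.0], [2.0]])
y = np.array([2.0, 4.0])
cost_perfect = regularized_cost(X, y, w=np.array([2.0]), b=0.0, lam=0.0)
cost_penalized = regularized_cost(X, y, w=np.array([2.0]), b=0.0, lam=1.0)
```

With a perfect fit and `lam=0.0` both terms vanish; with `lam=1.0` only the penalty $\frac{\lambda}{2m} w_1^2 = \frac{1 \cdot 4}{2 \cdot 2}$ remains.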
Compute the error on the test set (the regularization term is not included):
$$
J_{test}(\vec w, b) = \frac{1}{2 m_{test}} \left[ \sum_{i=1}^{m_{test}} \left( f_{\vec w, b}\left( \vec x_{test}^{(i)} \right) - y_{test}^{(i)} \right)^2 \right]
$$
Compute the error on the training set (again without the regularization term):
$$
J_{train}(\vec w, b) = \frac{1}{2 m_{train}} \left[ \sum_{i=1}^{m_{train}} \left( f_{\vec w, b}\left( \vec x_{train}^{(i)} \right) - y_{train}^{(i)} \right)^2 \right]
$$
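The evaluation procedure for linear regression can be sketched end to end as follows. This is only an illustration under stated assumptions: the data are synthetic, the 70/30 split ratio and `lam = 0.1` are arbitrary choices, and the helper names `fit_ridge` and `squared_error` are hypothetical. The fit uses the closed-form solution of the regularized cost rather than gradient descent, which reaches the same minimizer for this model.

```python
import numpy as np

def squared_error(X, y, w, b):
    """Unregularized error: (1 / 2m) * sum((X @ w + b - y)^2)."""
    m = X.shape[0]
    return np.sum((X @ w + b - y) ** 2) / (2 * m)

def fit_ridge(X, y, lam):
    """Minimize the regularized cost in closed form (b is not penalized)."""
    m, n = X.shape
    Xb = np.hstack([X, np.ones((m, 1))])   # append a column of ones for b
    penalty = lam * np.eye(n + 1)
    penalty[-1, -1] = 0.0                  # leave the bias unregularized
    theta = np.linalg.solve(Xb.T @ Xb + penalty, Xb.T @ y)
    return theta[:-1], theta[-1]

# Synthetic data (assumption): y = 3*x1 - 2*x2 + 1 plus small Gaussian noise.
rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(100, 2))
y = X @ np.array([3.0, -2.0]) + 1.0 + 0.1 * rng.normal(size=100)

# Step 1: shuffle, then split into training and test sets (70/30 here).
perm = rng.permutation(X.shape[0])
train_idx, test_idx = perm[:70], perm[70:]
X_train, y_train = X[train_idx], y[train_idx]
X_test, y_test = X[test_idx], y[test_idx]

# Step 2: fit w and b on the training set only.
lam = 0.1
w, b = fit_ridge(X_train, y_train, lam)

# Step 3: report both errors, each without the regularization term.
J_train = squared_error(X_train, y_train, w, b)
J_test = squared_error(X_test, y_test, w, b)
print(f"J_train = {J_train:.4f}, J_test = {J_test:.4f}")
```

Comparing `J_train` with `J_test` is the diagnostic: a test error much larger than the training error suggests the model does not generalize well to unseen examples.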