regression: output a scalar
example:
Stock market forecast: output is tomorrow's stock price
Self-driving car: input is the readings of the various sensors; output is the steering wheel angle
Recommendation: output is the probability of a purchase
Estimating the Combat Power (CP) of a Pokemon after evolution:
Step 1: Model
Choose a model, e.g. y = b + w * x_cp
Linear model (general form): y = b + Σ_i w_i * x_i
x_i: an attribute of the input x (feature); w_i: weight; b: bias
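The linear model above can be sketched in a few lines of Python; the parameter values below are made up for illustration, not fitted values from the lecture.

```python
# Minimal sketch of the linear model y = b + w * x_cp for a single
# feature x_cp (the CP before evolution). w and b are illustrative.
def predict(x_cp, w, b):
    """Predict the CP after evolution from the CP before evolution."""
    return b + w * x_cp

print(predict(100.0, w=2.5, b=10.0))  # 10 + 2.5 * 100 = 260.0
```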
Step 2: Goodness of Function
Loss function L (input: a function; output: how bad it is):
L(f) = Σ_n (ŷ^n − f(x^n))², where f(x^n) is the prediction for the n-th example and ŷ^n is its true value
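The squared-error loss can be sketched directly from the definition; the toy data below stands in for the real training examples.

```python
# Sketch of the loss L(f) = sum over n of (y_hat^n - f(x^n))^2 for the
# linear model f(x) = b + w * x. The data points are toy values.
def loss(w, b, xs, ys):
    """Sum of squared prediction errors over the training set."""
    return sum((y - (b + w * x)) ** 2 for x, y in zip(xs, ys))

xs = [10.0, 20.0, 30.0]   # CP before evolution (toy values)
ys = [25.0, 45.0, 65.0]   # observed CP after evolution (toy values)
print(loss(2.0, 5.0, xs, ys))  # data fit exactly by y = 2x + 5, so loss = 0.0
```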
Step 3: Best Function
Gradient descent (a general method for solving such optimization problems)
1. Randomly pick an initial value w⁰
2. Compute the derivative of the loss at w⁰ (the slope of the tangent): dL/dw |_{w=w⁰}
3. Update w to decrease the loss: w¹ = w⁰ − η · dL/dw |_{w=w⁰}; a larger learning rate η means faster learning, a smaller one slower learning
If the derivative is negative -> increase w
If the derivative is positive -> decrease w
If the derivative is 0 -> stuck, the update stops at a local minimum
Gradient descent can get stuck at a saddle point or a local minimum
For linear regression the loss function is convex, so there are no saddle points or (non-global) local minima
Formulation of ∂L/∂w and ∂L/∂b:
∂L/∂w = Σ_n 2(ŷ^n − (b + w·x^n)) · (−x^n)
∂L/∂b = Σ_n 2(ŷ^n − (b + w·x^n)) · (−1)
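The update steps above, with these gradients plugged in, give a small working loop; the toy data (generated from y = 2x + 5), learning rate, and step count are all illustrative choices, not values from the lecture.

```python
# Sketch of gradient descent for y = b + w*x using
#   dL/dw = sum_n 2*(y_hat^n - (b + w*x^n)) * (-x^n)
#   dL/db = sum_n 2*(y_hat^n - (b + w*x^n)) * (-1)
# Data, eta, and the number of steps are illustrative.
def gradient_descent(xs, ys, eta=5e-4, steps=30000):
    w, b = 0.0, 0.0  # "randomly" picked initial values (fixed at 0 here)
    for _ in range(steps):
        grad_w = sum(2 * (y - (b + w * x)) * (-x) for x, y in zip(xs, ys))
        grad_b = sum(2 * (y - (b + w * x)) * (-1) for x, y in zip(xs, ys))
        w -= eta * grad_w
        b -= eta * grad_b
    return w, b

xs = [10.0, 20.0, 30.0]
ys = [25.0, 45.0, 65.0]  # generated by y = 2x + 5
w, b = gradient_descent(xs, ys)
print(w, b)              # converges toward w = 2, b = 5
```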
Model Selection
A more complex model is not necessarily better: it always achieves a smaller loss on the training set, but it can overfit, making the loss on the test set large and the model useless.
Redesign the model
Add the species of the Pokemon as an extra feature; the result is still a linear model.
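One way to write "per-species parameters, but still a linear model" is with indicator (one-hot) features; the species names and parameter values below are purely illustrative.

```python
# Sketch: per-species linear models merged into one model via indicator
# features delta(species = s). The model is still linear in the
# parameters. Species list and parameter values are illustrative.
SPECIES = ["Pidgey", "Weedle", "Caterpie", "Eevee"]

def predict(x_cp, species, ws, bs):
    """y = sum_s delta(species = s) * (b_s + w_s * x_cp)."""
    return sum((species == s) * (bs[s] + ws[s] * x_cp) for s in SPECIES)

ws = {s: 2.0 for s in SPECIES}  # illustrative per-species weights
bs = {s: 5.0 for s in SPECIES}  # illustrative per-species biases
print(predict(10.0, "Pidgey", ws, bs))  # 5 + 2 * 10 = 25.0
```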
Other features?
The other factors appear to have little influence on the CP after evolution.
Fitting with polynomial models: the more complex the model, the smaller the training loss, but the test loss may instead grow.
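The training-loss side of this can be sketched on toy data: since higher-degree polynomial models contain the lower-degree ones, the training loss can only shrink as the degree grows, which is exactly why a small training loss alone says nothing about the test loss. The data, degrees, and noise level below are made up.

```python
import numpy as np

# Sketch: fit polynomials of growing degree to noisy toy data generated
# from y = 2x + 5, and record the training loss for each degree.
rng = np.random.default_rng(0)
x_train = np.linspace(0.0, 1.0, 8)
y_train = 2 * x_train + 5 + rng.normal(0.0, 0.3, size=8)

train_losses = {}
for degree in (1, 3, 7):
    coeffs = np.polyfit(x_train, y_train, degree)
    train_losses[degree] = np.sum((np.polyval(coeffs, x_train) - y_train) ** 2)
    print(degree, train_losses[degree])  # loss never increases with degree
```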
Regularization
Regularized loss: L = Σ_n (ŷ^n − (b + Σ_i w_i·x_i))² + λ Σ_i (w_i)²
Smaller w_i means a smoother function (a small change in the input causes only a small change in the output), so functions with smaller w_i are better: we believe a smoother function is more likely to be correct.
No need to apply regularization to the bias; the bias does not affect the smoothness of the curve.
λ has to be tuned by hand: the larger λ is, the smoother the function, but the training error may grow.
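A sketch of gradient descent on the regularized loss, with the bias left unpenalized; the λ values, data, learning rate, and step count are illustrative. The point to observe is that a larger λ shrinks w, i.e. the fitted function gets flatter and smoother.

```python
# Sketch: gradient descent on L = sum_n (y^n - (b + w*x^n))^2 + lam * w^2.
# The bias b is not regularized. All numbers are illustrative.
def train(xs, ys, lam, eta=5e-4, steps=30000):
    w, b = 0.0, 0.0
    for _ in range(steps):
        grad_w = sum(2 * (y - (b + w * x)) * (-x) for x, y in zip(xs, ys)) + 2 * lam * w
        grad_b = sum(2 * (y - (b + w * x)) * (-1) for x, y in zip(xs, ys))
        w, b = w - eta * grad_w, b - eta * grad_b
    return w, b

xs = [10.0, 20.0, 30.0]
ys = [25.0, 45.0, 65.0]          # generated by y = 2x + 5
w0, _ = train(xs, ys, lam=0.0)   # w0 is close to 2
w1, _ = train(xs, ys, lam=100.0) # the penalty pulls w toward 0
print(abs(w1) < abs(w0))         # True: larger lambda gives a smaller w
```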