Using matrices to simplify the algorithm
Using matrix multiplication to reduce the amount of code
Training set
Size (feet$^2$) | Number of bedrooms | Number of floors | Age of home (years) | Price (\$1000) |
---|---|---|---|---|
2014 | 5 | 1 | 45 | 460 |
1416 | 3 | 2 | 40 | 232 |
1534 | 3 | 2 | 30 | 315 |
852 | 2 | 1 | 36 | 178 |
… | … | … | … | … |
Hypothesis function
$$h_\theta(x)=\theta_{0}+\theta_{1}x_{1}+\theta_{2}x_{2}+\theta_{3}x_{3}+\theta_{4}x_{4}$$
Using matrices
When the number of features is $n$:
$$h_\theta(x)=\theta_{0}+\theta_{1}x_{1}+\theta_{2}x_{2}+\theta_{3}x_{3}+\theta_{4}x_{4}+\cdots+\theta_{n}x_{n}$$
Define $x_{0}=1$, so that
$$x=\begin{bmatrix}x_{0}\\x_{1}\\x_{2}\\x_{3}\\x_{4}\\\vdots\\x_{n}\end{bmatrix},\quad\theta=\begin{bmatrix}\theta_{0}\\\theta_{1}\\\theta_{2}\\\theta_{3}\\\theta_{4}\\\vdots\\\theta_{n}\end{bmatrix}$$
The hypothesis function then becomes
$$h_\theta(x)=\theta^\top x$$
Every mainstream language has libraries that optimize matrix computation.
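As a minimal sketch of the matrix form, the snippet below evaluates $\theta^\top x$ for every training row at once with NumPy. The feature rows are copied from the training set table; the values in `theta` are hypothetical and chosen only for illustration, and the helper name `predict` is an assumption of this example.

```python
import numpy as np

# Feature rows from the training set table: size, bedrooms, floors, age.
features = np.array([
    [2014, 5, 1, 45],
    [1416, 3, 2, 40],
    [1534, 3, 2, 30],
    [852, 2, 1, 36],
], dtype=float)

# Prepend the x_0 = 1 column so that theta_0 acts as the intercept term.
X = np.hstack([np.ones((features.shape[0], 1)), features])

# Hypothetical parameter values, for illustration only (not fitted).
theta = np.array([80.0, 0.1, 10.0, 3.0, -2.0])


def predict(X, theta):
    """Return theta^T x for every row x of X in one matrix-vector product."""
    return X @ theta


predictions = predict(X, theta)
print(predictions)
```

A single `X @ theta` replaces the explicit loop over features and over training examples, which is where the reduction in code comes from.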
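The gradient descent update is
$$\theta_{j}:=\theta_{j}-\alpha\frac{1}{m}\sum_{i=1}^{m}\left(h_\theta(x^{(i)})-y^{(i)}\right)x_{j}^{(i)}$$
where $\alpha$ is the learning rate, $m$ is the number of training examples, and $x_{j}^{(i)}$ is the $j$-th feature of the $i$-th training example. All $\theta_{j}$ are updated simultaneously.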
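The update rule can also be written in matrix form, which updates every $\theta_{j}$ in one step. The sketch below assumes `X` already contains the leading column of ones for $x_{0}$; the function name `gradient_step`, the synthetic data, the learning rate, and the iteration count are all assumptions made for this example.

```python
import numpy as np


def gradient_step(X, y, theta, alpha):
    """Apply one simultaneous gradient descent update to all theta_j.

    X has shape (m, n + 1) with a leading column of ones,
    y has shape (m,), and theta has shape (n + 1,).
    """
    m = X.shape[0]
    errors = X @ theta - y         # h_theta(x^(i)) - y^(i) for each example i
    gradient = (X.T @ errors) / m  # (1/m) * sum_i errors_i * x_j^(i) for each j
    return theta - alpha * gradient


# Tiny synthetic data set, made up for illustration: y = 1 + 2 * x_1.
X = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]])
y = np.array([1.0, 3.0, 5.0, 7.0])

theta = np.zeros(2)
for _ in range(5000):
    theta = gradient_step(X, y, theta, alpha=0.1)

print(theta)
```

Because `X.T @ errors` computes the sum over $i$ for every $j$ at once, no temporary variables are needed to keep the update simultaneous.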
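Feature scaling
The goal of feature scaling is to keep the features in similar ranges of values, so that gradient descent converges faster.
$$x_{1}=\text{size}\ (0\text{-}2000\ \text{feet}^2)$$
$$x_{2}=\text{number of bedrooms}\ (1\text{-}5)$$
When the feature ranges differ greatly, the contour plot of the cost function takes the shape of elongated ellipses.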
Mean normalization
$$x_{j}:=\frac{x_{j}-\mu_{j}}{s_{j}}$$
where $\mu_{j}$ is the mean of feature $j$ and $s_{j}$ is its range (maximum minus minimum) or, alternatively, its standard deviation.
For example, $x_{1}=\frac{\text{size}-1000}{2000}$ and $x_{2}=\frac{\text{bedrooms}-2}{5}$, so that, approximately,
$$-0.5\le x_{1}\le 0.5,\quad -0.5\le x_{2}\le 0.5$$
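A minimal sketch of mean normalization with NumPy follows. It assumes that the value subtracted from each feature is its column mean and that the divisor is the column range (maximum minus minimum); the function name `mean_normalize` is hypothetical, and the code assumes no column is constant, since that would make the range zero.

```python
import numpy as np


def mean_normalize(features):
    """Scale each column to (x - mean) / (max - min).

    Returns the scaled array together with the per-column mean and range,
    which are needed to scale new inputs the same way at prediction time.
    """
    mu = features.mean(axis=0)
    spread = features.max(axis=0) - features.min(axis=0)
    return (features - mu) / spread, mu, spread


# Size and bedroom columns taken from the training set table.
features = np.array([
    [2014, 5],
    [1416, 3],
    [1534, 3],
    [852, 2],
], dtype=float)

scaled, mu, spread = mean_normalize(features)
print(scaled)
```

After this scaling every column has mean zero and spans an interval of width one, so both features end up on a comparable scale before gradient descent is run.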