



- intuition about Gradient Descent: How the algorithm works & why the updating step makes sense
To see how the formula works, we again reduce the original problem to a simplified one with only a single parameter $a_0$:
$$a_0 := a_0 - \alpha \frac{dJ(a_0)}{da_0} \qquad (j = 0)$$

$\frac{dJ(a_0)}{da_0}$ is the derivative of $J$ at the point $a_0$; geometrically, it is the slope of the tangent line to the curve at that point.
If $\alpha$ is too small, gradient descent can be slow; if $\alpha$ is too large, it may overshoot the minimum, fail to converge, or even diverge.
If you have already reached a local optimum, the derivative term is 0, so you stop taking steps.
Remember that the size of each step depends on both the Learning Rate and the derivative at the current point, so as you get closer to the minimum the derivative shrinks and you automatically take smaller steps.
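To make this concrete, here is a minimal sketch of the update rule on a made-up one-parameter cost $J(a_0) = (a_0 - 3)^2$; the cost, the learning rate, and the number of steps are illustrative choices, not values from the course.

```python
# A minimal sketch of the one-parameter update rule a0 := a0 - alpha * dJ/da0.
# The cost J(a0) = (a0 - 3)**2 is an illustrative example (its minimum is at a0 = 3).

def dJ(a0):
    return 2 * (a0 - 3)          # derivative of (a0 - 3)^2

a0, alpha = 0.0, 0.1
for step in range(10):
    grad = dJ(a0)
    a0 = a0 - alpha * grad       # the update rule
    print(f"step {step}: a0 = {a0:.4f}, step size = {abs(alpha * grad):.4f}")

# The printed step sizes shrink as a0 approaches 3, even though alpha is fixed,
# because the derivative itself shrinks near the minimum.
```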
Finally, if we put the Cost Function and Gradient Descent together, we obtain our first Learning Algorithm - Linear Regression.
That is how the algorithm is realized.
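As a sketch of how the pieces fit together, the snippet below fits a one-parameter model $h(x) = a_0 x$ by running gradient descent on a squared-error cost; the data, learning rate, and iteration count are made up for illustration.

```python
# One-parameter Linear Regression: the model is h(x) = a0 * x (no intercept, so there
# is a single parameter), the cost is the squared error, and gradient descent applies
# the update rule above. Data, alpha, and iteration count are made-up examples.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.1, 3.9, 6.2, 7.8]        # roughly y = 2x
m = len(xs)

def cost(a0):
    return sum((a0 * x - y) ** 2 for x, y in zip(xs, ys)) / (2 * m)

def d_cost(a0):
    # derivative of the cost with respect to a0
    return sum((a0 * x - y) * x for x, y in zip(xs, ys)) / m

a0, alpha = 0.0, 0.05
for _ in range(200):
    a0 = a0 - alpha * d_cost(a0)

print(f"a0 = {a0:.3f}, J(a0) = {cost(a0):.4f}")   # a0 ends up close to 2
```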
Now we look back to the original problem.
For a one-parameter function f(x_1), the graph of f is just a 2-D curve. For a two-parameter function, however, the graph is a 3-D curved surface with three axes, which we call the x, y, and z axes.
Assume the point P on the surface has coordinates $(x_0, y_0, z_0)$, where the vertical axis $z$ represents the value of the function $J(x, y)$. The meaning of the partial derivatives is then not hard to see: the partial derivative with respect to $x$ is the slope of the curve cut from the surface by the plane $y = y_0$, and the partial derivative with respect to $y$ is the slope of the curve cut by the plane $x = x_0$.
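A small numerical sketch may help here: it approximates the two partial derivatives of a made-up surface $J(x, y) = x^2 + 2y^2$ with central differences, holding one variable fixed while the other varies (the surface and the point are arbitrary examples).

```python
# Partial derivatives of the made-up surface J(x, y) = x**2 + 2*y**2, approximated
# by central differences. Holding y fixed at y0 and varying only x gives the slope
# of the curve in the y = y0 plane, and vice versa.
def J(x, y):
    return x ** 2 + 2 * y ** 2

def partial_x(x0, y0, h=1e-5):
    return (J(x0 + h, y0) - J(x0 - h, y0)) / (2 * h)   # y is held at y0

def partial_y(x0, y0, h=1e-5):
    return (J(x0, y0 + h) - J(x0, y0 - h)) / (2 * h)   # x is held at x0

x0, y0 = 1.0, 2.0
print(partial_x(x0, y0))   # ~2*x0 = 2.0
print(partial_y(x0, y0))   # ~4*y0 = 8.0
```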
On this 3-D surface, you can imagine walking down a real hill and deciding, at every point, which direction to step in.
The parameters $a_1$ and $a_2$ are updated at every step so that the point $(a_1, a_2)$ in the horizontal plane moves toward the one that minimizes $J$.
That is why $a_1$ and $a_2$ must be updated simultaneously rather than separately: both new values should be computed from the old values of both parameters, as the sketch below shows.
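Here is a minimal sketch of the simultaneous-update pattern; `gradient_step`, `grad_a1`, and `grad_a2` are hypothetical names, and the gradients are just placeholders whose exact form depends on the cost function.

```python
# Simultaneous-update pattern: compute both new values from the OLD (a1, a2), then
# assign them together. grad_a1 and grad_a2 stand for the partial derivatives of J.
def gradient_step(a1, a2, grad_a1, grad_a2, alpha):
    temp1 = a1 - alpha * grad_a1(a1, a2)   # both temps use the old a1 and a2
    temp2 = a2 - alpha * grad_a2(a1, a2)
    return temp1, temp2                    # only now do a1 and a2 change

# Hypothetical example: partial derivatives of J(a1, a2) = a1**2 + a2**2
a1, a2 = gradient_step(3.0, 4.0,
                       lambda a1, a2: 2 * a1,
                       lambda a1, a2: 2 * a2,
                       alpha=0.1)
print(a1, a2)   # 2.4 3.2

# Updating a1 first and then computing grad_a2 with the new a1 would mix old and
# new values and would no longer be a step of gradient descent on J.
```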
So congrats on finishing the first Machine Learning Algorithm.
- double-parameter Linear Regression Realization
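Below is a minimal sketch of that realization, assuming the hypothesis $h(x) = a_1 + a_2 x$ and a squared-error cost; the data, learning rate, and number of iterations are made-up illustrative values.

```python
# Two-parameter Linear Regression with gradient descent: hypothesis h(x) = a1 + a2*x,
# cost J(a1, a2) = squared error, and simultaneous updates of both parameters.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [3.1, 4.9, 7.2, 9.1, 10.8]   # roughly y = 1 + 2x
m = len(xs)
alpha = 0.05
a1, a2 = 0.0, 0.0                 # intercept and slope

def cost(a1, a2):
    return sum((a1 + a2 * x - y) ** 2 for x, y in zip(xs, ys)) / (2 * m)

for _ in range(2000):
    # partial derivatives of J with respect to a1 and a2 at the current point
    grad_a1 = sum((a1 + a2 * x - y) for x, y in zip(xs, ys)) / m
    grad_a2 = sum((a1 + a2 * x - y) * x for x, y in zip(xs, ys)) / m
    # simultaneous update: both gradients were computed from the old (a1, a2)
    a1, a2 = a1 - alpha * grad_a1, a2 - alpha * grad_a2

print(f"a1 = {a1:.3f}, a2 = {a2:.3f}, J = {cost(a1, a2):.4f}")
```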