Machine Learning Linear Regression Study Notes: Linear Regression Explained for Machine Learning Beginners

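This article shares study notes on linear regression in machine learning and aims to give beginners an accessible introduction. It covers the basic concepts of linear regression and how it applies to practical problems.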



Data science gives you the power to analyze every bit of data at your disposal and to make smart, well-informed business decisions, so it is becoming a must-have capability to understand and implement in your organization. It is very important that your business decisions rest on data analysis rather than on intuition.

As a data science learner and practitioner, I very often feel:

“The data in your repository is a gold mine that needs to be harnessed with the intent to serve humanity at large, since people are the key source of that data.”

Data has a story to tell. As a data engineer and a business leader, your primary responsibility is to treat it well, process it with an appropriate ML model, and build a solution that is relevant to both current and future user needs. With this intent, let’s begin our journey of understanding supervised ML using the linear regression model.

Agenda of Today’s Article:

What Is Supervised Machine Learning?
Types of Supervised Machine Learning
What Is Regression and What Are Its Types?
Understanding Linear Regression With an Example
Hands-On Lab Exercise on Linear Regression Using Python & Jupyter

1. What Is Supervised Learning?

Supervised Learning:

In supervised learning, we are given a labeled data set (labeled training data) in which the desired outcome for each example is already known, so every input-output pair in the training data encodes some relationship.

Supervised learning is where you have input variables (X) and an output variable (Y), and you use an algorithm to learn the mapping function from the input to the output.

Y = f(X)

The intent is to train the function to such an extent that whenever you have new input data (X), you can predict the output variable (Y) for that input.

Here the training happens under the supervision of a teacher who already knows the correct answers: the algorithm iteratively makes predictions on the training data and is corrected by the supervisor. When the learning algorithm reaches an acceptable level of performance on the training data, we stop the learning process.
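To make the "predict, get corrected, stop when good enough" loop concrete, here is a minimal sketch in Python with NumPy. It is only an illustration under stated assumptions: the toy data, the straight-line form of f, the learning rate, and the stopping tolerance are all made up for this example, and gradient descent on the mean squared error is just one common way to perform the correction step.

```python
import numpy as np

# Toy labeled data (made up for illustration): inputs X with known outputs Y.
x_train = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y_train = 2.0 * x_train + 1.0   # the "correct answers" the supervisor knows

# Assumed model: f(x) = w * x + b, starting from an uninformed guess.
w, b = 0.0, 0.0
learning_rate = 0.02            # assumed step size
tolerance = 1e-6                # assumed "acceptable" training error

for step in range(100_000):
    y_pred = w * x_train + b            # predict on the training data
    error = y_pred - y_train            # compare with the known labels
    mse = np.mean(error ** 2)
    if mse < tolerance:                 # performance is acceptable: stop
        break
    # Correct the parameters using the gradient of the mean squared error.
    w -= learning_rate * 2.0 * np.mean(error * x_train)
    b -= learning_rate * 2.0 * np.mean(error)

def f(x_new):
    """Learned mapping Y = f(X), usable on inputs not seen during training."""
    return w * x_new + b

print(f"learned w={w:.3f}, b={b:.3f}, prediction for x=6: {f(6.0):.3f}")
```

Once the loop stops, f can be applied to any new input, which is exactly the point of learning the mapping in the first place.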

Types of Supervised ML:

The most fundamental way to categorize a supervised learning method is by the type of problem it is trying to solve. At a high level, this comes down to what kind of business problem you want to solve with supervised machine learning algorithms.

Within supervised machine learning, we further divide problems into the following categories:

Regression
Classification

1. Regression

Regression problems are problems where we try to make a prediction on a continuous scale. Examples include predicting the stock price of a company or predicting tomorrow's temperature from historical data. Here the stock price, the temperature, or a sales figure is a continuous variable. In a sales setting, for instance, we try to predict the sales value from given input variables such as man-hours used.

Regression is a method of modeling a target value based on independent predictors. It is mostly used for forecasting and for investigating cause-and-effect relationships between variables. Regression techniques differ mainly in the number of independent variables and in the type of relationship between the independent and dependent variables.
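The two axes mentioned above (how many independent variables, and what shape the relationship takes) can be sketched with NumPy. This is only an illustrative example: the variable names (`hours`, `staff`, `sales_*`) and all numbers are hypothetical, and the targets are generated from known coefficients so you can see what the fits recover. It uses `np.polyfit` and `np.linalg.lstsq`, both of which perform least-squares fitting.

```python
import numpy as np

# Hypothetical toy predictors, made up for illustration.
hours = np.array([10.0, 20.0, 30.0, 40.0, 50.0])   # e.g. man-hours used
staff = np.array([1.0, 3.0, 2.0, 5.0, 4.0])        # a second predictor

# Case 1: one independent variable, linear relationship.
sales_one = 3.0 * hours + 5.0
slope, intercept = np.polyfit(hours, sales_one, deg=1)

# Case 2: two independent variables, still a linear relationship.
sales_two = 3.0 * hours + 10.0 * staff + 5.0
X = np.column_stack([np.ones_like(hours), hours, staff])  # column of 1s = intercept
coef, *_ = np.linalg.lstsq(X, sales_two, rcond=None)      # [intercept, hours, staff]

# Case 3: one independent variable, curved (polynomial) relationship.
sales_curved = 0.5 * hours ** 2 + 5.0
poly_coef = np.polyfit(hours, sales_curved, deg=2)        # highest power first

print("one predictor :", slope, intercept)
print("two predictors:", coef)
print("polynomial    :", poly_coef)
```

In all three cases the target is continuous; what changes is the form of the model being fitted.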

Regression Types:

Linear Regression
Multiple Linear Regression
Polynomial Regression
Decision Tree Regression
Random Forest Regression

We will cover only linear regression today and leave the rest for later.

What Is Linear Regression?

The term is made up of two words, linear and regression. Let’s understand both before we get to the definition of linear regression.

Linear: The word linear comes from the Latin word linearis, which means pertaining to or resembling a line.

Regression: a statistical technique for estimating the relationships between dependent and independent variables.

Let’s combine them and define:

Linear Regression:

It is a statistical approach to modeling the relationship between a dependent variable and one or more explanatory variables (or independent variables) by finding a best-fit straight line (a linear equation, estimated with the least-squares approach). In its most simplified form it is represented as:

Simple linear regression:

y = β0 + β1X

X = explanatory variable,

β0 = y-intercept (constant term),