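This article introduces the multiple linear regression algorithm and its code implementation.
Step 1: Data preprocessing
Import the libraries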
import pandas as pd
import numpy as np
Import the dataset
dataset = pd.read_csv('50_Startups.csv')
X = dataset.iloc[ : , :-1].values
Y = dataset.iloc[ : , 4 ].values
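To make the two slices above concrete, here is a minimal sketch on a small made-up table. The column names and values are hypothetical stand-ins, not the contents of 50_Startups.csv; the only assumption carried over is the layout implied by the code above, with the target in column index 4 and everything before it used as features.

```python
import pandas as pd

# Hypothetical stand-in for the CSV: three numeric columns,
# one categorical column, and the target as the last column.
toy = pd.DataFrame({
    'spend_a': [1.0, 2.0, 3.0],
    'spend_b': [4.0, 5.0, 6.0],
    'spend_c': [7.0, 8.0, 9.0],
    'region': ['A', 'B', 'A'],
    'target': [10.0, 20.0, 30.0],
})

X_toy = toy.iloc[:, :-1].values  # every column except the last: the features
Y_toy = toy.iloc[:, 4].values    # column index 4 (the last one here): the target

print(X_toy.shape)  # (3, 4)
print(Y_toy.shape)  # (3,)
```

Because the feature columns mix numbers and strings, the feature array cannot be used for regression until the categorical column has been encoded.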
Encode the categorical data as numbers
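from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OneHotEncoder
# The categorical_features argument was removed from OneHotEncoder in scikit-learn 0.22.
# ColumnTransformer one-hot encodes column 3 and passes the other columns through;
# the encoded columns come first in the output, as they did with the old API.
columntransformer = ColumnTransformer([('onehot', OneHotEncoder(), [3])], remainder = 'passthrough', sparse_threshold = 0)
X = columntransformer.fit_transform(X).astype(float)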
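Avoid the dummy variable trap
The one-hot columns always sum to 1, so together with the model's intercept they are perfectly collinear; dropping the first of them removes the redundancy.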
X = X[: , 1:]
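The effect of dropping one dummy column can be checked numerically. The sketch below uses hypothetical category labels and NumPy only; it is an illustration of the collinearity argument, not part of the original pipeline. It builds one dummy column per category, adds an intercept column of ones, and compares the matrix rank with and without the first dummy column.

```python
import numpy as np

# Hypothetical category labels, chosen only for illustration.
states = np.array(['A', 'B', 'C', 'A', 'B', 'C'])
categories = np.unique(states)

# One dummy column per category, plus an intercept column of ones.
dummies = (states[:, None] == categories[None, :]).astype(float)
intercept = np.ones((len(states), 1))

full = np.hstack([intercept, dummies])            # 4 columns
reduced = np.hstack([intercept, dummies[:, 1:]])  # 3 columns, first dummy dropped

rank_full = np.linalg.matrix_rank(full)
rank_reduced = np.linalg.matrix_rank(reduced)

print(full.shape[1], rank_full)        # 4 3
print(reduced.shape[1], rank_reduced)  # 3 3
```

With all dummy columns kept, the matrix has 4 columns but rank 3, so the least-squares coefficients are not uniquely determined. After dropping one dummy column the matrix has full column rank.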
Split the dataset into a training set and a test set
from sklearn.model_selection import trai