Section I: Brief Introduction to Three Regression Models
Regularization is one approach to tackling overfitting: it adds extra information to the model in the form of a penalty on complexity, which shrinks the parameter values. The most popular approaches to regularized linear regression are Ridge Regression, the Least Absolute Shrinkage and Selection Operator (LASSO), and Elastic Net.
- Ridge Regression: L2 Regularization
- LASSO Regression: L1 Regularization
- ElasticNet Regression: L2 and L1 Regularization
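To make the three penalty terms concrete, here is a minimal sketch in plain Python. The function names and the example weight vector are hypothetical, chosen only for illustration, and the Elastic Net form with two separate strengths is one common way to write it (libraries may parametrize it differently). The intercept is assumed to be excluded from the weights.

```python
def l2_penalty(weights, strength):
    """Ridge term: strength times the sum of squared weights."""
    return strength * sum(w ** 2 for w in weights)


def l1_penalty(weights, strength):
    """LASSO term: strength times the sum of absolute weights."""
    return strength * sum(abs(w) for w in weights)


def elastic_net_penalty(weights, l1_strength, l2_strength):
    """Elastic Net term: an L1 term plus an L2 term, each with its own strength."""
    return l1_penalty(weights, l1_strength) + l2_penalty(weights, l2_strength)


# Hypothetical weight vector (intercept not included).
weights = [0.5, -2.0, 0.0, 1.5]

ridge_term = l2_penalty(weights, 1.0)
lasso_term = l1_penalty(weights, 1.0)
enet_term = elastic_net_penalty(weights, 0.5, 0.5)

print("L2 penalty:", ridge_term)
print("L1 penalty:", lasso_term)
print("Elastic Net penalty:", enet_term)
```

Each term is added to the usual squared-error cost, so larger weights cost more and the fitted model is pushed toward smaller (and, with L1, possibly zero) coefficients.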
Two Quantitative Measures
- Mean Squared Error (MSE)
- R2 Score - Standardized Version of MSE
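The relationship between the two measures can be sketched by hand. The helper names and the toy values below are hypothetical; the point is only that R2 rescales the MSE by the variance of the true targets, so R2 = 1 - MSE / Var(y).

```python
def mse_by_hand(y_true, y_pred):
    """Average of the squared differences between targets and predictions."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def r2_by_hand(y_true, y_pred):
    """R2 as 1 - MSE / Var(y), using the population variance of y_true."""
    mean_y = sum(y_true) / len(y_true)
    variance = sum((t - mean_y) ** 2 for t in y_true) / len(y_true)
    return 1.0 - mse_by_hand(y_true, y_pred) / variance


# Toy targets and predictions, made up for illustration.
y_true = [3.0, 5.0, 7.0, 9.0]
y_pred = [2.5, 5.5, 7.0, 9.0]

mse = mse_by_hand(y_true, y_pred)
r2 = r2_by_hand(y_true, y_pred)
print("MSE: %.3f, R2: %.3f" % (mse, r2))
```

Unlike the MSE, R2 does not depend on the scale of the target, which makes it easier to compare across datasets.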
Source
Sebastian Raschka, Vahid Mirjalili. Python Machine Learning, Second Edition. Nanjing: Southeast University Press, 2018.
Part 1: Ridge Regression
Code
from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error,r2_score
import matplotlib.pyplot as plt
import warnings
warnings.filterwarnings("ignore")
plt.rcParams['figure.dpi']=200
plt.rcParams['savefig.dpi']=200
font = {
    'family': 'Times New Roman',
    'weight': 'light'}
plt.rc("font", **font)
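# Section 1: Load the data and split it into training and test sets
# Note: load_boston was removed in scikit-learn 1.2, so this call needs an older release.
price = datasets.load_boston()
X = price.data
y = price.target
X_train, X_test, y_train, y_test = train_test_split(X, y,
                                                    test_size=0.3)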
# Section 2: Ridge Regression, Least Absolute Shrinkage and Selection Operator (LASSO), and Elastic Net
# Ridge: L2 regularization
# LASSO: L1 regularization
# Elastic Net: both L1 and L2 regularization
# Section 2.1: Ridge model
# The parameter alpha is the regularization strength.
from sklearn.linear_model import Ridge
ridge=Ridge(alpha=1.0)
ridge.fit(X_train,y_train)
y_train_pred=ridge.predict(X_train)
y_test_pred=ridge.predict(X_test)
# Residual plot: predicted values against (prediction - target)
plt.scatter(y_train_pred, y_train_pred - y_train,
            c='blue', marker='o', edgecolor='white',
            label='Training Data')
plt.scatter(y_test_pred, y_test_pred - y_test,
            c='limegreen', marker='s', edgecolor='white',
            label='Test Data')
plt.xlabel("Predicted Values")
plt.ylabel("The Residuals")
plt.legend(loc='upper left')
plt.hlines(y=0,xmin=-10,xmax=50,color='black',lw=2)
plt.xlim([-10,50])
plt.title("Ridge Regression Model")
plt.savefig('./fig2.png')
plt.show()
print("\nMSE Train in Ridge: %.3f, Test: %.3f" % \
(mean_squared_er

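This blog post introduces three regression methods used in machine learning to prevent overfitting: Ridge regression (L2 regularization), LASSO regression (L1 regularization), and ElasticNet regression (which combines L1 and L2 regularization). Code examples show how each regression model is implemented, and the prediction accuracy of each is reported.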
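As a rough sketch of how the three models could be fitted and scored side by side, the snippet below follows the same steps as the Ridge listing above. It is not the original code: it assumes a synthetic dataset from scikit-learn's make_regression in place of the housing data, and the alpha, l1_ratio, and random_state values are illustrative choices rather than tuned settings.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import ElasticNet, Lasso, Ridge
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split

# Assumption: synthetic data with 13 features, only 5 of them informative.
X, y = make_regression(n_samples=300, n_features=13, n_informative=5,
                       noise=10.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0)

# Illustrative regularization strengths, not tuned values.
models = {
    "Ridge": Ridge(alpha=1.0),
    "LASSO": Lasso(alpha=1.0),
    "ElasticNet": ElasticNet(alpha=1.0, l1_ratio=0.5),
}

results = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    y_train_pred = model.predict(X_train)
    y_test_pred = model.predict(X_test)
    results[name] = {
        "mse_train": mean_squared_error(y_train, y_train_pred),
        "mse_test": mean_squared_error(y_test, y_test_pred),
        "r2_train": r2_score(y_train, y_train_pred),
        "r2_test": r2_score(y_test, y_test_pred),
        # Number of coefficients driven exactly to zero by the penalty.
        "n_zero_coef": int(np.sum(model.coef_ == 0)),
    }
    print("%s  MSE train: %.3f, test: %.3f | R2 train: %.3f, test: %.3f | zero coefficients: %d"
          % (name,
             results[name]["mse_train"], results[name]["mse_test"],
             results[name]["r2_train"], results[name]["r2_test"],
             results[name]["n_zero_coef"]))
```

Counting zero coefficients is included because the L1 penalty can remove features entirely, whereas the L2 penalty only shrinks them. In practice the strengths would usually be chosen by cross-validation.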



