Python Multivariate Nonlinear Regression, Day 13: A Simple Nonlinear Regression Application with Python Libraries

This post shows how to use logistic regression in Python to handle a problem that linear regression underfits. It covers the problem setup, a sketch of the logistic regression derivation, and a practical programming example that updates the parameters with gradient descent.

I. Probability

Before introducing the problem, let us briefly review the basic concepts of probability.

Probability: a measure of how likely an event is to occur.

Range: 0 <= P <= 1.

Ways to estimate it: subjective degree of belief; historical data; simulated data.

Conditional probability: the probability that A occurs given that B has occurred, written P(A|B) = P(A∩B) / P(B).
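As a quick illustration of these definitions, here is a minimal sketch using a fair six-sided die (the die example is mine, not from the post):

```python
# Conditional probability P(A|B) = P(A and B) / P(B),
# illustrated with a fair six-sided die.
# A: the roll is even; B: the roll is greater than 3.
outcomes = {1, 2, 3, 4, 5, 6}
A = {2, 4, 6}
B = {4, 5, 6}

p_B = len(B) / len(outcomes)            # 3/6 = 0.5
p_A_and_B = len(A & B) / len(outcomes)  # {4, 6} -> 2/6
p_A_given_B = p_A_and_B / p_B           # (2/6) / (3/6) = 2/3

print(round(p_A_given_B, 4))  # 0.6667
```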

II. Logistic Regression

1. Problem Setup

When working with binary (0/1) data, suppose we are given the 8 test points shown below. If we fit them with a linear regression line, the plot shows that the line does not model the data well; this is what we call underfitting.

[Figure: a linear regression line fit to 8 binary-valued test points]

If we then add a 9th test point, the underfitting becomes even worse, so we need to introduce a new regression model to handle this class of problem.

[Figure: the same linear fit after adding the 9th test point]
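The underfitting described above can be reproduced with a small sketch. The data points here are hypothetical (the post's actual 8/9 points are not given); the point is that a plain linear fit can produce predictions outside [0, 1], while logistic regression keeps its probability outputs inside [0, 1]:

```python
import numpy as np
from sklearn.linear_model import LinearRegression, LogisticRegression

# Hypothetical 1-D binary dataset; the 9th point (x = 30) plays the role
# of the extra observation that makes the linear fit even worse.
X = np.array([[1], [2], [3], [4], [5], [6], [7], [8], [30]])
y = np.array([0, 0, 0, 0, 1, 1, 1, 1, 1])

lin = LinearRegression().fit(X, y)
log = LogisticRegression().fit(X, y)

print(lin.predict([[30]]))        # linear output can exceed 1
print(log.predict_proba([[30]]))  # logistic probabilities stay in [0, 1]
```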

2. Derivation Sketch

Suppose a test sample is X(x0, x1, x2, ..., xn),

and the parameters to learn are Θ(θ0, θ1, θ2, ..., θn).

Vector form: Z = Θ^T X. Logistic regression then maps Z through the sigmoid function to obtain a prediction in (0, 1): hθ(X) = g(Z) = 1 / (1 + e^(-Z)).
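Using the conventions above, here is a minimal from-scratch sketch of the gradient-descent update the post refers to (the dataset and learning rate are illustrative, not from the post):

```python
import numpy as np

def sigmoid(z):
    # h(x) = 1 / (1 + e^{-Z}), with Z = theta^T x
    return 1.0 / (1.0 + np.exp(-z))

def fit_logistic(X, y, lr=0.1, epochs=2000):
    # Prepend x0 = 1 so theta[0] acts as the intercept term theta_0.
    Xb = np.hstack([np.ones((X.shape[0], 1)), X])
    theta = np.zeros(Xb.shape[1])
    for _ in range(epochs):
        h = sigmoid(Xb @ theta)
        grad = Xb.T @ (h - y) / len(y)  # gradient of the log-loss
        theta -= lr * grad              # gradient descent update
    return theta

# Small illustrative 1-D dataset, separable around x = 3.5
X = np.array([[1.0], [2.0], [3.0], [4.0], [5.0], [6.0]])
y = np.array([0, 0, 0, 1, 1, 1])
theta = fit_logistic(X, y)
Xb = np.hstack([np.ones((len(X), 1)), X])
preds = (sigmoid(Xb @ theta) >= 0.5).astype(int)
print(preds)  # should recover the training labels on this easy data
```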

The following is Python code for a regression analysis of the Kaggle bike-sharing dataset, covering ordinary linear regression, regularized (ridge and lasso) regression, polynomial (nonlinear) regression, and simplified additive and varying-coefficient fits:

```python
# Import the required libraries
import pandas as pd
from sklearn.linear_model import LinearRegression, Ridge, Lasso
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import PolynomialFeatures
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error, r2_score

def report(name, model, X_train, X_test, y_train, y_test):
    """Print train/test MSE and R2 for a fitted model."""
    y_train_pred = model.predict(X_train)
    y_test_pred = model.predict(X_test)
    print(name, "Train MSE:", mean_squared_error(y_train, y_train_pred))
    print(name, "Train R2:", r2_score(y_train, y_train_pred))
    print(name, "Test MSE:", mean_squared_error(y_test, y_test_pred))
    print(name, "Test R2:", r2_score(y_test, y_test_pred))

# Load the data
data = pd.read_csv("Bike-Sharing-Dataset/day.csv")

# Feature engineering: drop identifiers and leakage columns,
# then one-hot encode the categorical features
data = data.drop(["instant", "dteday", "casual", "registered"], axis=1)
data = pd.get_dummies(data, columns=["season", "yr", "mnth", "holiday",
                                     "weekday", "workingday", "weathersit"])

# Train/test split
X_train, X_test, y_train, y_test = train_test_split(
    data.drop(["cnt"], axis=1), data["cnt"], test_size=0.2, random_state=42)

# Ordinary linear regression
lr = LinearRegression().fit(X_train, y_train)
report("Linear Regression", lr, X_train, X_test, y_train, y_test)

# Ridge regression (L2 regularization)
ridge = Ridge(alpha=0.1).fit(X_train, y_train)
report("Ridge Regression", ridge, X_train, X_test, y_train, y_test)

# Lasso regression (L1 regularization)
lasso = Lasso(alpha=0.1).fit(X_train, y_train)
report("Lasso Regression", lasso, X_train, X_test, y_train, y_test)

# Polynomial regression: degree-2 features feeding a linear model
pipeline = Pipeline([
    ("polynomial_features", PolynomialFeatures(degree=2)),
    ("linear_regression", LinearRegression()),
])
pipeline.fit(X_train, y_train)
report("Polynomial Regression", pipeline, X_train, X_test, y_train, y_test)

# Simplified "additive" model: a linear fit on two temperature features only
additive_model = LinearRegression().fit(X_train[["temp", "atemp"]], y_train)
report("Additive Model", additive_model,
       X_train[["temp", "atemp"]], X_test[["temp", "atemp"]],
       y_train, y_test)

# Varying-coefficient model: interact temp with the season dummies produced
# by get_dummies, so the coefficient of temp is allowed to differ by season
for col in ["season_1", "season_2", "season_3", "season_4"]:
    data["temp_x_" + col] = data["temp"] * data[col]
X_train, X_test, y_train, y_test = train_test_split(
    data.drop(["cnt"], axis=1), data["cnt"], test_size=0.2, random_state=42)
variable_coef_model = LinearRegression().fit(X_train, y_train)
report("Variable Coefficient Model", variable_coef_model,
       X_train, X_test, y_train, y_test)
```

This code fits several regression models from scikit-learn after basic feature engineering, and evaluates each one with MSE and R2. Ridge and lasso are regularized models that help prevent overfitting, and the polynomial pipeline can capture nonlinear relationships. The "additive" and varying-coefficient fits here are simplified stand-ins for those model families, built from plain linear regression and interaction features; a full additive model would instead fit a smooth function per feature, as in a GAM.