Comparison of Different Algorithms Without Normalization
Compare the prediction scores of several classification algorithms with and without data normalization, and summarize the characteristics of each algorithm.
Classification performance of logistic regression
# Import all required packages up front
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier
from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
import warnings
warnings.filterwarnings('ignore')
# Load the wine dataset
data = datasets.load_wine()
X = data['data']
y = data['target']
feature_names = data['feature_names']
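Before training, a quick look at the data helps explain the results that follow. This is a minimal sketch reusing the data, X, y and feature_names variables loaded above: the wine dataset has 178 samples, 13 numeric features and 3 classes, and the feature columns live on very different scales, which is exactly what makes scaling relevant for logistic regression and SVC.
import numpy as np

print(X.shape)          # (178, 13)
print(np.bincount(y))   # samples per class, 3 classes in total
# Per-feature min/max: the columns span very different ranges
# (some features stay below 10 while others run into the hundreds).
for name, lo, hi in zip(feature_names, X.min(axis=0), X.max(axis=0)):
    print(f'{name:30s} {lo:10.2f} {hi:10.2f}')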
%%time
score = 0
for i in range(1000):
    # Split the data
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)
    # Build the model
    model = LogisticRegression()
    # Train the model
    model.fit(X_train, y_train)
    # Accumulate the test score (averaged over 1000 runs)
    score += model.score(X_test, y_test) / 1000
score
Wall time: 23.6 s
0.9484722222222285
Support vector machine
%%time
score = 0
for i in range(1000):
    # Split the data
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)
    # Build the model
    model = SVC()
    # Train the model
    model.fit(X_train, y_train)
    # Accumulate the test score (averaged over 1000 runs)
    score += model.score(X_test, y_test) / 1000
score
Wall time: 2.21 s
0.6722499999999996
Decision tree
%%time
score = 0
for i in range(1000):
    # Split the data
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)
    # Build the model
    model = DecisionTreeClassifier(criterion='entropy')
    # Train the model
    model.fit(X_train, y_train)
    # Accumulate the test score (averaged over 1000 runs)
    score += model.score(X_test, y_test) / 1000
score
Wall time: 1.57 s
0.9227500000000051
Comparison of Different Algorithms With Normalization
# Normalize X (StandardScaler standardizes each feature to zero mean and unit variance)
standard = StandardScaler()
X = standard.fit_transform(X)
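As a quick sanity check (a minimal sketch reusing the transformed X from above), every feature column should now have mean ≈ 0 and standard deviation ≈ 1:
import numpy as np

# After StandardScaler, each feature column has zero mean and unit variance.
print(np.allclose(X.mean(axis=0), 0))   # True
print(np.allclose(X.std(axis=0), 1))    # True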
Logistic regression
%%time
score = 0
for i in range(1000):
    # Split the data
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)
    # Build the model
    model = LogisticRegression()
    # Train the model
    model.fit(X_train, y_train)
    # Accumulate the test score (averaged over 1000 runs)
    score += model.score(X_test, y_test) / 1000
score
Wall time: 8.21 s
0.981000000000007
Support vector machine
%%time
score = 0
for i in range(1000):
    # Split the data
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)
    # Build the model
    model = SVC()
    # Train the model
    model.fit(X_train, y_train)
    # Accumulate the test score (averaged over 1000 runs)
    score += model.score(X_test, y_test) / 1000
score
Wall time: 1.85 s
0.9836388888888954
Decision tree
%%time
score = 0
for i in range(1000):
    # Split the data
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)
    # Build the model
    model = DecisionTreeClassifier(criterion='entropy')
    # Train the model
    model.fit(X_train, y_train)
    # Accumulate the test score (averaged over 1000 runs)
    score += model.score(X_test, y_test) / 1000
score
Wall time: 1.42 s
0.925611111111115
Conclusions
- Normalization improves the prediction scores of logistic regression and support vector machines
- Decision tree predictions are largely insensitive to normalization
- Normalization shortens the running time of logistic regression
This article compared the classification performance of logistic regression, support vector machines, and decision trees with and without normalization. The results show that normalization improves both the prediction accuracy and the running time of logistic regression and SVM, while having little effect on decision trees. A Pipeline-based sketch of the same comparison follows below.
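As a design note, the scaler above was fit on the full dataset before splitting, so a small amount of test information leaks into the scaling step. The usual remedy is to wrap the scaler and the model in a Pipeline and let cross-validation fit the scaler on the training folds only. The snippet below is a minimal sketch of that approach (not the code used for the timings above); the 5-fold cross-validation and max_iter=10000 are arbitrary choices added for illustration.
from sklearn.datasets import load_wine
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import cross_val_score
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

X, y = load_wine(return_X_y=True)

models = {
    'LogisticRegression': LogisticRegression(max_iter=10000),
    'SVC': SVC(),
    'DecisionTree': DecisionTreeClassifier(criterion='entropy'),
}

for name, model in models.items():
    # Raw features vs. a Pipeline that standardizes inside each CV fold,
    # so the scaler never sees the held-out data.
    raw = cross_val_score(model, X, y, cv=5).mean()
    scaled = cross_val_score(make_pipeline(StandardScaler(), model), X, y, cv=5).mean()
    print(f'{name:20s} raw={raw:.3f}  scaled={scaled:.3f}')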