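Machine Learning Modeling and Evaluation
Key points: 1. Splitting the dataset 2. The three lines of code for building a machine learning model 3. Evaluating machine learning models on classification problems. There is a lot of code today, but none of it is difficult; read the example code carefully and make sure you understand the evaluation metrics.
Assignment: build and evaluate a machine learning model on the heart disease dataset.
1. Splitting the dataset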
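import pandas as pd
from sklearn.model_selection import train_test_split

# Load the heart disease dataset
data = pd.read_csv(r'C:\Users\许兰\Desktop\打卡文件\python60-days-challenge-master\heart.csv')
# Features are every column except 'target'; the label is the 'target' column
X = data.drop(['target'], axis = 1)
Y = data['target']
# Hold out 20% of the rows as the test set; random_state fixes the shuffle
X_train, X_test, y_train, y_test = train_test_split(X, Y, test_size = 0.2, random_state = 42)
print(f"Training set shape={X_train.shape}, test set shape={X_test.shape}")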
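Process: after reading the file, use scikit-learn's train_test_split to divide the data into a training set and a test set. X holds the features and Y holds the label; the rows are shuffled and split at random (20% go to the test set here), and fixing the random seed makes the split reproducible.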
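To make the role of the random seed concrete, here is a minimal sketch on a small synthetic table. The toy data and the column names f1, f2, f3 are made up for illustration and are not taken from heart.csv. It also shows the optional stratify parameter, which the code above does not use; it keeps the class ratio the same in both subsets.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

# Hypothetical toy data standing in for heart.csv: 100 rows, 3 features, binary target.
rng = np.random.default_rng(0)
toy = pd.DataFrame(rng.normal(size=(100, 3)), columns=["f1", "f2", "f3"])
toy["target"] = np.repeat([0, 1], 50)

X = toy.drop(columns=["target"])
y = toy["target"]

# The same random_state produces the same split on every call.
split_a = train_test_split(X, y, test_size=0.2, random_state=42)
split_b = train_test_split(X, y, test_size=0.2, random_state=42)
same_split = split_a[1].index.equals(split_b[1].index)

# stratify=y (not used in the original code) preserves the 50/50 class ratio.
X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

print(f"train shape={X_tr.shape}, test shape={X_te.shape}")
print(f"identical split with the same seed: {same_split}")
print(f"positive rate in the stratified test set: {y_te.mean():.2f}")
```

Whether stratification is worth using depends on how imbalanced the labels are; with a small test set it usually makes the reported metrics a little more stable.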
2. The three lines of machine learning code
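from sklearn.svm import SVC
svm_model = SVC(random_state = 42)    # 1. instantiate the model
svm_model.fit(X_train, y_train)       # 2. train on the training set
svm_pred = svm_model.predict(X_test)  # 3. predict on the test set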
Process: instantiate the model, train it by calling fit on the training set, then call predict on the test set to get the predictions.
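The same three steps apply to other scikit-learn classifiers. The sketch below runs them in a loop over three models; it uses synthetic data from make_classification as a stand-in for heart.csv, and the choice of models and the max_iter value are assumptions made for illustration, so the printed scores say nothing about the heart disease dataset.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC

# Synthetic binary-classification data (an assumption; stands in for heart.csv).
X, y = make_classification(n_samples=300, n_features=10, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Step 1: instantiate each model.
models = {
    "SVM": SVC(random_state=42),
    "KNN": KNeighborsClassifier(),
    "LogisticRegression": LogisticRegression(max_iter=1000),
}

scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)    # Step 2: train on the training set
    pred = model.predict(X_test)   # Step 3: predict on the test set
    scores[name] = accuracy_score(y_test, pred)
    print(f"{name}: accuracy={scores[name]:.4f}")
```

Only the line that builds the estimator changes from model to model; fit and predict are called the same way each time.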
3. Evaluating classification models
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.metrics import classification_report, confusion_matrix, accuracy_score, precision_score, recall_score, f1_score

# Load the data and split it exactly as in part 1
data = pd.read_csv(r'C:\Users\许兰\Desktop\打卡文件\python60-days-challenge-master\heart.csv')
X = data.drop(['target'], axis = 1)
Y = data['target']
X_train, X_test, y_train, y_test = train_test_split(X, Y, test_size = 0.2, random_state = 42)
print(f"Training set shape={X_train.shape}, test set shape={X_test.shape}")

# Instantiate, train, predict
svm_model = SVC(random_state = 42)
svm_model.fit(X_train, y_train)
svm_pred = svm_model.predict(X_test)

# Per-class precision, recall and F1
print("\nSVM classification report:")
print(classification_report(y_test, svm_pred))
# Rows are the true labels, columns are the predicted labels
print("SVM confusion matrix:")
print(confusion_matrix(y_test, svm_pred))

# Individual metrics, computed for the positive class (target = 1)
svm_accuracy = accuracy_score(y_test, svm_pred)
svm_precision = precision_score(y_test, svm_pred)
svm_recall = recall_score(y_test, svm_pred)
svm_f1 = f1_score(y_test, svm_pred)  # computed as well, but not printed
print("SVM evaluation metrics:")
print(f"Accuracy: {svm_accuracy:.4f}")
print(f"Precision: {svm_precision:.4f}")
print(f"Recall: {svm_recall:.4f}")
I only evaluated the support vector machine here; other models follow almost exactly the same workflow, so I will not repeat it.
Results:
SVM confusion matrix:
[[15 14]
 [ 4 28]]
SVM evaluation metrics:
Accuracy: 0.7049
Precision: 0.6667
Recall: 0.8750
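As a check on how these numbers relate to each other, the metrics can be recomputed by hand from the confusion matrix above. This is a small sketch that assumes scikit-learn's layout, where rows are true labels and columns are predicted labels, so the four cells read as TN, FP, FN, TP.

```python
import numpy as np

# Confusion matrix reported above; rows are true labels, columns are predictions.
cm = np.array([[15, 14],
               [4, 28]])
tn, fp, fn, tp = cm.ravel()

accuracy = (tp + tn) / cm.sum()   # correct predictions / all samples = 43/61
precision = tp / (tp + fp)        # of predicted positives, how many are right = 28/42
recall = tp / (tp + fn)           # of actual positives, how many are found = 28/32
f1 = 2 * precision * recall / (precision + recall)  # harmonic mean of the two

print(f"accuracy={accuracy:.4f}")    # 0.7049
print(f"precision={precision:.4f}")  # 0.6667
print(f"recall={recall:.4f}")        # 0.8750
print(f"f1={f1:.4f}")                # 0.7568
```

The 14 false positives are what pull precision down to 0.6667, while only 4 false negatives keep recall at 0.8750.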
@浙大疏锦行