1. KNN classification algorithm
- Prerequisite: the KD-Tree algorithm (K-Dimensional Tree)
  - Finds the k points in space that are closest to a target point
- from sklearn.neighbors import NearestNeighbors
  - n_neighbors: the number of nearest neighbors to query
  - algorithm: the search algorithm
    - 'ball_tree' will use BallTree
    - 'kd_tree' will use KDTree
    - 'brute' will use a brute-force search.
    - 'auto' will attempt to decide the most appropriate algorithm based on the values passed to the fit method.
  - radius: the search radius
  - p: the power parameter p of the Minkowski distance
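The parameters above can be tried out on a tiny data set. The following is a minimal sketch, not a reference implementation: the two points, the query, and the radius 2.9 are made-up values chosen so that the nearest neighbor changes between p=1 (Manhattan distance) and p=2 (Euclidean distance).

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

# Hypothetical toy data: two candidate points and one query at the origin.
points = np.array([[3.0, 0.0], [2.0, 2.0]])
query = np.array([[0.0, 0.0]])

# p=1 (Manhattan): distances are 3 and 4, so point 0 is the nearest.
nn_l1 = NearestNeighbors(n_neighbors=1, algorithm="kd_tree", p=1).fit(points)
dist_l1, idx_l1 = nn_l1.kneighbors(query)

# p=2 (Euclidean): distances are 3 and about 2.83, so point 1 is the nearest.
nn_l2 = NearestNeighbors(n_neighbors=1, algorithm="brute", p=2).fit(points)
dist_l2, idx_l2 = nn_l2.kneighbors(query)

# radius: return every point within Euclidean distance 2.9 of the query.
nn_radius = NearestNeighbors(radius=2.9, p=2).fit(points)
radius_dist, radius_idx = nn_radius.radius_neighbors(query)

print(idx_l1[0], idx_l2[0], radius_idx[0])
```

The algorithm setting changes how the search is carried out, not which neighbors an exact search returns, so 'kd_tree' and 'brute' are interchangeable here.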
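from sklearn import datasets
from sklearn.neighbors import NearestNeighbors

data = datasets.load_iris()
X_data = data["data"]
Y_data = data["target"]
# Create the neighbor searcher; n_neighbors=5 and algorithm="auto" are assumed here (both are the defaults)
NN = NearestNeighbors(n_neighbors=5, algorithm="auto")
NN.fit(X_data)  # build the search structure from the data
# Query the 5 nearest neighbors of a single sample
result = NN.kneighbors(X=[[5.2, 3.1, 1.4, 0.2]], n_neighbors=5, return_distance=True)
result[0]  # distances
result[1]  # indices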
# ---- KNN classification algorithm
"""
Algorithm overview: https://www.cnblogs.com/jyroy/p/9427977.html
"""
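KNN classification can be read as a neighbor search followed by a vote: find the k nearest training samples, then predict the most common label among them. The sketch below illustrates that idea on the iris data and compares it with KNeighborsClassifier under its default uniform weights. The names query, neighbor_labels, and voted_label are illustrative, not part of the original notes.

```python
from collections import Counter

from sklearn import datasets
from sklearn.neighbors import KNeighborsClassifier, NearestNeighbors

iris = datasets.load_iris()
X, y = iris["data"], iris["target"]
query = [[5.2, 3.1, 1.4, 0.2]]

# Step 1: find the indices of the 5 nearest training samples.
searcher = NearestNeighbors(n_neighbors=5).fit(X)
_, indices = searcher.kneighbors(query)

# Step 2: majority vote over the labels of those neighbors.
neighbor_labels = y[indices[0]].tolist()
voted_label = Counter(neighbor_labels).most_common(1)[0][0]

# The library classifier with the same k should agree for this query.
clf_label = KNeighborsClassifier(n_neighbors=5).fit(X, y).predict(query)[0]
print(voted_label, clf_label)
```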
import pandas as pd
from sklearn.neighbors import KNeighborsClassifier
# read_excel takes `header` (not `headers`); row 0 holds the column names
features = pd.read_excel("./data.xlsx", sheet_name="features", header=0)
label = pd.read_excel("./data.xlsx", sheet_name="label", header=0)
# Split into training, validation, and test sets (60% / 20% / 20%)
from sklearn.model_selection import train_test_split
X_tt,X_validation,Y_tt,Y_validation = train_test_split(features,label,test_size = 0.2)
X_train,X_test,Y_train,Y_test = train_test_split(X_tt,Y_tt,test_size = 0.25)
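The two-stage split first holds out 20% for validation, then takes 25% of the remaining 80% as the test set, which leaves 60% for training. A small sketch on synthetic data makes the arithmetic visible. The 100-row array is made up, and random_state and stratify are optional extras that the original calls do not use; they are added here only to make the result reproducible and class-balanced.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Hypothetical data: 100 samples, 2 features, two balanced classes.
X = np.arange(200).reshape(100, 2)
y = np.array([0, 1] * 50)

# Stage 1: hold out 20% of all samples for validation.
X_tt, X_validation, y_tt, y_validation = train_test_split(
    X, y, test_size=0.2, random_state=0, stratify=y
)
# Stage 2: 25% of the remaining 80 samples becomes the test set.
X_train, X_test, y_train, y_test = train_test_split(
    X_tt, y_tt, test_size=0.25, random_state=0, stratify=y_tt
)

print(len(X_train), len(X_validation), len(X_test))  # 60 20 20
```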
# Create the KNN classifier objects
knn = KNeighborsClassifier(n_neighbors = 3)
knn_5 = KNeighborsClassifier(n_neighbors = 5)
# Train the models on the training set
knn.fit(X_train,Y_train)
knn_5.fit(X_train,Y_train)
# Use the models to predict on the validation and test sets
Y_validation_predict = knn.predict(X_validation)
Y_validation_predict_5 = knn_5.predict(X_validation)
Y_test_predict = knn.predict(X_test)
Y_test_predict_5 = knn_5.predict(X_test)
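# Evaluating model performance
"""
1. Precision: precision_score, the fraction of samples the classifier labels positive that are actually positive.
2. Accuracy: accuracy_score, the fraction of all samples the classifier labels correctly.
3. Recall: recall_score, the fraction of actual positive samples that the classifier labels positive.
4. f1_score: the harmonic mean of precision and recall; its maximum is 1 and its minimum is 0.
"""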
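A worked example on a small binary problem may help tie the four definitions to numbers. The labels below are invented for illustration: they give 2 true positives, 1 false positive, 2 false negatives, and 5 true negatives.

```python
from sklearn.metrics import accuracy_score, f1_score, precision_score, recall_score

# Hypothetical ground truth and predictions (1 = positive, 0 = negative).
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 1, 0, 0, 0, 0, 0]

precision = precision_score(y_true, y_pred)  # TP / (TP + FP) = 2 / 3
recall = recall_score(y_true, y_pred)        # TP / (TP + FN) = 2 / 4
accuracy = accuracy_score(y_true, y_pred)    # (TP + TN) / total = 7 / 10
f1 = f1_score(y_true, y_pred)                # 2 * P * R / (P + R) = 4 / 7

print(precision, recall, accuracy, f1)
```

For labels with more than two classes, precision_score, recall_score, and f1_score need an explicit average argument such as average="macro", because their default of "binary" only applies to two-class targets.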
from sklearn.metrics import f1_score,precision_score,accuracy_score,recall_score
def metrics_wj(x,y,title):
    print("*"*8,title,"*"*8)