Section I: Brief Introduction to K-Nearest Neighbors
K-nearest neighbors (KNN) is particularly interesting because it is fundamentally different from the other learning algorithms. KNN is a typical example of a lazy learner. It is called lazy not because of its apparent simplicity, but because it doesn’t learn a discriminative function from the training data; it memorizes the training dataset instead. The KNN algorithm itself is fairly straightforward and can be summarized by the following steps:
- Step 1: Choose the number of k and a distance metric
- Step 2: Find the k nearest neighbors of the sample to be classified
- Step 3: Assign the class label by majority vote
From
Sebastian Raschka, Vahid Mirjalili. Python Machine Learning, 2nd ed. Nanjing: Southeast University Press, 2018.
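The three steps above can be sketched in a few lines of plain Python. This is only a minimal illustration, not the scikit-learn implementation: the function name knn_predict and the toy data are made up for this example, the distance metric is assumed to be Euclidean, and ties in the vote are resolved by whatever order Counter happens to return.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # Step 1: k is passed in; the distance metric is fixed to Euclidean here.
    distances = [
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    ]
    # Step 2: keep the k training samples closest to the query.
    nearest = sorted(distances, key=lambda pair: pair[0])[:k]
    # Step 3: assign the most common label among those neighbors.
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Toy data (hypothetical, for illustration only): two small clusters in 2-D.
train_points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1),
                (4.0, 4.2), (4.1, 3.9), (3.8, 4.0)]
train_labels = ["a", "a", "a", "b", "b", "b"]

print(knn_predict(train_points, train_labels, (1.1, 1.0), k=3))  # a
print(knn_predict(train_points, train_labels, (3.9, 4.1), k=3))  # b
```

Note that there is no training step at all: the "model" is just the stored training set, which is exactly what makes KNN a lazy learner.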
Section II: Construct K-Nearest Neighbors Model
import matplotlib.pyplot as plt
from sklearn import datasets
from sklearn.preprocessing import StandardScaler
import numpy as np
from sklearn.model_selection import train_test_split
from DecisionTrees.visualize_test_idx import plot_decision_regions
plt.rcParams['figure.dpi']=200
plt.rcParams['savefig.dpi']=200
font = {'family': 'Times New Roman',
'weight': 'light'}
plt.rc("font", **font)
#Section 1: Load data and split it into train/test dataset
iris=datasets.load_iris()
X=iris.data[:,[2,3]]
y=iris.target
X_train,X_test,y_train,y_test=train_test_split(X,y,test_size=0.3,random_state=1,stratify=y)
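# Standardize the features using statistics estimated from the training set only
sc=StandardScaler()
sc.fit(X_train)
X_train_std=sc.transform(X_train)
X_test_std=sc.transform(X_test)
# Stack training samples first, then test samples, so the test points can be marked by index
X_combined=np.vstack([X_train_std,X_test_std])
y_combined=np.hstack([y_train,y_test])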
#Section 2: Train K-Neighbor Model
from sklearn.neighbors import KNeighborsClassifier
knn=KNeighborsClassifier(n_neighbors=5,p=2,metric='minkowski')
knn.fit(X_train_std,y_train)
# The last 45 rows of X_combined (indices 105-149) are the test samples
plot_decision_regions(X_combined,y_combined,classifier=knn,test_idx=range(105,150))
plt.xlabel('petal length [standardized]')
plt.ylabel('petal width [standardized]')
plt.legend(loc='upper left')
plt.savefig('./fig1.png')
plt.show()
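The classifier above is created with metric='minkowski' and p=2. The Minkowski distance generalizes two familiar metrics: with p=1 it is the Manhattan distance, and with p=2 it is the Euclidean distance. The short sketch below illustrates this relationship; minkowski_distance is a hypothetical helper written for this example, not a scikit-learn function.

```python
def minkowski_distance(u, v, p=2):
    """Minkowski distance: (sum_i |u_i - v_i|**p) ** (1/p), for p >= 1."""
    return sum(abs(a - b) ** p for a, b in zip(u, v)) ** (1.0 / p)


# Two illustrative points forming a 3-4-5 right triangle.
u, v = (0.0, 0.0), (3.0, 4.0)

print(minkowski_distance(u, v, p=1))  # Manhattan distance: 7.0
print(minkowski_distance(u, v, p=2))  # Euclidean distance: 5.0
```

Because distances are computed directly on the feature values, features on larger scales dominate the result, which is why the data is standardized before fitting the model.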
Note:
Unless stated otherwise, plot_decision_regions refers to the plot_decision_regions function from the post "Machine Learning - Perceptron - Scikit-Learn" (link: Machine Learning - Perceptron - Scikit-Learn).
References:
Sebastian Raschka, Vahid Mirjalili. Python Machine Learning, 2nd ed. Nanjing: Southeast University Press, 2018.