K-nearest neighbors and Euclidean Distance

This article looks at the K-nearest neighbors (KNN) algorithm for classification in machine learning, explains why the choice of K matters and how it affects prediction accuracy and confidence, and introduces the Euclidean distance calculation at the core of the algorithm.

These are my study notes on machine learning. I am writing them in English because I want to improve my writing skills. Thanks for reading, and if I have made any mistakes, please let me know.

What is the K-nearest neighbors algorithm?

It is a supervised learning algorithm for classification: we train on pre-labeled data, which tells the machine which group each point belongs to. Clustering solves a similar grouping problem, but it is an unsupervised method, because no labels are given in advance.
The algorithm is based on the distances between the point we want to predict and the training points we already know. Intuitively, distance can be understood as proximity.
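To make the supervised flavor concrete, here is a minimal sketch, assuming scikit-learn is installed (the feature values and labels below are invented for illustration):

```python
# A tiny KNN classifier: fit on pre-labeled points, then predict the
# group of a new point from its nearest neighbors.
from sklearn.neighbors import KNeighborsClassifier

# Pre-labeled training data: two features per point, label 0 or 1.
X_train = [[1.0, 1.1], [1.2, 0.9], [5.0, 5.2], [5.1, 4.8]]
y_train = [0, 0, 1, 1]

clf = KNeighborsClassifier(n_neighbors=3)   # K = 3
clf.fit(X_train, y_train)

print(clf.predict([[1.1, 1.0]]))            # [0] -- closest to the first group
```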

What do the K and "nearest" mean?

K is a number we choose: it is how many of the data points nearest to the new point we look at. We usually pick an odd number for K because the algorithm basically takes a majority vote among those neighbors, and an even K can end in a 50/50 tie. There are also ways to weight each vote by distance so that farther neighbors count for less; with such weighting, an even K can work as well.
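As a rough illustration (the labels and distances are invented), here is a plain majority vote with an odd K, and a 1/distance-weighted vote that breaks the tie an even K can produce:

```python
from collections import Counter

# Plain majority vote over the labels of the K nearest neighbors (K = 3, odd).
neighbor_labels = ["red", "red", "blue"]
print(Counter(neighbor_labels).most_common(1)[0])     # ('red', 2)

# With K = 4 the raw vote could tie 2/2; weighting each vote by 1 / distance
# penalizes the farther neighbors and breaks the tie.
neighbors = [("red", 0.5), ("red", 0.9), ("blue", 0.4), ("blue", 2.0)]
weighted = Counter()
for label, dist in neighbors:
    weighted[label] += 1.0 / dist
print(weighted.most_common(1)[0])   # ('red', ~3.11) beats blue's ~3.00
```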

Accuracy or Confidence

During prediction, the algorithm selects the K points closest to the new data point and then finds the category that is most common among them; the new point most likely belongs to that group. The ratio "votes for the winning group / K" is the confidence, which tells us how strongly we can trust that the point belongs to that group. Accuracy, on the other hand, is measured on a test set after training, so the two are completely different things.
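Here is a small from-scratch sketch of that prediction step (the function name knn_predict and the toy data are mine, for illustration only); it returns the predicted label together with the confidence as "winning votes / K":

```python
import math
from collections import Counter

def knn_predict(train, query, k=3):
    """Return (predicted_label, confidence) for one query point.

    `train` is a list of (features, label) pairs; confidence is the
    fraction of the K nearest neighbors that share the winning label.
    """
    # Sort training points by Euclidean distance to the query point.
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))
    top_k_labels = [label for _, label in nearest[:k]]
    label, votes = Counter(top_k_labels).most_common(1)[0]
    return label, votes / k

train = [((1.0, 1.1), "A"), ((1.2, 0.9), "A"),
         ((5.0, 5.2), "B"), ((5.1, 4.8), "B"), ((4.9, 5.0), "B")]
print(knn_predict(train, (1.1, 1.0), k=3))   # ('A', 0.666...) on this toy data
```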

Euclidean Distance

coming soon
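While that section is still to come, the standard definition is simple: the Euclidean distance between two points is the square root of the sum of the squared differences of their coordinates. A minimal sketch in Python (the function name is just for illustration):

```python
import math

# Straight-line distance between two points, in any number of dimensions.
def euclidean_distance(p, q):
    return math.sqrt(sum((pi - qi) ** 2 for pi, qi in zip(p, q)))

print(euclidean_distance((1, 2), (4, 6)))   # 5.0 (a 3-4-5 right triangle)
```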
