
K-nearest Neighbors Classification with Multiple Labels

Introduction

To classify inputs using k-nearest neighbors (KNN) classification, three things are needed: a labeled dataset, a distance metric for computing the distance between datapoints, and a value of k, which is the number of nearest neighbors to consider.
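As a small illustration of the distance-metric ingredient, here is a minimal sketch in Python, assuming Euclidean distance (one common choice; the function name and example points are made up for illustration):

```python
import math

def euclidean_distance(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Example: distance between two 2-D datapoints
print(euclidean_distance([1.0, 2.0], [4.0, 6.0]))  # 5.0
```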


K-nearest neighbors classification uses the classes of neighboring datapoints to assign labels. When an unknown datapoint needs to be classified, its distance to every labeled datapoint is measured. The k nearest neighbors are then retrieved, and the predicted label is the label that is most common within this group.
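The following sketch puts these steps together: measure distances, keep the k nearest neighbors, and take a majority vote. It is a minimal illustration, assuming Euclidean distance; the function names and the toy dataset are made up for this example.

```python
from collections import Counter
import math

def euclidean_distance(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def knn_predict(train_points, train_labels, query, k):
    """Predict a label for `query` by majority vote among its k nearest neighbors."""
    # Measure the distance from the query to every labeled datapoint
    distances = [(euclidean_distance(p, query), label)
                 for p, label in zip(train_points, train_labels)]
    # Keep the k nearest neighbors
    nearest = sorted(distances, key=lambda d: d[0])[:k]
    # The most common label within this group is the prediction
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Toy labeled dataset with two classes
points = [[1, 1], [1, 2], [2, 1], [6, 6], [7, 6], [6, 7]]
labels = ["A", "A", "A", "B", "B", "B"]
print(knn_predict(points, labels, [2, 2], k=3))  # "A"
```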


The value of k is a parameter chosen when building the model. Below, see how selecting the right k value can impact the classification.





Image from: Murphy, K. P. (2021). Figure 1.15. In *Machine Learning: A Probabilistic Perspective*. MIT Press.
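To see how the choice of k can change a prediction, here is a minimal sketch assuming scikit-learn is installed; the toy dataset and query point are made up for illustration. With k = 1, a single nearby outlier decides the label, while with k = 3 the surrounding cluster outvotes it.

```python
from sklearn.neighbors import KNeighborsClassifier

# Toy labeled dataset: a cluster of class "A" points and one nearby "B" outlier
X = [[0, 0], [0, 1], [1, 0], [1, 1], [2.1, 2.1]]
y = ["A", "A", "A", "A", "B"]
query = [[2, 2]]

for k in (1, 3):
    model = KNeighborsClassifier(n_neighbors=k)  # k is chosen when building the model
    model.fit(X, y)
    print(k, model.predict(query)[0])
# k=1 -> "B" (the single nearest neighbor is the outlier)
# k=3 -> "A" (the majority of the three nearest neighbors are class "A")
```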
