k-nearest neighbors algorithm
Kaushik das
In this article we will learn about the k-nearest neighbors algorithm. The contents are listed below:
1. What is the k-nearest neighbors algorithm?
2. Why should we use this algorithm?
3. How to use this algorithm?
1. What is the k-nearest neighbors algorithm?
The KNN algorithm assumes that similar things exist in close proximity; in other words, similar things are near each other.
In pattern recognition, the k-nearest neighbors algorithm (k-NN) is a non-parametric method used for classification and regression.[1] In both cases, the input consists of the k closest training examples in the feature space. The output depends on whether k-NN is used for classification or regression:
- In k-NN classification, the output is a class membership. An object is classified by a plurality vote of its neighbors, with the object being assigned to the class most common among its k nearest neighbors (k is a positive integer, typically small). If k = 1, then the object is simply assigned to the class of that single nearest neighbor.
- In k-NN regression, the output is the property value for the object. This value is the average of the values of its k nearest neighbors. (Source: Wikipedia)
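The two output rules above can be sketched in a few lines. This is only an illustration: the function names `knn_classify` and `knn_regress`, and the sample labels and values, are made up for this example and assume the k nearest neighbors have already been found.

```python
from collections import Counter
from statistics import mean


def knn_classify(neighbor_labels):
    """Classification: return the most common label among the neighbors."""
    return Counter(neighbor_labels).most_common(1)[0][0]


def knn_regress(neighbor_values):
    """Regression: return the average of the neighbors' values."""
    return mean(neighbor_values)


# Hypothetical labels and values of the k = 5 nearest neighbors
labels = ["pass", "pass", "fail", "pass", "fail"]
values = [62.0, 70.0, 55.0, 68.0, 45.0]

print(knn_classify(labels))  # pass (3 votes against 2)
print(knn_regress(values))   # 60.0
```

With k = 1 both functions simply return the label or value of the single nearest neighbor.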
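2. Why should we use this algorithm?
* K-NN is intuitive and simple.
* K-NN makes no assumptions about the underlying data distribution (it is non-parametric).
* It adapts as new data arrives: there is no separate training phase, so new examples can be added to the training set at any time.
* It is easy to apply to multi-class problems.
* It can be used for both classification and regression.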
3. How to use this algorithm?
We will walk through three simple steps with the help of an example.
The example predicts whether a student passes or fails an exam, using student data from the CS and Maths subjects.
1. Calculate the Euclidean distance
2. Get the nearest neighbors
3. Make a prediction
1. Calculate Euclidean Distance
In mathematics, the Euclidean distance or Euclidean metric is the "ordinary" straight-line distance between two points in Euclidean space. (Source: Wikipedia)
Formula:
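As a sketch of the standard definition (the symbols p, q, and n are notation chosen here for illustration): for two points p = (p_1, ..., p_n) and q = (q_1, ..., q_n) with n features, the distance is the square root of the summed squared differences. The second line works it out for the points (8.675418651, 2.088626775) and (2.7810836, 2.550537003).

```latex
d(p, q) = \sqrt{\sum_{i=1}^{n} (p_i - q_i)^2}

d = \sqrt{(8.675418651 - 2.7810836)^2 + (2.088626775 - 2.550537003)^2}
  \approx \sqrt{34.7432 + 0.2134}
  \approx 5.9124
```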
We will use this data set, where each row holds two feature values followed by a class label (0 or 1):
dataset = [
    [2.7810836, 2.550537003, 0],
    [1.465489372, 2.362125076, 0],
    [3.396561688, 4.400293529, 0],
    [1.38807019, 1.850220317, 0],
    [3.06407232, 3.005305973, 0],
    [7.627531214, 2.759262235, 1],
    [5.332441248, 2.088626775, 1],
    [6.922596716, 1.77106367, 1],
    [8.675418651, -0.242068655, 1],
    [7.673756466, 3.508563011, 1],
]
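Define a function to calculate the Euclidean distance:
from math import sqrt

def Euclidean_distance(row1, row2):
    # The last element of each row is the class label, so it is skipped
    distance = 0
    for i in range(len(row1) - 1):
        distance += (row1[i] - row2[i]) ** 2
    return sqrt(distance)

# Test point: two features followed by the expected class
test = [8.675418651, 2.088626775, 1]
for i in dataset:
    dis = Euclidean_distance(test, i)
    print(dis)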
Output
5.912406172801237
7.215114796649554
5.762823441453197
7.291247165694985
5.68572848441368
1.244114143007722
3.342977402999999
1.7813565228427415
2.33069543
1.7376841045382272
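As an alternative sketch, the same distance can be computed with the standard library's `math.dist`, which is available from Python 3.8 onward. The lowercase name `euclidean_distance` is chosen here only to keep it apart from the article's function. The label is sliced off each row before the call.

```python
from math import dist


def euclidean_distance(row1, row2):
    """Distance over the features only; the last element of each row is the label."""
    return dist(row1[:-1], row2[:-1])


test = [8.675418651, 2.088626775, 1]
row = [2.7810836, 2.550537003, 0]

# Approximately 5.9124, matching the first value in the output above
print(euclidean_distance(test, row))
```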
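2. Get NN (Nearest Neighbors)
In this step we will find the k nearest neighbors of a test point. K-NN has no separate training phase; the training data is simply searched at prediction time.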
Function
import numpy as np

def Get_Neighbors(train, test_row, num):
    """
    1. train holds the training data points.
    2. test_row is a single point.
    3. num is the number of neighbors to return.
       a. Compute the distance from test_row to every training point.
       b. Sort the training points by distance, nearest first.
       c. Keep the first num points.
    """
    distance = list()  # []
    data = []
    for i in train:
        dist = Euclidean_distance(test_row, i)
        distance.append(dist)
        data.append(i)
    distance = np.array(distance)
    data = np.array(data)
    # Indices that sort the distances in ascending order
    index_dist = distance.argsort()
    # Arrange the data according to those indices
    data = data[index_dist]
    # Slice off the first num rows
    neighbors = data[:num]
    return neighbors
Now we will call the function
Get_Neighbors(dataset, test, 4)
output
array([[ 7.62753121,  2.75926224,  1.        ],
       [ 7.67375647,  3.50856301,  1.        ],
       [ 6.92259672,  1.77106367,  1.        ],
       [ 8.67541865, -0.24206865,  1.        ]])
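For comparison, here is a minimal sketch of the same neighbor search without NumPy, using `heapq.nsmallest` from the standard library. The name `get_neighbors` is a hypothetical variant, not the article's function, and it returns plain lists rather than a NumPy array.

```python
from heapq import nsmallest
from math import dist

dataset = [
    [2.7810836, 2.550537003, 0],
    [1.465489372, 2.362125076, 0],
    [3.396561688, 4.400293529, 0],
    [1.38807019, 1.850220317, 0],
    [3.06407232, 3.005305973, 0],
    [7.627531214, 2.759262235, 1],
    [5.332441248, 2.088626775, 1],
    [6.922596716, 1.77106367, 1],
    [8.675418651, -0.242068655, 1],
    [7.673756466, 3.508563011, 1],
]


def get_neighbors(train, test_row, num):
    """Return the num training rows closest to test_row, nearest first."""
    # The last element of each row is the label, so it is excluded from the distance
    return nsmallest(num, train, key=lambda row: dist(row[:-1], test_row[:-1]))


test = [8.675418651, 2.088626775, 1]
for row in get_neighbors(dataset, test, 4):
    print(row)
```

The four rows printed are the same four neighbors as in the output above, in the same order.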
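3. Make Prediction
In this step we will define another function that predicts the class by taking the most common class among the nearest neighbors.
def predict_classification(train, test_row, num):
    Neighbors = Get_Neighbors(train, test_row, num)
    # Collect the class label (last element) of each neighbor
    Classes = []
    for i in Neighbors:
        Classes.append(i[-1])
    # The label that occurs most often wins the vote
    prediction = max(Classes, key=Classes.count)
    return prediction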
We will call the function
predict_classification(dataset, test, 6)
output
1.0
Verifying the result
prediction = predict_classification(dataset, test, 4)
print("We expected {}, Got {}".format(test[-1], prediction))
output:
We expected 1, Got 1.0
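To see how the choice of k affects the vote, the sketch below reruns the prediction for several values of k and prints the vote counts. It is self-contained and uses only the standard library. The name `predict` is hypothetical. It uses `Counter` for the vote instead of `max(..., key=...count)`.

```python
from collections import Counter
from math import dist

dataset = [
    [2.7810836, 2.550537003, 0],
    [1.465489372, 2.362125076, 0],
    [3.396561688, 4.400293529, 0],
    [1.38807019, 1.850220317, 0],
    [3.06407232, 3.005305973, 0],
    [7.627531214, 2.759262235, 1],
    [5.332441248, 2.088626775, 1],
    [6.922596716, 1.77106367, 1],
    [8.675418651, -0.242068655, 1],
    [7.673756466, 3.508563011, 1],
]


def predict(train, test_row, k):
    """Return (predicted label, vote counts) from the k nearest rows."""
    # Sort by distance over the features only, then keep the k nearest rows
    neighbors = sorted(train, key=lambda row: dist(row[:-1], test_row[:-1]))[:k]
    votes = Counter(row[-1] for row in neighbors)
    return votes.most_common(1)[0][0], dict(votes)


test = [8.675418651, 2.088626775, 1]
for k in (1, 4, 6, 10):
    label, votes = predict(dataset, test, k)
    print(k, label, votes)
```

With k = 6 the vote is five to one in favor of class 1. With k = 10 every row votes and the count is tied at five each; the tie is then resolved in favor of the label seen first, which is the class of the nearest neighbor. For a two-class problem, an odd k avoids such ties.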
Sources:
- Images are taken from:
- https://www.techsimplus.com/
- shorturl.at/EQ129