K-Nearest Neighbors
The K-Nearest Neighbors (KNN) algorithm is a simple yet powerful supervised machine-learning technique used for both classification and regression. It rests on the idea that similar data points tend to lie near each other in feature space.
Applications of KNN
KNN appears in recommendation systems, image and pattern recognition, anomaly detection, and medical diagnosis, among other areas.
Key Concepts of KNN
1. Instance-Based Learning:
* KNN does not explicitly learn a model but memorizes the training dataset.
* Predictions are made based on the similarity of a new data point to existing instances.
2. Distance Metric:
KNN relies on measuring the distance between data points. Common distance metrics include the following (a short sketch computing each appears after this list):
* Euclidean distance
* Manhattan distance
* Minkowski distance
* Cosine similarity (for high-dimensional data)
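To make these concrete, here is a minimal sketch computing each metric for two example points. It uses SciPy's distance functions (an illustrative choice; the original article does not include this code):

from scipy.spatial.distance import euclidean, cityblock, minkowski, cosine

u = [1.0, 2.0, 3.0]
v = [4.0, 0.0, 3.0]

print("Euclidean:", euclidean(u, v))             # sqrt of sum of squared differences
print("Manhattan:", cityblock(u, v))             # sum of absolute differences
print("Minkowski (p=3):", minkowski(u, v, p=3))  # generalizes Euclidean (p=2) and Manhattan (p=1)
print("Cosine distance:", cosine(u, v))          # 1 minus cosine similarity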
3. Number of Neighbors (K):
* The parameter K determines how many nearest neighbors are considered for classification or regression.
* Small K may lead to noisy predictions (overfitting), while large K may oversimplify the model (underfitting).
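A common way to navigate this trade-off is to compare several values of K with cross-validation. A minimal sketch, assuming the Iris dataset used later in this article:

from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)

# Compare candidate values of K by 5-fold cross-validated accuracy
for k in (1, 3, 5, 11, 21):
    scores = cross_val_score(KNeighborsClassifier(n_neighbors=k), X, y, cv=5)
    print(f"K={k}: mean CV accuracy = {scores.mean():.3f}")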
4. Weighted Voting (optional):
* Neighbors can have weights based on their distance from the query point, giving closer points more influence.
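In scikit-learn, distance-weighted voting is enabled with the weights="distance" option. A minimal sketch, again using the Iris dataset for illustration:

from sklearn.datasets import load_iris
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)

# Each neighbor's vote is weighted by the inverse of its distance,
# so closer neighbors influence the prediction more than distant ones.
knn_weighted = KNeighborsClassifier(n_neighbors=5, weights="distance").fit(X, y)
print(knn_weighted.predict(X[:3]))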
KNN for Classification
For classification, a query point is assigned the majority class among its K nearest neighbors. The full script at the end of this article demonstrates this on the Iris dataset.
KNN for Regression
For regression, the prediction is the mean (or distance-weighted mean) of the target values of the K nearest neighbors.
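A minimal sketch using scikit-learn's KNeighborsRegressor on the Diabetes toy dataset (an illustrative choice, not from the original article):

from sklearn.datasets import load_diabetes
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsRegressor

X, y = load_diabetes(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# The prediction for each test point is the mean of its 5 nearest neighbors' targets
reg = KNeighborsRegressor(n_neighbors=5).fit(X_train, y_train)
print("R^2 on test set:", reg.score(X_test, y_test))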
Advantages of KNN
* Simple to understand and implement.
* No explicit training phase (a "lazy learner"), so new data can be added easily.
* Makes no assumptions about the underlying data distribution.
* Works for both classification and regression.
Disadvantages of KNN
* Prediction is slow on large datasets, since distances to all training points must be computed.
* Sensitive to feature scaling and to irrelevant features.
* Performance degrades in high dimensions (the curse of dimensionality).
* Requires tuning of K and the distance metric.
Here is a complete Python script for KNN classification on the Iris dataset.
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import accuracy_score

# Load the example dataset (Iris)
data = load_iris()

# Split the data into training and test sets
X_train, X_test, y_train, y_test = train_test_split(
    data.data, data.target, test_size=0.2, random_state=42
)

# Create and fit the KNN model
knn = KNeighborsClassifier(n_neighbors=3)
knn.fit(X_train, y_train)

# Make predictions and report accuracy
y_pred = knn.predict(X_test)
print("Accuracy:", accuracy_score(y_test, y_pred))