Hi folks! Today, while reading an article, I wondered how many ways there are to categorize ML algorithms, so here is what I found. Machine learning algorithms can be grouped in several different ways based on their underlying principles and methodologies. Here are some common categorizations, each with a small code sketch to make the distinction concrete:
1. Supervised vs. Unsupervised Learning
- Supervised Learning: Algorithms are trained on labeled data. The goal is to learn a mapping from inputs to outputs based on this training data.
- Examples: Linear Regression, Logistic Regression, Decision Trees, Support Vector Machines (SVMs), Neural Networks, K-Nearest Neighbors (KNN).
- Unsupervised Learning: Algorithms are trained on unlabeled data. The goal is to find patterns or structures within the data.
- Examples: K-Means Clustering, Hierarchical Clustering, Principal Component Analysis (PCA), t-Distributed Stochastic Neighbor Embedding (t-SNE).
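A minimal sketch of the difference, assuming scikit-learn and NumPy (the library choice and variable names are mine, not from the article): the supervised model is fit on labeled pairs (X, y), while the clustering algorithm only ever sees X.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))
y = (X[:, 0] + X[:, 1] > 0).astype(int)  # labels exist only for the supervised case

clf = LogisticRegression().fit(X, y)                        # supervised: learns a mapping X -> y
clusters = KMeans(n_clusters=2, n_init=10).fit_predict(X)   # unsupervised: finds structure in X alone
print(clf.predict(X[:5]), clusters[:5])
```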
2. Regression vs. Classification
- Regression: Algorithms that predict continuous values.
- Classification: Algorithms that predict discrete classes or categories.
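Here is a rough sketch (again assuming scikit-learn; the synthetic data is made up for illustration): the same features can feed either task, the difference is whether the target is continuous or discrete.

```python
import numpy as np
from sklearn.linear_model import LinearRegression, LogisticRegression

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 3))
y_continuous = X @ np.array([1.5, -2.0, 0.5]) + rng.normal(scale=0.1, size=200)
y_discrete = (y_continuous > 0).astype(int)

reg = LinearRegression().fit(X, y_continuous)                 # regression: predicts real-valued outputs
clf = LogisticRegression(max_iter=1000).fit(X, y_discrete)    # classification: predicts class labels
print(reg.predict(X[:3]), clf.predict(X[:3]))
```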
3. Model-Based vs. Instance-Based
- Model-Based: Algorithms that build a model based on the training data and use this model for predictions.
- Instance-Based: Algorithms that use the training data directly to make predictions without building an explicit model.
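One way to see this distinction, assuming scikit-learn: a model-based learner compresses the training data into a handful of parameters, while an instance-based learner keeps the training samples themselves and compares new points against them at prediction time.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.neighbors import KNeighborsRegressor

rng = np.random.default_rng(2)
X = rng.uniform(-3, 3, size=(150, 1))
y = np.sin(X).ravel()

model_based = LinearRegression().fit(X, y)                      # stores only coef_ and intercept_
instance_based = KNeighborsRegressor(n_neighbors=5).fit(X, y)   # stores the training samples themselves
print(model_based.coef_, model_based.intercept_)
print(instance_based.predict([[0.5]]))                          # averages the 5 nearest training targets
```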
4. Linear vs. Non-Linear Models
- Linear Models: Algorithms that assume a linear relationship between features and the target variable.
- Non-Linear Models: Algorithms that do not assume a linear relationship and can model complex relationships.
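A quick sketch of the practical consequence (scikit-learn assumed, data invented for illustration): on a non-linear target, a linear model underfits while a non-linear model can track the curve.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(3)
X = rng.uniform(-3, 3, size=(300, 1))
y = np.sin(X).ravel() + rng.normal(scale=0.1, size=300)

linear = LinearRegression().fit(X, y)
nonlinear = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, y)
print("linear R^2:", linear.score(X, y))        # limited by the linearity assumption
print("non-linear R^2:", nonlinear.score(X, y)) # can capture the sine shape
```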
5. Parametric vs. Non-Parametric Models
- Parametric Models: Algorithms that assume a specific form for the function that maps inputs to outputs and are characterized by a finite number of parameters.
- Non-Parametric Models: Algorithms that do not assume a specific form for the function and can adapt to the complexity of the data.
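A small sketch of what "finite number of parameters" means in practice, assuming scikit-learn: the parametric model always has the same number of parameters no matter how much data it sees, while the non-parametric model's stored "state" grows with the dataset.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.neighbors import KNeighborsRegressor

rng = np.random.default_rng(4)
for n in (50, 500):
    X = rng.uniform(-3, 3, size=(n, 1))
    y = np.sin(X).ravel()
    parametric = LinearRegression().fit(X, y)
    nonparametric = KNeighborsRegressor(n_neighbors=5).fit(X, y)
    # Always 2 numbers (slope + intercept) vs. all n stored training points.
    print(n, "parametric params:", parametric.coef_.size + 1,
          "| stored training points:", nonparametric.n_samples_fit_)
```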
6. Ensemble Methods
- Ensemble Methods: Techniques that combine multiple models to improve performance.
- Examples: Bagging (e.g., Random Forests), Boosting (e.g., Gradient Boosting Machines, AdaBoost), Stacking.
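A sketch of bagging vs. boosting, assuming scikit-learn (the dataset and hyperparameters are illustrative only): both ensembles are built from many trees, but bagging trains them independently on bootstrap samples while boosting trains them sequentially to correct earlier errors.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=500, n_features=10, random_state=0)

bagging = RandomForestClassifier(n_estimators=200, random_state=0)        # bagging: parallel trees on bootstrap samples
boosting = GradientBoostingClassifier(n_estimators=200, random_state=0)   # boosting: sequential trees correcting errors
print("bagging CV accuracy:", cross_val_score(bagging, X, y, cv=5).mean())
print("boosting CV accuracy:", cross_val_score(boosting, X, y, cv=5).mean())
```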
7. Optimization-Based vs. Distance-Based
- Optimization-Based: Algorithms that rely on optimization techniques to find the best model parameters.
- Distance-Based: Algorithms that make decisions based on the distance between data points.
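A rough sketch, assuming scikit-learn: logistic regression finds its coefficients by minimizing a loss with a numerical solver, whereas KNN does no optimization at all and simply votes among the nearest training points when asked to predict.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(5)
X = rng.normal(size=(200, 2))
y = (X[:, 0] ** 2 + X[:, 1] ** 2 > 1).astype(int)

optimization_based = LogisticRegression(max_iter=1000).fit(X, y)   # solver minimizes the logistic loss
distance_based = KNeighborsClassifier(n_neighbors=5).fit(X, y)     # no fitting of parameters; distances drive predictions
print(optimization_based.score(X, y), distance_based.score(X, y))
```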
8. Tree-Based vs. Non-Tree-Based
- Tree-Based: Algorithms that use tree structures to make decisions.
- Non-Tree-Based: Algorithms that do not use tree structures.
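Finally, a small sketch of the tree-based idea, assuming scikit-learn: a decision tree's learned splits can be printed as human-readable rules, which is exactly the structure non-tree-based models lack.

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text
from sklearn.linear_model import LogisticRegression

X, y = load_iris(return_X_y=True)

tree = DecisionTreeClassifier(max_depth=2, random_state=0).fit(X, y)   # tree-based: a hierarchy of if/else splits
non_tree = LogisticRegression(max_iter=1000).fit(X, y)                 # non-tree-based: just a weight matrix

print(export_text(tree))        # human-readable split rules
print(non_tree.coef_.shape)     # coefficients, no tree structure
```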