Machine Learning Models in 5 min without Math and Code.


Fundamental Approaches

Two fundamental approaches to modeling

Supervised and unsupervised modeling are two fundamental approaches in machine learning, each serving distinct purposes.

Supervised learning involves training a model on a labeled dataset, meaning the data is accompanied by corresponding correct output labels. The model learns to map input data to the correct output by finding patterns in the labeled data. The process involves feeding the model with input-output pairs, where the model makes predictions and is corrected based on the known labels. Over time, the model adjusts to minimize errors and improve its predictions.

Unsupervised learning involves training a model on a dataset that does not have labeled responses. The model tries to learn the underlying structure or distribution in the data without explicit instructions. The model identifies patterns, clusters, or associations in the data based solely on the input features. There is no feedback or correction based on output since the data isn't labeled.
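
To see the difference in practice, here is a minimal sketch assuming scikit-learn is available; the tiny numbers are made up purely for illustration. The supervised model is given input-output pairs, while the unsupervised model is given inputs only.

```python
from sklearn.cluster import KMeans
from sklearn.linear_model import LinearRegression

# Supervised: every input comes with a known correct output (a label).
X = [[1], [2], [3], [4]]  # inputs, e.g. years of experience (made up)
y = [30, 35, 42, 50]      # known outputs, e.g. salary in $1000s (made up)
model = LinearRegression().fit(X, y)  # learns the input-to-output mapping
print(model.predict([[5]]))           # predicts the output for a new input

# Unsupervised: only inputs are given; the model finds structure on its own.
points = [[1], [2], [10], [11]]
clusters = KMeans(n_clusters=2, n_init=10).fit(points)
print(clusters.labels_)  # group memberships discovered from the data alone
```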

Models of Supervised Learning:

Regression and classification models come under the category of supervised learning. Classification predicts the category of an object, such as spam detection in emails (spam or not spam). Regression, on the other hand, predicts continuous values, such as forecasting house prices based on features like location and size.

Regression Models


Regression of salary on experience

  1. Linear / Multiple Regression: A simple approach that assumes a linear relationship between the input features (independent variables) and the output (dependent variable). The model fits a straight line (in the case of one feature) or a hyperplane (in the case of multiple features) that minimizes the sum of squared differences between observed and predicted values. For instance, predicting house prices based on factors like area, number of rooms, and location (see the sketch after this list).
  2. Polynomial Regression: An extension of linear regression where the relationship between the independent and dependent variables is modeled as an nth-degree polynomial. This allows for capturing more complex, non-linear relationships. For instance, modeling growth rates in biological systems that do not follow a straight line.
  3. Decision Tree Regression: A non-linear regression model that splits the data into subsets based on feature values, forming a tree-like structure. Each node represents a decision based on a feature, and each leaf represents a predicted value. For instance, predicting sales based on various conditions, such as time of year and market conditions.
  4. Random Forest Regression: An ensemble method that uses multiple decision trees to improve predictive performance and reduce overfitting. Each tree is trained on a random subset of the data, and the final prediction is an average of all tree predictions. For instance, predicting stock prices by combining different market indicators.
  5. Bayesian Regression: A probabilistic model that incorporates prior beliefs about the parameters and updates them based on observed data. This approach provides a full distribution of possible outcomes, not just point estimates. For instance, estimating uncertainty in predictions, such as predicting the cost of a project with uncertain variables.
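
Here is the sketch referenced above, assuming scikit-learn; the house-price numbers are synthetic and exist only to show how two of these models are used. Polynomial regression can be built from the same linear model by first expanding the features (e.g. with scikit-learn's PolynomialFeatures).

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression

# Synthetic house-price data: price depends roughly linearly on area.
rng = np.random.default_rng(0)
X = rng.uniform(500, 3000, size=(100, 1))        # area in square feet
y = 50 + 0.1 * X[:, 0] + rng.normal(0, 20, 100)  # price in $1000s, plus noise

linear = LinearRegression().fit(X, y)  # fits a single straight line
forest = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, y)

new_house = [[1500]]
print(linear.predict(new_house))  # prediction from the fitted line
print(forest.predict(new_house))  # average of 100 decision trees' predictions
```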


Classification Models

Confusion matrix, used to check a classification model's accuracy

  1. Logistic Regression: A popular and foundational model used for classification tasks, especially when the outcome variable is binary (i.e., it has two possible classes). Despite its name, logistic regression is not a regression model in the traditional sense (which predicts continuous outcomes); rather, it is used for predicting categorical outcomes. For instance, predicting whether a patient has a certain disease (yes or no) based on various medical tests and symptoms (see the sketch after this list).
  2. Support Vector Machine (SVM): A powerful model that finds the hyperplane that best separates classes in the feature space. SVM can be used for both linear and non-linear classification by using kernel functions to map data into higher dimensions. For instance, image classification, such as distinguishing between cats and dogs.
  3. Naïve Bayes: A probabilistic model based on Bayes' theorem. It assumes that features are independent given the class label (the "naïve" assumption). Despite its simplicity, it works well for many real-world tasks, especially text classification. For instance, sentiment analysis in social media posts (positive or negative sentiment).
  4. Decision Tree: A tree-like model that splits the data into subsets based on feature values, with each node representing a decision and each leaf representing a class label. It can handle both categorical and continuous data. For instance, predicting customer churn in a subscription-based service.
  5. Random Forest: An ensemble method that combines multiple decision trees to improve classification accuracy and robustness. Each tree is trained on a random subset of the data, and the final classification is made by majority vote among the trees. For instance, classifying loan applicants as high or low risk.
  6. Neural Networks: Complex models composed of layers of interconnected neurons that can learn non-linear decision boundaries. Neural networks are especially powerful for tasks involving high-dimensional data. For instance, classifying images, such as in facial recognition systems.
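
Here is the sketch referenced above, assuming scikit-learn. It trains two of the classifiers on the library's bundled breast-cancer dataset (a binary disease/no-disease task, chosen here only as a convenient example) and evaluates each with the confusion matrix mentioned earlier.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import train_test_split

X, y = load_breast_cancer(return_X_y=True)  # binary labels: malignant/benign
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

logreg = LogisticRegression(max_iter=5000).fit(X_train, y_train)
forest = RandomForestClassifier(random_state=0).fit(X_train, y_train)

# Rows are the actual classes, columns the predicted classes; the
# off-diagonal entries count the model's mistakes.
print(confusion_matrix(y_test, logreg.predict(X_test)))
print(confusion_matrix(y_test, forest.predict(X_test)))
```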

Models of Unsupervised Learning:

Clustering models are used in unsupervised learning to group similar data points into clusters based on their features. The goal is to find natural groupings within the data without predefined labels.

Dimensionality reduction models are techniques used to reduce the number of input features (dimensions) in a dataset while retaining as much of the important information as possible. They are particularly useful when dealing with high-dimensional data, as they help to simplify the dataset, reduce computational costs, and often improve the performance of machine learning models. Common models of both kinds are described below.

Clustering Models

Four important clustering models

  1. K-Means Clustering: K-Means is one of the simplest and most widely used clustering algorithms. It partitions the data into k clusters, where each data point belongs to the cluster with the nearest mean. The algorithm iteratively updates the cluster centroids until convergence. For instance, customer segmentation in marketing, where customers are grouped based on purchasing behavior (see the sketch after this list).
  2. Hierarchical Clustering: This method creates a hierarchy of clusters by either merging smaller clusters into larger ones (agglomerative) or splitting larger clusters into smaller ones (divisive). The result is a dendrogram, a tree-like diagram that shows the arrangement of the clusters. For instance, grouping genes with similar expression patterns in bioinformatics.
  3. DBSCAN (Density-Based Spatial Clustering of Applications with Noise): DBSCAN groups together points that are closely packed together (high density) and marks points that lie alone in low-density regions as outliers. It does not require specifying the number of clusters in advance. For instance, identifying clusters of varying shapes and sizes in spatial data, such as geographical locations.
  4. Mean Shift Clustering: A non-parametric clustering technique that seeks to find the modes (peaks) in the data distribution by iteratively shifting data points towards the mode. It automatically determines the number of clusters based on the data distribution. For instance, image segmentation, where regions in an image are grouped based on color intensity.
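
Here is the sketch referenced above, assuming scikit-learn; the synthetic "blobs" stand in for something like customer or geographic data.

```python
from sklearn.cluster import DBSCAN, KMeans
from sklearn.datasets import make_blobs

X, _ = make_blobs(n_samples=200, centers=3, random_state=0)  # unlabeled points

# K-Means: the number of clusters k must be chosen up front; each point is
# assigned to the cluster with the nearest centroid.
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X)
print(kmeans.labels_[:10])

# DBSCAN: no k needed; dense regions become clusters, and isolated points
# are labeled -1, i.e. treated as noise/outliers.
dbscan = DBSCAN(eps=0.8, min_samples=5).fit(X)
print(dbscan.labels_[:10])
```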


Dimensionality Reduction Models

PCA and RFE are two important techniques in feature reduction

  1. Principal Component Analysis (PCA): PCA is a linear technique that transforms the data into a new coordinate system by finding the directions (principal components) that maximize the variance in the data. The first principal component captures the most variance, the second the next most, and so on. By selecting a subset of these components, the dimensionality of the data can be reduced. For instance, reducing the number of features in image compression or facial recognition.
  2. Recursive Feature Elimination (RFE): RFE is a feature selection method that recursively removes the least important features based on the model's performance. It works by fitting a model (like a linear regression or a support vector machine) and ranking the features according to their importance. The least important features are removed, and the model is refit until the desired number of features is reached. For instance, selecting the most predictive variables in a dataset with a large number of features for a classification or regression task (both techniques are sketched below).
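
A minimal sketch of both techniques, assuming scikit-learn and its bundled 64-feature handwritten-digits dataset (chosen only for convenience). Note that RFE ranks features using a supervised model, so unlike PCA it requires labels.

```python
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA
from sklearn.feature_selection import RFE
from sklearn.linear_model import LogisticRegression

X, y = load_digits(return_X_y=True)  # 1,797 images, 64 pixel features each

# PCA: project the data onto the 10 directions of greatest variance.
X_pca = PCA(n_components=10).fit_transform(X)
print(X.shape, "->", X_pca.shape)    # (1797, 64) -> (1797, 10)

# RFE: repeatedly fit a model and drop the least important features until
# only the requested number remain (note that the labels y are required).
rfe = RFE(LogisticRegression(max_iter=5000), n_features_to_select=10).fit(X, y)
print(rfe.support_.sum(), "features kept")  # count of retained features
```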


PS: The above is just a glimpse of the fundamental data science models. There are many other models, and new models and techniques are developed by data scientists every year.
