Introduction to Machine Learning Algorithms

Himanshu T.

Data Science (ML/AI), Strategy & Advisory, Business Insights

Published: June 8, 2021

This article gives a brief overview of the main machine learning algorithms: how they are grouped, and where to start when delving into the individual topics. As always, Wikipedia is a great source to get started on individual topics.

First, machine learning algorithms are classified into four major categories of learning process. They are as follows:

Supervised Learning: The functional form Y = f(X) defines a dependent variable Y as a function f of the inputs X. Supervised learning estimates this function from examples in which both X and Y are known (labeled data), so that Y can be predicted for new inputs. The learning is "supervised" because the known values of Y guide the fit: the machine (or program) searches for a generalized rule that maps the inputs to the outcome variable Y.

Unsupervised Learning: It is very difficult for humans to understand data without any context (in database parlance, without metadata). Here the observations come without labels: many data points were gathered for each observation, but nothing defines what they represent. The algorithm therefore has to find structure in its inputs on its own, discovering hidden patterns in the data or deriving features that were not observed before. This is unsupervised learning: there is no known output Y driving the prediction.

Semi-supervised Learning: This applies when some observations have labels and others do not, meaning the metadata is incomplete. ML engineers and data scientists have nevertheless found that using the labeled and unlabeled data together can improve the accuracy of learning.

Reinforcement Learning: This is the more complicated but exciting part of machine learning, where an agent (usually a program or software) learns from its environment through feedback. The program takes actions to maximize a reward defined by its designer. The field is also studied extensively in settings where many agents compete for similar rewards. Disciplines such as game theory, control theory, operations research, information theory, and simulation-based optimization come under its purview.
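The action, reward, and feedback loop described above can be sketched with an epsilon-greedy multi-armed bandit, one of the simplest reinforcement learning settings. This is a minimal illustration, not a full agent: the function name run_bandit, the fixed reward table, and the parameter values are all assumptions made for the example.

```python
import random

def run_bandit(reward_fn, n_arms, steps=200, epsilon=0.1, seed=0):
    """Epsilon-greedy agent: mostly exploit the best-known arm, sometimes explore."""
    rng = random.Random(seed)
    counts = [0] * n_arms        # how often each arm (action) was chosen
    estimates = [0.0] * n_arms   # running mean reward per arm
    for step in range(steps):
        if step < n_arms:
            arm = step                       # try every arm once first
        elif rng.random() < epsilon:
            arm = rng.randrange(n_arms)      # explore a random action
        else:
            arm = max(range(n_arms), key=lambda a: estimates[a])  # exploit
        reward = reward_fn(arm)              # feedback from the environment
        counts[arm] += 1
        estimates[arm] += (reward - estimates[arm]) / counts[arm]
    return estimates, counts

# Hypothetical environment: each arm pays a fixed reward.
rewards = [1.0, 5.0, 2.0]
estimates, counts = run_bandit(lambda arm: rewards[arm], n_arms=3)
best_arm = max(range(3), key=lambda a: estimates[a])
```

With fixed rewards the estimates become exact after one pull per arm, so the agent settles on the arm that pays the most; real environments return noisy rewards, which is why the exploration step matters.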

Below is a list of ML algorithms, grouped by family, that fall under the four major categories defined above (for more information, refer to Wikipedia.org).

Regression-based Algorithms

In statistics, linear regression is a linear approach to modeling the relationship between a scalar response and one or more explanatory variables. The case of one explanatory variable is called simple linear regression; for more than one, the process is called multiple linear regression.

Ordinary Least Squares Regression (OLSR)
Linear Regression
Logistic Regression
Stepwise Regression
Multivariate Adaptive Regression Splines (MARS)
Locally Estimated Scatterplot Smoothing (LOESS)

https://en.wikipedia.org/wiki/Linear_regression
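As a small illustration of simple linear regression, the sketch below fits one explanatory variable with the closed-form ordinary least squares solution. The function name and the toy data are assumptions made for the example.

```python
def fit_simple_linear_regression(xs, ys):
    """Return (intercept, slope) minimizing squared error for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance of x and y, and variance of x (both unnormalized)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

# Toy data generated from y = 1 + 2x with no noise
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]
intercept, slope = fit_simple_linear_regression(xs, ys)
```

Multiple linear regression generalizes the same idea to several explanatory variables, usually solved with matrix methods rather than these scalar formulas.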

Instance-based algorithms

Instance-based learning (memory-based learning) is a family of learning algorithms that, instead of performing explicit generalization, compare new problem instances with instances seen in training, which have been stored in memory.

k-Nearest Neighbors (kNN)
Learning Vector Quantization (LVQ)
Self-Organizing Map (SOM)
Locally Weighted Learning (LWL)
Support Vector Machines (SVM)

https://en.wikipedia.org/wiki/Instance-based_learning
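The idea of comparing a new instance with stored training instances can be sketched with a k-nearest-neighbor classifier. This is a minimal version assuming Euclidean distance and a simple majority vote; the function name and data are illustrative.

```python
import math
from collections import Counter

def knn_predict(train_points, train_labels, query, k=3):
    """Classify `query` by majority vote among its k closest stored instances."""
    # No model is fitted: the training data itself is the "memory".
    distances = sorted(
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    )
    nearest_labels = [label for _, label in distances[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]

train_points = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]
train_labels = ["a", "a", "a", "b", "b", "b"]

near_a = knn_predict(train_points, train_labels, (2, 2))
near_b = knn_predict(train_points, train_labels, (8.5, 8.5))
```

All the work happens at prediction time, which is why these methods are sometimes called lazy learners.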

Regularization Algorithms

Regularization is a technique, used most often in regression, that reduces the complexity of the model by adding a penalty that shrinks the coefficients of the independent features. This helps limit overfitting.

Ridge Regression
Least Absolute Shrinkage and Selection Operator (LASSO)
Elastic Net
Least-Angle Regression (LARS)

https://medium.com/analytics-vidhya/understanding-regularization-algorithms-450777fa0ed3
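To make the shrinking effect concrete, here is a minimal sketch of ridge regression (an L2 penalty) for the simplest possible case: one feature and no intercept, where the solution has the closed form w = sum(x*y) / (sum(x*x) + penalty). The function name and the data are assumptions made for the example.

```python
def ridge_slope(xs, ys, penalty):
    """One-feature ridge regression without intercept: larger penalty, smaller slope."""
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_xx = sum(x * x for x in xs)
    return sum_xy / (sum_xx + penalty)

xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]  # generated from y = 2x

unregularized = ridge_slope(xs, ys, penalty=0.0)   # plain least squares
regularized = ridge_slope(xs, ys, penalty=14.0)    # coefficient shrunk toward zero
```

LASSO uses an absolute-value (L1) penalty instead, which can push some coefficients exactly to zero; that case has no equally short closed form and is not shown here.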

Decision Tree Algorithms

Decision Tree algorithms belong to the family of supervised learning algorithms. The goal of using a decision tree is to create a training model that can be used to predict the class or value of the target variables by learning simple decision rules inferred from prior data.

Classification and Regression Tree (CART)
Iterative Dichotomiser 3 (ID3)
C4.5 and C5.0 (different versions of a powerful approach)
Chi-squared Automatic Interaction Detection (CHAID)
Decision Stump
M5
Conditional Decision Trees

https://en.wikipedia.org/wiki/Decision_tree
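A decision stump, the one-split tree from the list above, is enough to show how a simple decision rule is learned from prior data. The sketch below assumes a single numeric feature and picks the threshold with the fewest training errors; the function names and data are illustrative, and real tree learners such as CART use impurity measures and recurse on each side of the split.

```python
def fit_stump(xs, labels):
    """Return (threshold, left_label, right_label) with the fewest training errors."""
    classes = sorted(set(labels))
    distinct = sorted(set(xs))
    # Candidate thresholds: midpoints between consecutive distinct feature values
    candidates = [(a + b) / 2 for a, b in zip(distinct, distinct[1:])]
    best = None
    for threshold in candidates:
        for left in classes:
            for right in classes:
                errors = sum(
                    (left if x <= threshold else right) != y
                    for x, y in zip(xs, labels)
                )
                if best is None or errors < best[0]:
                    best = (errors, threshold, left, right)
    return best[1:]

def predict_stump(stump, x):
    threshold, left, right = stump
    return left if x <= threshold else right

xs = [1, 2, 3, 10, 11, 12]
labels = ["low", "low", "low", "high", "high", "high"]
stump = fit_stump(xs, labels)
```

The learned rule reads as "if x <= 6.5 then low, else high", which is the kind of rule a full tree stacks several levels deep.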

Bayesian Algorithms

Naive Bayes classifiers are a collection of classification algorithms based on Bayes' theorem. They all share a common principle: every feature is assumed to be independent of every other feature, given the class. Bayes' formula provides the relationship between P(A|B) and P(B|A): P(A|B) = P(B|A) P(A) / P(B).

Naive Bayes
Gaussian Naive Bayes
Multinomial Naive Bayes
Averaged One-Dependence Estimators (AODE)
Bayesian Belief Network (BBN)
Bayesian Network (BN)

https://towardsdatascience.com/ml-algorithms-one-sd-%CF%83-bayesian-algorithms-b59785da792a
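The relationship between P(A|B) and P(B|A) is easiest to see with numbers. The sketch below applies Bayes' theorem to a hypothetical spam example; the probabilities and the function name are made up for illustration.

```python
def posterior(prior, likelihood, likelihood_given_not):
    """P(A|B) from P(A), P(B|A) and P(B|not A), via Bayes' theorem."""
    # Total probability of the evidence B
    evidence = likelihood * prior + likelihood_given_not * (1 - prior)
    return likelihood * prior / evidence

# Hypothetical figures: 20% of mail is spam; a given word appears in
# 50% of spam messages and in 5% of non-spam messages.
p_spam_given_word = posterior(prior=0.2, likelihood=0.5, likelihood_given_not=0.05)
```

Seeing the word raises the probability of spam from 0.2 to roughly 0.71. A naive Bayes classifier repeats this update for many features, multiplying their likelihoods under the independence assumption.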

Clustering Algorithms

“Cluster analysis or clustering is the task of grouping a set of objects in such a way that objects in the same group (called a cluster) are more similar (in some sense) to each other than to those in other groups (clusters)”

K-Means
K-medians
Expectation-Maximization (EM)
Hierarchical Clustering

https://en.wikipedia.org/wiki/Cluster_analysis
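K-Means, the first algorithm in the list above, alternates between assigning each point to its nearest centroid and moving each centroid to the mean of its points. The sketch below is a minimal one-dimensional version with fixed starting centroids so that the result is reproducible; the function name and data are assumptions made for the example.

```python
def kmeans_1d(values, centroids, iterations=10):
    """Lloyd's algorithm on 1-D data: assign to nearest centroid, then re-average."""
    centroids = list(centroids)
    clusters = []
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for v in values:
            nearest = min(range(len(centroids)), key=lambda i: abs(v - centroids[i]))
            clusters[nearest].append(v)
        # Keep the old centroid if a cluster ends up empty
        centroids = [
            sum(c) / len(c) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
    return centroids, clusters

values = [1, 2, 3, 10, 11, 12]
centroids, clusters = kmeans_1d(values, centroids=[0, 5])
```

In practice the starting centroids are chosen randomly or with a scheme such as k-means++, and the result can depend on that choice.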

Association Rule Learning Algorithms

“Association rule learning is a rule-based machine learning method for discovering interesting relations between variables in large databases. It is intended to identify strong rules discovered in databases using some measures of interestingness.”

Apriori Algorithm
Eclat Algorithm

https://en.wikipedia.org/wiki/Association_rule_learning
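Two common measures of interestingness are support (how often an itemset appears) and confidence (how often the rule holds when its left-hand side appears). The sketch below computes both for a hypothetical set of shopping baskets; it is not Apriori or Eclat themselves, which add an efficient search over candidate itemsets.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= set(t) for t in transactions) / len(transactions)

def confidence(transactions, antecedent, consequent):
    """Estimated P(consequent | antecedent) for the rule antecedent -> consequent."""
    both = set(antecedent) | set(consequent)
    return support(transactions, both) / support(transactions, antecedent)

# Hypothetical baskets
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]

bread_milk_support = support(transactions, {"bread", "milk"})
bread_to_milk_confidence = confidence(transactions, {"bread"}, {"milk"})
```

Here bread and milk appear together in half of the baskets, and two out of three baskets with bread also contain milk.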

Artificial Neural Network Algorithms

“Artificial neural networks (ANNs), usually simply called neural networks (NNs), are computing systems vaguely inspired by the biological neural networks that constitute animal brains.”

Perceptron
Multilayer Perceptrons (MLP)
Back-Propagation
Stochastic Gradient Descent
Hopfield Network
Radial Basis Function Network (RBFN)

https://en.wikipedia.org/wiki/Artificial_neural_network
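The perceptron, the first entry in the list above, is a single artificial neuron trained by nudging its weights whenever it misclassifies an example. The sketch below trains one on the logical AND function; the function names, learning rate, and epoch count are assumptions made for the example.

```python
def train_perceptron(samples, labels, epochs=20, lr=1.0):
    """Classic perceptron rule: adjust weights only on misclassified examples."""
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(epochs):
        for x, target in zip(samples, labels):
            activation = sum(w * xi for w, xi in zip(weights, x)) + bias
            prediction = 1 if activation > 0 else 0
            error = target - prediction  # -1, 0, or +1
            weights = [w + lr * error * xi for w, xi in zip(weights, x)]
            bias += lr * error
    return weights, bias

def predict(weights, bias, x):
    return 1 if sum(w * xi for w, xi in zip(weights, x)) + bias > 0 else 0

# Logical AND is linearly separable, so the perceptron can learn it
samples = [(0, 0), (0, 1), (1, 0), (1, 1)]
labels = [0, 0, 0, 1]
weights, bias = train_perceptron(samples, labels)
predictions = [predict(weights, bias, x) for x in samples]
```

A single perceptron can only separate classes with a straight line (or hyperplane); multilayer perceptrons trained with back-propagation remove that limitation.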

Deep Learning Algorithms

“Deep learning (also known as deep structured learning) is part of a broader family of machine learning methods based on artificial neural networks with representation learning. Learning can be supervised, semi-supervised or unsupervised”

Convolutional Neural Networks (CNNs)
Recurrent Neural Networks (RNNs)
Long Short-Term Memory Networks (LSTMs)
Stacked Auto-Encoders
Deep Boltzmann Machine (DBM)
Deep Belief Networks (DBN)

https://en.wikipedia.org/wiki/Deep_learning

Dimensionality Reduction Algorithms

“Dimensionality reduction, or dimension reduction, is the transformation of data from a high-dimensional space into a low-dimensional space so that the low-dimensional representation retains some meaningful properties of the original data, ideally close to its intrinsic dimension”

Principal Component Analysis (PCA)
Principal Component Regression (PCR)
Partial Least Squares Regression (PLSR)
Sammon Mapping
Multidimensional Scaling (MDS)
Projection Pursuit
Linear Discriminant Analysis (LDA)
Mixture Discriminant Analysis (MDA)
Quadratic Discriminant Analysis (QDA)
Flexible Discriminant Analysis (FDA)

https://en.wikipedia.org/wiki/Dimensionality_reduction
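As a small illustration of PCA, the sketch below reduces two-dimensional points to one dimension by projecting them onto the direction of greatest variance. It relies on the closed-form angle of the principal axis of a 2x2 covariance matrix, so it only works for 2-D input; the function names and data are assumptions made for the example, and general-purpose PCA uses an eigendecomposition or SVD instead.

```python
import math

def first_principal_axis_2d(points):
    """Return (mean, unit_axis) for the direction of greatest variance in 2-D data."""
    n = len(points)
    mean_x = sum(p[0] for p in points) / n
    mean_y = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mean_x) ** 2 for p in points) / n
    syy = sum((p[1] - mean_y) ** 2 for p in points) / n
    sxy = sum((p[0] - mean_x) * (p[1] - mean_y) for p in points) / n
    # Angle of the major axis of the covariance matrix [[sxx, sxy], [sxy, syy]]
    angle = 0.5 * math.atan2(2 * sxy, sxx - syy)
    return (mean_x, mean_y), (math.cos(angle), math.sin(angle))

def project(points, mean, axis):
    """1-D coordinates of the centered points along `axis`."""
    return [(p[0] - mean[0]) * axis[0] + (p[1] - mean[1]) * axis[1] for p in points]

points = [(0, 0), (1, 1), (2, 2), (3, 3)]  # all on the line y = x
mean, axis = first_principal_axis_2d(points)
reduced = project(points, mean, axis)
```

Because these points lie exactly on a line, the single retained coordinate loses no information; with noisy data the discarded direction carries the residual variance.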

Ensemble Algorithms

“In statistics and machine learning, ensemble methods use multiple learning algorithms to obtain better predictive performance than could be obtained from any of the constituent learning algorithms alone”

Boosting
Bootstrapped Aggregation (Bagging)
AdaBoost
Weighted Average (Blending)
Stacked Generalization (Stacking)
Gradient Boosting Machines (GBM)
Gradient Boosted Regression Trees (GBRT)
Random Forest

https://en.wikipedia.org/wiki/Ensemble_learning
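The simplest way to combine several learners is a majority vote. The sketch below uses three hypothetical threshold classifiers that disagree near the boundary; it shows plain voting only, not boosting or bagging, which also control how the constituent models are trained.

```python
from collections import Counter

def majority_vote(classifiers, x):
    """Return the label predicted by most of the constituent classifiers."""
    votes = [clf(x) for clf in classifiers]
    return Counter(votes).most_common(1)[0][0]

# Three hypothetical weak classifiers with slightly different thresholds
classifiers = [
    lambda x: "high" if x > 4 else "low",
    lambda x: "high" if x > 5 else "low",
    lambda x: "high" if x > 7 else "low",
]

vote_for_6 = majority_vote(classifiers, 6)      # two of three say "high"
vote_for_4_5 = majority_vote(classifiers, 4.5)  # two of three say "low"
```

Random Forest follows the same pattern, with decision trees trained on bootstrapped samples as the voters.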

Pasunuri Prathiba

Senior Data Scientific @ BOSCH | AI | M.Tech at IIT Bombay

3 years ago

Very useful, thank you.
