TOP 10 MACHINE LEARNING ALGORITHMS
Machine learning algorithms are used across many industries: recommendation systems on OTT platforms, customer and sales analytics at retailers such as Walmart, recommendation engines at e-commerce companies such as Amazon, Flipkart and Myntra, house-price prediction in real estate, humanoid robots, and more. But what are the ten most frequently used algorithms?
So, let's talk about these models.
REGRESSION ALGORITHMS (SUPERVISED MACHINE LEARNING)
1. Linear Regression
Linear regression is used when there is one dependent variable and one independent variable, denoted y and x respectively. The algorithm examines two factors:
· How closely are x and y related?
It gives a number between -1 and 1, which indicates the correlation between the two variables.
Where,
0 indicates no correlation
1 indicates a perfect positive correlation
-1 indicates a perfect negative correlation
· Prediction
Once the relationship between x and y is learned, the model can predict y for new values of x. This is done by fitting a linear relationship, represented as
y = mx + c
x = independent variable
y = dependent variable
m = slope (weight)
c = intercept
If the data follows a curve instead of a straight line, we use another technique called polynomial regression. Polynomial regression is applied when the relationship between x and y is not linear.
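To make this concrete, here is a minimal sketch using scikit-learn (an assumed library choice, with made-up data points) that fits a straight line, recovers m and c, and predicts a new value; the last two lines show the polynomial variant via a feature expansion.

# Minimal linear regression sketch (assumed: scikit-learn, invented data).
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import PolynomialFeatures

x = np.array([[1], [2], [3], [4], [5]])      # independent variable x
y = np.array([3.1, 4.9, 7.2, 8.8, 11.1])     # dependent variable y

model = LinearRegression().fit(x, y)
print("m (slope):", model.coef_[0])          # learned weight m
print("c (intercept):", model.intercept_)    # learned intercept c
print("prediction for x = 6:", model.predict([[6]])[0])

# If the data is curved, expand x into polynomial terms and fit the same model.
x_poly = PolynomialFeatures(degree=2).fit_transform(x)
poly_model = LinearRegression().fit(x_poly, y)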
CLASSIFICATION ALGORITHMS (SUPERVISED MACHINE LEARNING)
2. Logistic Regression
Logistic regression is used to predict binary outcomes from a given set of independent variables, such as 0 or 1, win or lose, day or night, pass or fail, and healthy or sick. That is why the output is always a discrete value, y belonging to {0, 1}.
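As a hedged illustration (scikit-learn and the pass/fail numbers below are assumptions, not from the article), a logistic regression returning discrete 0/1 predictions might look like this:

# Binary classification sketch with logistic regression (toy data).
import numpy as np
from sklearn.linear_model import LogisticRegression

hours_studied = np.array([[1], [2], [3], [4], [5], [6]])   # independent variable
passed = np.array([0, 0, 0, 1, 1, 1])                      # outcomes in {0, 1}

clf = LogisticRegression().fit(hours_studied, passed)
print(clf.predict([[2.5], [4.5]]))        # discrete class predictions
print(clf.predict_proba([[4.5]]))         # underlying class probabilities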
3. Naïve Bayes
Naïve Bayes is a supervised classification technique based on Bayes' theorem, with the "naïve" assumption that the features are independent of each other given the class.
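A small sketch, assuming scikit-learn's GaussianNB (one of several Naïve Bayes variants) and invented feature values:

# Gaussian Naive Bayes sketch on two made-up classes.
import numpy as np
from sklearn.naive_bayes import GaussianNB

X = np.array([[1.0, 2.1], [1.2, 1.9], [3.8, 4.2], [4.1, 3.9]])
y = np.array([0, 0, 1, 1])

nb = GaussianNB().fit(X, y)
print(nb.predict([[1.1, 2.0], [4.0, 4.0]]))   # predicted classes for new points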
4. Decision Tree Algorithm
A decision tree is a flowchart-like tree structure consisting of internal nodes, branches, and leaf nodes. Each internal node tests a feature (or attribute), each branch represents a decision rule, and each leaf node represents an outcome; the topmost node is known as the root node.
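To see the node/branch/leaf structure in practice, here is a minimal sketch assuming scikit-learn and its built-in Iris dataset; the depth limit is an arbitrary choice for readability:

# Decision tree sketch: internal nodes test features, leaves hold outcomes.
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

iris = load_iris()
tree = DecisionTreeClassifier(max_depth=3, random_state=0).fit(iris.data, iris.target)

# Print the fitted tree as text, starting from the root node.
print(export_text(tree, feature_names=list(iris.feature_names)))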
5. Support Vector Machine Algorithm
The main goal of a support vector machine algorithm is to find a hyperplane in an N-dimensional space (N is the number of features in the dataset) that distinctly separates the data points. There are many possible hyperplanes that can divide the data, but the objective is to find the hyperplane with the maximum margin, i.e. the largest distance between the hyperplane and the nearest data points of either class.
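A minimal sketch, assuming scikit-learn's SVC with a linear kernel and two invented, clearly separable groups of points; the support vectors printed are the points that define the maximum margin:

# Linear SVM sketch on toy 2-D data.
import numpy as np
from sklearn.svm import SVC

X = np.array([[1, 1], [2, 1], [1, 2], [6, 5], [7, 6], [6, 7]])
y = np.array([0, 0, 0, 1, 1, 1])

svm = SVC(kernel="linear").fit(X, y)
print("support vectors:\n", svm.support_vectors_)   # points closest to the hyperplane
print(svm.predict([[3, 3], [6, 6]]))                # classify new points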
6. K-Nearest Neighbors (KNN) Algorithm
KNN is a simple supervised machine learning algorithm that can be used for both classification and regression problems. It captures the idea of similarity (distance, closeness or proximity) using mathematical distance measures. It performs better with fewer dimensions: as the number of features grows, the algorithm needs many more labelled examples to stay accurate (the curse of dimensionality).
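As a brief sketch (scikit-learn, k = 3 and the sample points below are arbitrary choices for illustration):

# KNN sketch: each query point is labelled by a majority vote of its 3 nearest neighbours.
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

X = np.array([[1, 1], [1, 2], [2, 2], [8, 8], [8, 9], [9, 9]])
y = np.array([0, 0, 0, 1, 1, 1])

knn = KNeighborsClassifier(n_neighbors=3).fit(X, y)
print(knn.predict([[2, 1], [8.5, 8.5]]))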
DIMENSIONALITY REDUCTION ALGORITHMS
7. Linear Discriminant Analysis
LDA (Linear Discriminant Analysis) is a dimensionality-reduction technique used as a pre-processing step in machine learning and in classification tasks.
The main function of LDA is to project the features from a higher-dimensional space onto a lower-dimensional space while keeping the classes well separated. This process can be split into three parts (a small usage sketch follows the list):
- Calculation of between-class variance
- Calculation of within-class variance
- Fisher’s criterion
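Here is the small usage sketch, assuming scikit-learn's LinearDiscriminantAnalysis and its built-in Iris dataset, projecting four features onto two discriminant axes:

# LDA as a supervised dimensionality-reduction / pre-processing step.
from sklearn.datasets import load_iris
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

iris = load_iris()
lda = LinearDiscriminantAnalysis(n_components=2)
X_lda = lda.fit_transform(iris.data, iris.target)   # uses class labels, unlike PCA
print(X_lda.shape)                                   # (150, 2)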
8. Principal Component Analysis (PCA)
Principal Component Analysis, or PCA, is a dimensionality-reduction method often used on large data sets: it transforms a large set of variables into a smaller one that still contains most of the information in the original set.
Reducing the number of variables in a data set naturally comes at the expense of accuracy, but the trick in dimensionality reduction is to trade a little accuracy for simplicity: smaller data sets are easier to explore and visualize, and machine learning algorithms can analyze them much faster without extraneous variables to process.
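A minimal PCA sketch, again assuming scikit-learn and the Iris dataset; keeping two components is an arbitrary choice that illustrates the accuracy-for-simplicity trade-off described above:

# PCA sketch: 4 original features compressed into 2 principal components.
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA

X = load_iris().data                     # 150 samples x 4 features
pca = PCA(n_components=2)
X_reduced = pca.fit_transform(X)

print(X_reduced.shape)                   # (150, 2)
print(pca.explained_variance_ratio_)     # share of the information kept per component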
CLUSTERING ALGORITHMS (UNSUPERVISED MACHINE LEARNING)
9. K-Means Clustering
K-Means Clustering is one of the simplest and most frequently used unsupervised learning algorithms. K is the targeted number of clusters in the dataset, and a centroid is the real or imaginary location representing the center of a cluster, where a cluster is a group of data points sharing certain similarities.
It starts with a group of randomly selected centroids, which serve as the initial points for the clusters, and then performs iterative calculations to optimize the locations of the centroids.
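A minimal sketch, assuming scikit-learn's KMeans and invented 2-D points, where K = 2 is the targeted number of clusters:

# K-Means sketch: random initial centroids are refined iteratively.
import numpy as np
from sklearn.cluster import KMeans

X = np.array([[1, 1], [1.5, 2], [2, 1.5], [8, 8], [8.5, 9], [9, 8.5]])

kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
print("centroids:\n", kmeans.cluster_centers_)   # optimized cluster centers
print("labels:", kmeans.labels_)                 # cluster assigned to each point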
10. Hierarchical Clustering
Hierarchical clustering analysis is an algorithm that groups similar objects into groups called clusters. We start by assigning each object to its own cluster, then compute the distance between clusters and join the two most similar ones, repeating until the desired hierarchy is built.
There are two types of methods to perform Hierarchical Clustering:
- Agglomerative: It is a bottom-up method that starts with each point in its own cluster and repeatedly merges the closest clusters.
- Divisive: It is a top-down method that starts with all points in one cluster and repeatedly splits it.
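As an illustrative sketch of the agglomerative (bottom-up) method, assuming scikit-learn's AgglomerativeClustering and made-up points:

# Agglomerative clustering: start with each point alone, repeatedly merge the closest clusters.
import numpy as np
from sklearn.cluster import AgglomerativeClustering

X = np.array([[1, 1], [1.2, 1.1], [5, 5], [5.2, 5.1], [9, 9], [9.1, 9.2]])

agg = AgglomerativeClustering(n_clusters=3, linkage="average").fit(X)
print(agg.labels_)   # final cluster label for each point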