What is Machine Learning?

Saurav Mukherjee

Security Analyst @EY GDS | Former SDE Intern @SMIIT | Top 50 Samsung Solve for Tomorrow | Building @CodeIN Community | Google DSC Lead '22

Published: August 29, 2021

Machine learning is a branch of artificial intelligence in which we train machines to perform tasks and predict outcomes from data. YouTube suggesting videos in your feed is one everyday example of how machine learning puts data to work. The field is concerned with developing computer programs that can access data and use it to make predictions and detections automatically, so that computer systems keep learning and improving from experience.

How does Machine Learning work?

Machine learning is one of the most exciting subsets of AI. The process starts with feeding training data into the chosen algorithm. The training data may be known (labeled) or unknown (unlabeled), and it is used to develop the final model. The kind of training data does affect which algorithm is appropriate, a point covered in more detail below.

To test whether the trained model works correctly, new input data is fed into it, and the predictions and results are then checked.

If the predictions are not as expected, the algorithm is retrained repeatedly until the required output is reached. This lets the model keep learning from data and produce a better answer, with accuracy that tends to improve gradually over time.
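The train, test, and retrain loop described above can be sketched with scikit-learn. This is a minimal illustration, not a prescribed workflow: the iris dataset, the 25% hold-out split, and the choice of logistic regression are all assumptions made for the example.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Load a small labeled dataset that ships with scikit-learn.
X, y = load_iris(return_X_y=True)

# Hold out 25% of the rows as "new" data the model never sees in training.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Step 1: feed the training data into the chosen algorithm.
model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)

# Step 2: feed the held-out data in and check the predictions.
accuracy = accuracy_score(y_test, model.predict(X_test))
print(f"held-out accuracy: {accuracy:.2f}")
```

If the held-out accuracy were too low, the usual next step would be to change the data, the features, or the algorithm's settings and train again.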

What are the Different Types of Machine Learning?

1. Supervised Learning: In supervised learning, the training data is known, or labeled. Because the correct answers are known, the learning is supervised, i.e., directed toward the right output. The input data goes through the machine learning algorithm and is used to train the model. Once the model has been trained on the known data, you can feed unknown data into it and get a new prediction.

Here is a list of the top algorithms we use for supervised learning:

Polynomial regression
Random forest
Linear regression
Logistic regression
Decision trees
K-nearest neighbors
Naive Bayes

2. Unsupervised Learning: In unsupervised learning, the training data is unknown and unlabeled, meaning that no one has tagged the data with the correct answers beforehand. Without known labels, the algorithm cannot be guided toward a correct output, which is where the term unsupervised comes from. This data is fed to the machine learning algorithm and is used to train the model. The trained model searches for patterns in the data and groups it accordingly.

For example, suppose the unknown data consists of apples and pears, which look similar to each other. The trained model tries to sort them so that similar items end up in the same group.

The top algorithms we use for unsupervised learning are:

Partial least squares
Fuzzy c-means
Singular value decomposition
K-means clustering
Hierarchical clustering
Principal component analysis
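The apples-and-pears grouping described above can be sketched with K-means clustering from scikit-learn. The measurements below are invented for illustration, and the two features (weight and height-to-width ratio) are an assumption about what might distinguish the fruits.

```python
import numpy as np
from sklearn.cluster import KMeans

# Hypothetical measurements: [weight in grams, height-to-width ratio].
# No labels are given to the algorithm.
fruit = np.array([
    [150, 0.95], [160, 1.00], [155, 0.98],   # apple-like items
    [180, 1.40], [190, 1.45], [185, 1.50],   # pear-like items
])

# Ask for two groups; the algorithm decides which item goes where.
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0)
labels = kmeans.fit_predict(fruit)

print("cluster assigned to each item:", labels)
```

The cluster numbers themselves are arbitrary; what matters is that similar items receive the same number.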

3. Reinforcement Learning:

Here, the algorithm learns through a process of trial and error which actions result in higher rewards. Three major components make up reinforcement learning: the agent, the environment, and the actions. The agent is the learner or decision-maker, the environment includes everything that the agent interacts with, and the actions are what the agent does.

In reinforcement learning, the machine learns from its own mistakes and gradually makes fewer errors.
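To make the agent, environment, and action roles concrete, here is a small sketch of an epsilon-greedy agent choosing among three actions. The hidden reward values, the exploration rate, and the number of steps are all made up for the example; real reinforcement learning problems are usually far richer than this.

```python
import random

random.seed(0)

# Environment: each action pays a noisy reward around a hidden mean.
# These means are invented for the example.
HIDDEN_MEANS = [1.0, 2.0, 3.0]


def environment(action):
    """Return a noisy reward for taking `action`."""
    return random.gauss(HIDDEN_MEANS[action], 1.0)


# Agent: keeps a running estimate of each action's average reward.
estimates = [0.0] * len(HIDDEN_MEANS)
counts = [0] * len(HIDDEN_MEANS)
EPSILON = 0.1  # fraction of steps spent exploring at random

for _ in range(3000):
    if random.random() < EPSILON:
        # Explore: try a random action.
        action = random.randrange(len(HIDDEN_MEANS))
    else:
        # Exploit: take the action that currently looks best.
        action = max(range(len(estimates)), key=lambda a: estimates[a])
    reward = environment(action)
    counts[action] += 1
    # Incremental mean: nudge the estimate toward the observed reward.
    estimates[action] += (reward - estimates[action]) / counts[action]

best_action = max(range(len(estimates)), key=lambda a: estimates[a])
print("estimated rewards:", [round(e, 2) for e in estimates])
print("best action found:", best_action)
```

The agent is never told the hidden means; it discovers the most rewarding action only by trying actions and observing the results.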

Prerequisites for Machine Learning (ML)

1. Basic knowledge of a programming language such as Python or R.

2. Basic knowledge of linear algebra. In a linear regression model, for example, a line is fitted to the data points, and that line is used to compute new values.

3. Knowledge of Python libraries such as NumPy and pandas, and familiarity with notebooks such as Jupyter Notebook.

4. Familiarity with tools and dataset sources such as scikit-learn and Kaggle.

5. Intermediate knowledge of statistics and probability, and knowledge of how to clean and structure raw data into the desired format, which reduces the time taken for decision-making.

Resources to Get Started with Machine Learning:

* Learn Basics of Machine Learning in 10 Hours: https://www.youtube.com/watch?v=GwIo3gDZCVQ&t=5353s

* Roadmap to Master ML from Zero to Pro: https://drive.google.com/file/d/1kNsPhD4JMhS2F7kH9DvBb8wvt9XCJU8N/view?usp=sharing

* Machine Learning Full Course - Andrew Ng, Stanford University: https://www.youtube.com/playlist?list=PLLssT5z_DsK-h9vYZkQkYNWcItqhlRJLN

* Datasets:

https://www.kaggle.com/

https://www.tensorflow.org/datasets

https://data.world/

https://datasetsearch.research.google.com/

* Mathematics for Machine Learning: https://www.coursera.org/specializations/mathematics-machine-learning

* Python for Beginners: https://www.youtube.com/watch?v=rfscVS0vtbw

Projects:

ML Prediction Model: https://github.com/SauravMukherjee44/Machine-Learning-Task-1

Email Classification ML Model: https://github.com/SauravMukherjee44/Email-Classification-Model

Linear Regression

from sklearn.linear_model import LinearRegression
model = LinearRegression()  # ordinary least-squares regressor

Logistic Regression

from sklearn.linear_model import LogisticRegression
model = LogisticRegression()

Naive Bayes

from sklearn.naive_bayes import GaussianNB, MultinomialNB
model = GaussianNB()      # for continuous features
model = MultinomialNB()   # for count features such as word counts (pick one of the two)

Decision Tree

from sklearn.tree import DecisionTreeClassifier
model = DecisionTreeClassifier()

Support Vector Machine

from sklearn.svm import SVC
model = SVC(gamma=10)

K-Fold Cross-Validation

from sklearn.model_selection import cross_val_score

Random Forest

from sklearn.ensemble import RandomForestClassifier
model = RandomForestClassifier(n_estimators=10)

Algorithms in Detail:

Linear Regression

It is a linear approach that fits a straight line through the data to model the relationship between the inputs and a numeric target. It is best suited to regression problems, where the goal is to predict a continuous value rather than a class.
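As a worked illustration of fitting a line and using it to compute new values, here is a from-scratch least-squares sketch for a single input feature. The data points are invented for the example, and the helper name `predict` is hypothetical.

```python
# Made-up data that roughly follows y = 2x + 1.
xs = [1, 2, 3, 4, 5]
ys = [3.1, 4.9, 7.2, 8.8, 11.0]

mean_x = sum(xs) / len(xs)
mean_y = sum(ys) / len(ys)

# Least-squares slope: covariance of x and y divided by variance of x.
slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
    (x - mean_x) ** 2 for x in xs
)
# The fitted line always passes through the point (mean_x, mean_y).
intercept = mean_y - slope * mean_x


def predict(x):
    """Use the fitted line to compute a value for a new input."""
    return intercept + slope * x


print(f"fitted line: y = {slope:.2f} * x + {intercept:.2f}")
print(f"prediction for x = 6: {predict(6):.2f}")
```

With more than one input feature the same idea applies, but the arithmetic is normally left to a library.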

Logistic Regression

It is a statistical model that applies the logistic function to model a binary dependent variable. It is similar to linear regression, but it is used to predict categorical dependent variables.
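The logistic function at the core of this model can be shown in a few lines. In this sketch the weight and bias are hypothetical values chosen by hand rather than learned from data, and the input is imagined as hours studied for a pass/fail outcome.

```python
import math


def sigmoid(z):
    """Logistic function: squashes any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical, hand-picked parameters; a real model would learn these.
WEIGHT = 1.5
BIAS = -6.0  # puts the decision boundary at 4 hours


def predict_proba(hours):
    """Probability of the positive class for a given input."""
    return sigmoid(WEIGHT * hours + BIAS)


def predict(hours, threshold=0.5):
    """Turn the probability into a 0/1 class label."""
    return int(predict_proba(hours) >= threshold)


for hours in (2, 4, 6):
    print(hours, round(predict_proba(hours), 3), predict(hours))
```

The linear part `WEIGHT * hours + BIAS` is the same kind of expression linear regression uses; the logistic function is what turns it into a probability for a category.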

Naive Bayes

It is a supervised learning algorithm, based on Bayes' theorem, that is simple and makes fast predictions. It is used for classification problems and, besides being my favorite algorithm, it comes in two types of model covered here: Gaussian NB and Multinomial NB.
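A tiny text-classification sketch shows Multinomial NB in use. The four training emails and their labels are invented for illustration, and turning text into word counts with `CountVectorizer` is one common choice assumed here, not something the article specifies.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

# Made-up training emails with known labels.
emails = [
    "win free prize now",
    "free money win",
    "meeting at noon",
    "project report attached",
]
labels = ["spam", "spam", "ham", "ham"]

# Turn each email into a vector of word counts.
vectorizer = CountVectorizer()
X = vectorizer.fit_transform(emails)

model = MultinomialNB()
model.fit(X, labels)

# Classify emails the model has not seen before.
new_emails = ["free prize money", "report for the meeting"]
predictions = model.predict(vectorizer.transform(new_emails))
print(list(zip(new_emails, predictions)))
```

Gaussian NB would be the more natural choice if the features were continuous measurements instead of word counts.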

Decision Tree

In a decision tree, the data is split repeatedly according to certain parameters. The tree consists of decision nodes and leaves: decision nodes test a feature, and leaves hold the final outcome. It is used for both classification and regression problems.
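To see decision nodes and leaves directly, the sketch below trains a tree on a deliberately tiny dataset and prints its rules. The data (hours studied against fail or pass) and the feature name are made up for the example.

```python
from sklearn.tree import DecisionTreeClassifier, export_text

# Made-up data: hours studied -> failed (0) or passed (1).
X = [[1], [2], [3], [6], [7], [8]]
y = [0, 0, 0, 1, 1, 1]

tree = DecisionTreeClassifier(random_state=0)
tree.fit(X, y)

# Print the learned rules as text: each "|---" line is a node or a leaf.
print(export_text(tree, feature_names=["hours_studied"]))

predictions = tree.predict([[2], [7]])
print("predictions for 2 and 7 hours:", predictions)
```

On data this simple a single split is enough; messier data produces deeper trees with many decision nodes.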

Support Vector Machine

A Support Vector Machine (SVM) finds a hyperplane that separates the data points of two classes. The hyperplane acts as the decision boundary between the two groups, often drawn in red and blue in illustrations, and it is placed so that the margin to the nearest points of each class is as large as possible.
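The following sketch fits a linear SVM to two small groups of points labeled red and blue. The coordinates are invented, and a linear kernel is assumed so that the decision boundary is a straight line.

```python
from sklearn.svm import SVC

# Made-up 2-D points in two well-separated groups.
X = [[1, 1], [1, 2], [2, 1], [4, 4], [4, 5], [5, 4]]
y = ["red", "red", "red", "blue", "blue", "blue"]

# A linear kernel keeps the decision boundary a flat hyperplane.
model = SVC(kernel="linear")
model.fit(X, y)

# The support vectors are the training points closest to the boundary.
print("support vectors:")
print(model.support_vectors_)

predictions = model.predict([[1.5, 1.5], [4.5, 4.5]])
print("predictions:", predictions)
```

Only the support vectors determine where the boundary sits; the remaining points could move slightly without changing it.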

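K-Fold Cross-Validation

Cross-validation estimates the performance of a machine learning algorithm. It splits the data into K parts, for example k=2 or k=4; these parts are called folds. The model is trained on K-1 folds and validated on the remaining one, and the process repeats so that each fold serves once as the validation set.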
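Here is a short sketch of K-fold cross-validation using scikit-learn's `cross_val_score`. The dataset (iris), the model (logistic regression), and K=5 are assumptions chosen for the example.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

X, y = load_iris(return_X_y=True)
model = LogisticRegression(max_iter=1000)

# cv=5 splits the data into 5 folds; each fold is used once for validation
# while the model is trained on the other 4.
scores = cross_val_score(model, X, y, cv=5)

print("accuracy per fold:", scores.round(2))
print("mean accuracy:", scores.mean().round(2))
```

Reporting the mean across folds gives a steadier performance estimate than a single train/test split.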

Random Forest

A random forest is similar to a decision tree, except that it combines many decision trees to get more accurate results. Its number of estimators is the number of decision trees it builds. It usually predicts better than an individual tree.
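The sketch below builds a forest of 10 trees, matching the `n_estimators=10` setting used earlier in the article. The iris dataset and the 25% hold-out split are assumptions made for the example.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# n_estimators is the number of decision trees in the forest.
forest = RandomForestClassifier(n_estimators=10, random_state=0)
forest.fit(X_train, y_train)

# Each fitted tree is available individually; the forest combines their votes.
print("number of trees:", len(forest.estimators_))

forest_accuracy = forest.score(X_test, y_test)
print(f"held-out accuracy: {forest_accuracy:.2f}")
```

Each tree sees a different random sample of the training data, which is why their combined vote tends to be more reliable than any one of them.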

K-means Clustering

It is an unsupervised learning algorithm that groups data without labels. It partitions the data points into K clusters based on their features, where K is the number of groups chosen in advance.

K-Nearest Neighbors

It is a simple algorithm in ML that deals with both classification and regression problems. KNN classifies new data based on its similarity to existing data. In simple words, it stores previous data as records, and whenever new data arrives it searches the old records for the closest matches.
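The "search the old records" idea can be written from scratch in a few lines. This is a minimal sketch for classification with Euclidean distance and a majority vote; the stored records and the function name `knn_predict` are hypothetical.

```python
import math
from collections import Counter

# Stored records: (features, label). The values are made up.
records = [
    ((1, 1), "A"), ((1, 2), "A"), ((2, 2), "A"),
    ((6, 6), "B"), ((7, 6), "B"), ((6, 7), "B"),
]


def knn_predict(records, query, k=3):
    """Label `query` by majority vote among its k nearest stored records."""
    # Sort the old records by distance to the new point and keep the k closest.
    nearest = sorted(records, key=lambda rec: math.dist(rec[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


print(knn_predict(records, (2, 1)))
print(knn_predict(records, (6, 5)))
```

There is no training step here at all: the work happens at prediction time, when the stored records are searched.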

Hierarchical clustering

It is an unsupervised learning method for cluster analysis that organizes unlabeled data into clusters. A cluster here is a bunch of related data points grouped by their properties, and smaller clusters are nested within larger ones to form a hierarchy.

Principal Component Analysis

PCA is an unsupervised learning technique used for dimensionality reduction: it reduces a large dataset to a smaller one that still retains most of the information in the original. It is often used to filter noise from datasets, and on images for tasks such as compression.
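As a small illustration of dimensionality reduction, the sketch below compresses the four iris measurements down to two components and reports how much of the variance survives. The dataset and the choice of two components are assumptions made for the example.

```python
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA

# 150 flowers, 4 measurements each; the labels are not used.
X, _ = load_iris(return_X_y=True)

# Keep only the 2 directions along which the data varies the most.
pca = PCA(n_components=2)
X_reduced = pca.fit_transform(X)

retained = pca.explained_variance_ratio_.sum()
print("shape before:", X.shape, "after:", X_reduced.shape)
print(f"share of variance retained: {retained:.2%}")
```

Halving the number of columns here loses only a small share of the variance, which is the sense in which the smaller dataset still resembles the larger one.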

Gradient Descent

It is an optimization algorithm for minimizing a cost function: it follows the steepest descent by repeatedly moving in the direction opposite to the gradient, toward the (ideally global) minimum of the loss.

It is the heart of Machine Learning & Deep Learning.
w = w - alpha * gradient
gradient = ∂loss/∂w
alpha = learning rate
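The update rule above can be run on a toy problem. In this sketch the loss function (w - 3)^2, the starting point, the learning rate, and the number of steps are all assumptions chosen so that the answer is easy to check by eye.

```python
def loss(w):
    """Toy loss with its minimum at w = 3."""
    return (w - 3) ** 2


def gradient(w):
    """Derivative of the toy loss with respect to w."""
    return 2 * (w - 3)


w = 0.0        # starting guess
alpha = 0.1    # learning rate

for step in range(100):
    # Move against the gradient: w = w - alpha * gradient
    w = w - alpha * gradient(w)

print(f"w after training: {w:.6f}")
print(f"loss after training: {loss(w):.2e}")
```

A learning rate that is too large can overshoot the minimum, while one that is too small makes progress very slow, so alpha usually needs tuning.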

Thanks for reading the whole article. Do connect with me on the socials mentioned below if you need any help.

GitHub: https://github.com/SauravMukherjee44

LinkedIn: https://www.dhirubhai.net/in/sauravmukherjee44/

Portfolio: https://sauravmukherjee44.github.io/Portfolio-Saurav-Mukherjee/

YouTube: https://www.youtube.com/channel/UCYGVtIgQIAChKBWBmChxzJw
