What is Machine Learning? Article by Saurav Mukherjee
Saurav Mukherjee
Security Analyst @EY GDS | Former SDE Intern @SMIIT | Top 50 Samsung Solve for Tomorrow | Building @CodeIN Community | Google DSC Lead '22
Machine learning is a branch of artificial intelligence in which we train machines to perform tasks and predict outcomes. Like YouTube suggesting videos in your feed, machine learning brings out the power of data in a new way. It focuses on developing computer programs that can access data and carry out tasks automatically through prediction and detection, which enables computer systems to learn and improve from experience continuously.
How does Machine Learning work?
Machine Learning is the most exciting subset of AI. The Machine Learning process starts with inputting training data into the chosen algorithm. The training data can be known (labeled) or unknown (unlabeled), and it is used to develop the final Machine Learning model. The kind of training data does affect the algorithm, a concept that is covered further below.
To test whether this algorithm works correctly, new input data is fed into the Machine Learning algorithm. The predictions and results are then checked.
If the predictions are not as expected, the algorithm is re-trained multiple times until the required output is reached. This lets the Machine Learning algorithm keep learning on its own and produce better answers, gradually increasing in accuracy over time.
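The train, test and check loop described above can be sketched in a few lines. This is only an illustration under assumptions: the Iris dataset, the 25% hold-out size and the choice of LogisticRegression are picked here for convenience and are not prescribed by the article.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Load a small labeled dataset that ships with scikit-learn (an assumption for this sketch).
X, y = load_iris(return_X_y=True)

# Hold back 25% of the rows as "new" data the model never sees during training.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Train on the training split only.
model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)

# Check the predictions on the held-back split.
accuracy = model.score(X_test, y_test)
print(f"Accuracy on unseen data: {accuracy:.2f}")
```

If the accuracy is too low, you would go back, change the data, features or algorithm, and train again.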
What are the Different Types of Machine Learning?
1. Supervised Learning: In supervised learning, we use known, labeled data for training. Since the data is known, the learning is supervised, i.e., directed toward successful execution. The input data goes through the Machine Learning algorithm and is used to train the model. Once the model is trained on the known data, you can feed unknown data into the model and get a new response.
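The top algorithms we use for supervised learning, including Linear Regression, Logistic Regression, Naive Bayes, Decision Tree, Support Vector Machine and Random Forest, are described under "Algorithms in detail" below.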
2. Unsupervised Learning: In unsupervised learning, the training data is unknown and unlabeled, meaning that no one has looked at the data and tagged it beforehand. Without known labels, the input cannot be guided through the algorithm, which is where the term unsupervised comes from. This data is fed to the Machine Learning algorithm and is used to train the model. The trained model tries to find patterns and give the desired response.
For example, suppose the unknown data consists of apples and pears, which look similar to each other. The trained model tries to group them so that similar things end up in the same group.
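The top algorithms we use for unsupervised learning, including K-means Clustering, Hierarchical clustering and Principal Component Analysis, are described under "Algorithms in detail" below.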
3. Reinforcement Learning:
Like traditional types of data analysis, the algorithm here explores through a process of trial and error and then decides which actions result in higher rewards. Three major components make up reinforcement learning: the agent, the environment, and the actions. The agent is the learner or decision-maker, the environment includes everything that the agent interacts with, and the actions are what the agent does.
In reinforcement learning, the machine learns from its own mistakes and makes better decisions over time.
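To make the agent, environment and action idea concrete, here is a minimal sketch of trial-and-error learning. It uses an epsilon-greedy strategy on a made-up three-action problem. The reward probabilities, the `environment` function and the value of `epsilon` are all hypothetical choices for illustration, and real reinforcement learning setups are usually far richer than this.

```python
import random

random.seed(0)

# Hypothetical environment: three actions with hidden reward probabilities.
reward_probability = [0.2, 0.5, 0.8]


def environment(action):
    """Return a reward of 1 or 0 for the chosen action."""
    return 1 if random.random() < reward_probability[action] else 0


# The agent's running estimate of each action's value, and how often it tried each one.
estimates = [0.0, 0.0, 0.0]
counts = [0, 0, 0]
epsilon = 0.1  # fraction of steps spent exploring at random

for step in range(5000):
    if random.random() < epsilon:
        action = random.randrange(3)              # explore a random action
    else:
        action = estimates.index(max(estimates))  # exploit the best guess so far
    reward = environment(action)
    counts[action] += 1
    # Incremental average: nudge the estimate toward the observed reward.
    estimates[action] += (reward - estimates[action]) / counts[action]

best_action = estimates.index(max(estimates))
print("Estimated values:", [round(v, 2) for v in estimates])
print("Best action found:", best_action)
```

The agent starts out knowing nothing, makes mistakes, and gradually shifts toward the action that pays off most often.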
Prerequisites for Machine Learning (ML)
1. Basic knowledge of a programming language such as Python or R.
2. Basic knowledge of linear algebra. In the linear regression model, for example, a line is fitted as closely as possible to the data points, and that line is used to compute new values.
3. Knowledge of Python libraries such as NumPy and pandas, and familiarity with notebooks such as Jupyter Notebook.
4. Familiarity with dataset sources such as scikit-learn and Kaggle.
5. Intermediate knowledge of statistics and probability, and knowledge of how to clean and structure raw data into the desired format to reduce the time taken for decision-making.
Resources to get started with Machine Learning:
* Learn Basics of Machine learning in 10 Hours: https://www.youtube.com/watch?v=GwIo3gDZCVQ&t=5353s
* Roadmap to Master ML from Zero to Pro: https://drive.google.com/file/d/1kNsPhD4JMhS2F7kH9DvBb8wvt9XCJU8N/view?usp=sharing
* Machine Learning Full Course - Andrew Ng, Stanford University: https://www.youtube.com/playlist?list=PLLssT5z_DsK-h9vYZkQkYNWcItqhlRJLN
* Datasets:
* Mathematics for Machine learning: https://www.coursera.org/specializations/mathematics-machine-learning
* Python for Beginners: https://www.youtube.com/watch?v=rfscVS0vtbw
Projects:
ML Prediction Model: https://github.com/SauravMukherjee44/Machine-Learning-Task-1
Email Classification ML Model: https://github.com/SauravMukherjee44/Email-Classification-Model
Linear Regression
from sklearn.linear_model import LinearRegression
model = LinearRegression()
Logistic Regression
from sklearn.linear_model import LogisticRegression
model = LogisticRegression()
Naive Bayes
from sklearn.naive_bayes import GaussianNB, MultinomialNB
model = GaussianNB()  # for continuous features
model = MultinomialNB()  # for count features such as word counts
Decision Tree
from sklearn.tree import DecisionTreeClassifier
model = DecisionTreeClassifier()
Support Vector Machine
from sklearn.svm import SVC
model = SVC(gamma=10)
K-Fold Cross-Validation
from sklearn.model_selection import cross_val_score
Random Forest
from sklearn.ensemble import RandomForestClassifier
model = RandomForestClassifier(n_estimators=10)
Algorithms in detail:
Linear Regression
It is a linear approach that fits a straight line through the data to model the relationship between the inputs and a continuous target. The linear model is best suited to regression problems, where the goal is to predict a numeric value, rather than to classification. The model is trained on the inputs and targets in the data.
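As a small illustration of fitting a line and using it to compute a new value, the sketch below trains LinearRegression on made-up points that follow y = 2x + 1. The data is invented for this example only.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Made-up data that follows y = 2x + 1 exactly.
X = np.array([[1.0], [2.0], [3.0], [4.0]])
y = np.array([3.0, 5.0, 7.0, 9.0])

model = LinearRegression()
model.fit(X, y)

# The fitted line is described by its slope and intercept.
slope = model.coef_[0]
intercept = model.intercept_

# Use the line to compute a value for an input it has not seen.
prediction = model.predict(np.array([[5.0]]))[0]
print(slope, intercept, prediction)
```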
Logistic Regression
It is a statistical model that applies a logistic function to model a binary dependent variable. It is quite similar to linear regression, except that it is used to predict categorical dependent variables.
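A minimal sketch of a binary prediction with LogisticRegression follows. The "hours studied versus pass or fail" data is hypothetical and chosen only to keep the example readable.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical data: hours studied -> failed (0) or passed (1).
X = np.array([[1.0], [2.0], [3.0], [4.0], [6.0], [7.0], [8.0], [9.0]])
y = np.array([0, 0, 0, 0, 1, 1, 1, 1])

model = LogisticRegression()
model.fit(X, y)

# predict returns the category; predict_proba returns the probability of each category.
labels = model.predict(np.array([[2.5], [7.5]]))
probabilities = model.predict_proba(np.array([[5.0]]))[0]
print(labels, probabilities)
```

Unlike linear regression, the output here is a class label, with a probability between 0 and 1 behind it.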
Naive Bayes
It is a supervised learning algorithm, based on Bayes' theorem, that makes fast and reasonably accurate predictions. It is meant for classification problems and (my fav algorithm) comes in two commonly used variants: Gaussian NB and Multinomial NB.
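The two variants differ mainly in the kind of features they expect. The sketch below, using made-up numbers, pairs GaussianNB with continuous measurements and MultinomialNB with word counts. The feature meanings in the comments are assumptions for illustration.

```python
import numpy as np
from sklearn.naive_bayes import GaussianNB, MultinomialNB

# GaussianNB: continuous features (made-up heights in cm and weights in kg).
X_cont = np.array([
    [150.0, 50.0], [155.0, 55.0], [160.0, 58.0],
    [180.0, 80.0], [185.0, 85.0], [190.0, 90.0],
])
y_cont = np.array([0, 0, 0, 1, 1, 1])
gaussian = GaussianNB().fit(X_cont, y_cont)
gaussian_pred = gaussian.predict(np.array([[152.0, 52.0], [188.0, 88.0]]))

# MultinomialNB: count features (made-up counts of the words
# "free", "win", "meeting", "report" in each message).
X_counts = np.array([
    [3, 2, 0, 0], [2, 3, 0, 0],
    [0, 0, 2, 3], [0, 1, 3, 2],
])
y_counts = np.array([1, 1, 0, 0])  # 1 = spam, 0 = not spam
multinomial = MultinomialNB().fit(X_counts, y_counts)
multinomial_pred = multinomial.predict(np.array([[4, 1, 0, 0], [0, 0, 1, 4]]))

print(gaussian_pred, multinomial_pred)
```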
Decision Tree
In a decision tree, the data is repeatedly split according to certain parameters. The tree is made up of decision nodes and leaves, and the algorithm makes its decision based on them. It is used for both classification and regression problems.
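To see the decision nodes and leaves for yourself, you can print the splits a trained tree has learned. This sketch assumes the Iris dataset and a depth limit of 2, both chosen here only to keep the printed tree short.

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

iris = load_iris()

# Limit the depth so the printed tree stays small (an arbitrary choice for this sketch).
model = DecisionTreeClassifier(max_depth=2, random_state=0)
model.fit(iris.data, iris.target)

# Each indented line is either a decision node (a test on a feature) or a leaf (a class).
rules = export_text(model, feature_names=list(iris.feature_names))
print(rules)

depth = model.get_depth()
accuracy = model.score(iris.data, iris.target)
```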
Support Vector Machine
Support Vector Machine (SVM) works by finding a hyperplane that separates the data points of two classes. The hyperplane acts as the decision boundary between the two groups of points, which are often drawn in red and blue in illustrations.
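Here is a small sketch of an SVM separating two groups of points. Note the assumptions: the points are made up, and a linear kernel is used here (instead of the default kernel with a gamma value) so that the decision boundary really is a flat hyperplane.

```python
import numpy as np
from sklearn.svm import SVC

# Two made-up groups of 2-D points, one per class.
X = np.array([
    [1.0, 1.0], [1.0, 2.0], [2.0, 1.0],
    [5.0, 5.0], [5.0, 6.0], [6.0, 5.0],
])
y = np.array([0, 0, 0, 1, 1, 1])

# A linear kernel keeps the decision boundary a straight line in 2-D.
model = SVC(kernel="linear")
model.fit(X, y)

# The support vectors are the training points closest to the boundary.
print(model.support_vectors_)

predictions = model.predict(np.array([[1.5, 1.5], [5.5, 5.5]]))
print(predictions)
```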
K-Fold Cross-Validation
Cross-validation estimates the performance of a machine learning algorithm. It splits the data into “K” parts, for example k=2 or k=4. These parts are called the folds of the validation procedure: each fold is used once for testing while the remaining folds are used for training.
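The `cross_val_score` helper imported earlier does the splitting and scoring in one call. In this sketch the dataset (Iris), the model (LogisticRegression) and the choice of 5 folds are assumptions made for illustration.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

X, y = load_iris(return_X_y=True)
model = LogisticRegression(max_iter=1000)

# cv=5 splits the data into 5 folds; each fold is used once as the test set
# while the model is trained on the other 4.
scores = cross_val_score(model, X, y, cv=5)

print("Score per fold:", scores)
print("Mean score:", scores.mean())
```

Averaging the per-fold scores gives a steadier estimate than a single train and test split.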
Random Forest
Random forest is similar to decision trees, except that it combines many decision trees to get more accurate results. Its estimators are the decision trees that make it up, and you can choose how many to use. It usually predicts better outcomes than individual trees.
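The sketch below shows that `n_estimators` really is the number of trees in the forest. The Iris dataset and the 30% test split are assumptions for this example.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)

# n_estimators sets how many decision trees are combined.
model = RandomForestClassifier(n_estimators=10, random_state=0)
model.fit(X_train, y_train)

# Each entry in estimators_ is one fitted decision tree.
n_trees = len(model.estimators_)
accuracy = model.score(X_test, y_test)
print(n_trees, accuracy)
```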
K-means Clustering
It is an unsupervised learning algorithm that groups data without labels. Using clustering, the algorithm divides the data points into “K” groups based on their features.
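A minimal K-means sketch on made-up points follows. The data and the choice of K=2 are assumptions, and notice that no labels are passed to the model.

```python
import numpy as np
from sklearn.cluster import KMeans

# Six made-up, unlabeled points that visibly form two groups.
X = np.array([
    [1.0, 1.0], [1.5, 2.0], [1.0, 1.5],
    [8.0, 8.0], [8.5, 9.0], [9.0, 8.0],
])

# n_clusters is the "K" in K-means.
model = KMeans(n_clusters=2, n_init=10, random_state=0)
labels = model.fit_predict(X)

print("Cluster of each point:", labels)
print("Cluster centers:", model.cluster_centers_)
```

The cluster numbers themselves are arbitrary. What matters is which points end up together.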
K-Nearest Neighbors
It is one of the finest algorithms in ML and deals with both classification and regression problems. KNN classifies new data based on its similarity to existing data. In simple words, it stores previous data as records, and whenever new data arrives it searches those old records for the closest matches.
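As a rough sketch of the "search the old records" idea, the example below classifies two new points by looking at their 3 nearest stored neighbors. The points and the value of `n_neighbors` are made up for illustration.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

# Stored records: made-up 2-D points with known classes.
X = np.array([
    [1.0, 1.0], [1.0, 2.0], [2.0, 1.0],
    [5.0, 5.0], [5.0, 6.0], [6.0, 5.0],
])
y = np.array([0, 0, 0, 1, 1, 1])

# Each new point takes the majority class of its 3 closest stored points.
model = KNeighborsClassifier(n_neighbors=3)
model.fit(X, y)

predictions = model.predict(np.array([[1.5, 1.5], [5.5, 5.5]]))
print(predictions)
```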
Hierarchical clustering
It is an unsupervised learning method based on cluster analysis, which organizes unlabeled data into clusters. A cluster here means a bunch of related data points grouped together by their properties.
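One way to try hierarchical clustering in scikit-learn is AgglomerativeClustering, which builds clusters bottom-up by repeatedly merging the closest groups. The sketch below assumes made-up points and stops merging at two clusters.

```python
import numpy as np
from sklearn.cluster import AgglomerativeClustering

# Six made-up, unlabeled points that visibly form two groups.
X = np.array([
    [1.0, 1.0], [1.5, 2.0], [1.0, 1.5],
    [8.0, 8.0], [8.5, 9.0], [9.0, 8.0],
])

# Start with every point as its own cluster and merge until 2 clusters remain.
model = AgglomerativeClustering(n_clusters=2)
labels = model.fit_predict(X)
print(labels)
```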
Principal Component Analysis
PCA is an unsupervised learning technique used for dimensionality reduction: it reduces large datasets to smaller ones that still retain most of the information in the original features. Most often, it is used to filter noise from datasets and for image tasks such as compression.
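The sketch below reduces the four Iris features to two components and reports how much of the original variance is kept. The dataset and the choice of two components are assumptions for this example.

```python
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA

# 150 rows with 4 features each; the labels are not needed for PCA.
X, _ = load_iris(return_X_y=True)

# Keep only the 2 directions that capture the most variance.
pca = PCA(n_components=2)
X_reduced = pca.fit_transform(X)

# Fraction of the original variance retained by the 2 components.
retained = pca.explained_variance_ratio_.sum()
print(X_reduced.shape, retained)
```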
Gradient Descent
It is an optimization algorithm for minimizing a cost function by steepest descent: it repeatedly moves in the direction opposite to the gradient, toward a minimum of the loss (ideally the global one).
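Gradient descent is easy to try by hand on a one-variable cost function. In this sketch the cost (w - 3)^2, the starting point, the learning rate and the number of steps are all made-up values for illustration.

```python
def cost(w):
    """A simple bowl-shaped cost with its minimum at w = 3."""
    return (w - 3.0) ** 2


def gradient(w):
    """Derivative of the cost with respect to w."""
    return 2.0 * (w - 3.0)


w = 0.0              # arbitrary starting point
learning_rate = 0.1  # size of each step

for step in range(100):
    # Move against the gradient, i.e. downhill on the cost curve.
    w -= learning_rate * gradient(w)

print("w after training:", w)
print("cost after training:", cost(w))
```

Because this cost has a single minimum, the steps settle close to w = 3. On more complicated loss surfaces the result can depend on the starting point and the learning rate.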
Thanks for reading the whole article. Do connect with me on the socials mentioned below if you need any help.