Auto ML
Ojash Shrestha
Senior Software Engineer | MS CS @ CSUF '24 | Prev @ Neplopers | MIT - HMS Bootcamp '19 | MPP Grad - Data Science '17
This article discusses Automated Machine Learning and how it reduces the time needed to obtain accurate insights with little code. It covers what Automated Machine Learning is, its use cases, the algorithms commonly involved, and how Azure Automated Machine Learning fits in.
Automated Machine Learning
Automated Machine Learning (Auto ML) automates the iterative and time-consuming process of developing machine learning models, so that developers, analysts, and data scientists can build scalable and efficient models more productively. Azure offers Auto ML as a feature that makes it easier to obtain production-ready models without spending much time. Dozens of models can be trained and compared in parallel, and the most accurate one is then selected for use.
Use Cases of Automated Machine Learning
Auto ML is most useful when a model must reach a threshold on a target metric, such as accuracy. Azure Machine Learning does this by training and tuning models with many different algorithms in parallel.
Azure Auto ML has democratized machine learning tooling, making it usable regardless of the practitioner's experience, and it provides an end-to-end machine learning pipeline for a range of problems.
Classification
Classification is a supervised learning approach that assigns data to one of a set of categories. Spam filtering, fraud detection, and object detection are usually treated as classification problems.
Regression
Unlike classification, regression models the relationship between variables in order to predict a continuous value. Linear regression is the most common form. Logistic regression, despite its name, predicts class probabilities and is normally used for classification.
To learn about these different regressions and their examples using R Programming, check the previous article Data Wrangling And Visualization In R.
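As a rough illustration of regression, here is a minimal sketch of simple linear regression fitted by ordinary least squares. The function name `fit_line` and the size and price figures are made up for this example; this is not what Azure Auto ML runs internally.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Spread of x, and how x and y vary together.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data: home sizes and prices (price is exactly 3 * size here).
sizes = [50.0, 80.0, 100.0, 120.0]
prices = [150.0, 240.0, 300.0, 360.0]

slope, intercept = fit_line(sizes, prices)
predicted = slope * 90.0 + intercept  # predicted price for a size of 90
```

The output is a continuous number rather than a category, which is what separates regression from classification.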
Forecast
In machine learning, forecasting is mostly done on time-series data to predict business metrics such as revenue, sales, and customer demand.
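To make the idea of forecasting on a time series concrete, here is a very simple baseline: predict the next value as the average of the most recent observations. The function name and the sales figures are hypothetical, and real forecasting models are considerably more sophisticated.

```python
def moving_average_forecast(series, window):
    """Forecast the next value as the mean of the last `window` observations."""
    if window <= 0 or window > len(series):
        raise ValueError("window must be between 1 and len(series)")
    recent = series[-window:]
    return sum(recent) / window


# Hypothetical monthly sales figures, oldest first.
monthly_sales = [100, 110, 120, 130, 140, 150]

next_month = moving_average_forecast(monthly_sales, window=3)
```

A baseline like this is mainly useful as a reference point: a trained forecasting model should at least beat it.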
Example Scenario
Let us consider how to predict the price of a home. The price depends on various features of the house, such as its location, size, year built, materials used, amenities, and more. To predict it, we need a dataset of other homes with their features and prices, so that a model can be trained on this data and learn from it.
A dataset of homes might hold features such as:
- Location
- Size
- Year Built
- Materials Used
- Amenities
- Total No of Rooms
- Total No of Baths
- Total No of Half Baths
- Total No of Car Parking Available
- Garden Size
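One way such a record might be represented in code is sketched below. The class name `Home`, the field names, the units, and the one-hot encoding of location are all assumptions made for illustration. Materials and amenities are left out to keep the sketch short.

```python
from dataclasses import dataclass


@dataclass
class Home:
    location: str
    size_sqft: float
    year_built: int
    rooms: int
    baths: int
    half_baths: int
    parking_spaces: int
    garden_sqft: float
    price: float  # the target value the model should learn to predict


def to_features(home, known_locations):
    """Turn a Home into a numeric feature vector (price is excluded)."""
    # One-hot encode the location so that it becomes numeric.
    one_hot = [1.0 if home.location == loc else 0.0 for loc in known_locations]
    return one_hot + [
        home.size_sqft,
        float(home.year_built),
        float(home.rooms),
        float(home.baths),
        float(home.half_baths),
        float(home.parking_spaces),
        home.garden_sqft,
    ]


# Hypothetical example record.
example = Home("suburb", 1500.0, 1995, 6, 2, 1, 2, 300.0, price=450000.0)
vector = to_features(example, known_locations=["downtown", "suburb"])
```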
Next, we would choose the algorithms for Automated Machine Learning to run, depending on the need. Some widely used ones are described below.
Algorithms
Gradient Boosted
Gradient boosting typically uses decision trees to solve classification and regression problems. It builds a prediction model as an ensemble: each new model is chosen to best correct the errors of the previous models combined, with the aim of reducing the overall prediction error.
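The idea of adding one model at a time to correct the previous ones can be sketched for squared-error regression on a single feature, using one-split trees ("stumps") as the weak learners. All names, the data, and the settings `rounds` and `learning_rate` are illustrative assumptions. Production libraries use deeper trees and many refinements.

```python
def fit_stump(xs, residuals):
    """Find the single threshold split that best fits the residuals."""
    best = None
    for threshold in sorted(set(xs))[:-1]:
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        error = sum((r - left_mean) ** 2 for r in left)
        error += sum((r - right_mean) ** 2 for r in right)
        if best is None or error < best[0]:
            best = (error, threshold, left_mean, right_mean)
    return best[1:]


def stump_predict(stump, x):
    threshold, left_mean, right_mean = stump
    return left_mean if x <= threshold else right_mean


def fit_boosted(xs, ys, rounds=50, learning_rate=0.3):
    base = sum(ys) / len(ys)  # start from the mean prediction
    predictions = [base] * len(xs)
    stumps = []
    for _ in range(rounds):
        # Each new stump is fitted to what the ensemble still gets wrong.
        residuals = [y - p for y, p in zip(ys, predictions)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        predictions = [
            p + learning_rate * stump_predict(stump, x)
            for x, p in zip(xs, predictions)
        ]
    return base, stumps, learning_rate


def boosted_predict(model, x):
    base, stumps, learning_rate = model
    return base + learning_rate * sum(stump_predict(s, x) for s in stumps)


def mse(predict, xs, ys):
    return sum((predict(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


# Hypothetical training data.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [1.0, 1.2, 3.1, 2.9, 5.2, 5.0]

model = fit_boosted(xs, ys)
baseline_mse = mse(lambda x: sum(ys) / len(ys), xs, ys)
boosted_mse = mse(lambda x: boosted_predict(model, x), xs, ys)
```

On the training data, `boosted_mse` ends up below `baseline_mse` because every round removes part of the remaining residual.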
Nearest Neighbor
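The k-nearest neighbors algorithm (k-NN) is widely used for regression and classification problems: it predicts from the k training points closest to the query. It should not be confused with the nearest neighbor heuristic, which builds an approximate solution to the traveling salesman problem.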
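A minimal sketch of k-nearest neighbors classification follows, assuming Euclidean distance and a simple majority vote. The function name, the points, and the labels "A" and "B" are invented for the example.

```python
import math
from collections import Counter


def knn_classify(train, query, k=3):
    """Label `query` by majority vote among its k closest training points."""
    neighbors = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]


# Hypothetical labeled points: (features, label).
train = [
    ((1.0, 1.0), "A"), ((1.2, 0.8), "A"), ((0.9, 1.1), "A"),
    ((5.0, 5.0), "B"), ((5.2, 4.9), "B"), ((4.8, 5.1), "B"),
]

label_near_a = knn_classify(train, (1.1, 1.0))
label_near_b = knn_classify(train, (5.0, 4.8))
```

For regression, the same neighbor search applies, but the prediction would be the average of the neighbors' values instead of a vote.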
Support Vector Machine (SVM)
The support vector machine is a supervised learning model used for classification and regression analysis. It can solve both linear and non-linear classification problems, the latter through kernel functions. Support vector clustering is an extension of SVM that takes an unsupervised approach to categorizing unlabeled data, and it is used in industry today.
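For the linear case, a support vector machine can be approximated with a short training loop. The sketch below is a simplification: it trains a linear soft-margin classifier by subgradient descent on the regularized hinge loss, whereas libraries normally use dedicated solvers and support kernels. The function names, data, and hyperparameter values are assumptions chosen for the example.

```python
def train_linear_svm(points, labels, epochs=200, learning_rate=0.01, reg=0.01):
    """Subgradient descent on hinge loss plus L2 penalty. Labels are +1 or -1."""
    w = [0.0] * len(points[0])
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(points, labels):
            margin = y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
            if margin < 1:
                # Point is inside the margin or misclassified: move toward it.
                w = [wi + learning_rate * (y * xi - reg * wi)
                     for wi, xi in zip(w, x)]
                b += learning_rate * y
            else:
                # Point is safely classified: only apply the shrinkage term.
                w = [wi - learning_rate * reg * wi for wi in w]
    return w, b


def svm_predict(w, b, x):
    score = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1 if score >= 0 else -1


# Hypothetical, linearly separable data.
points = [(2.0, 2.0), (3.0, 3.0), (3.0, 2.0),
          (-2.0, -2.0), (-3.0, -3.0), (-2.0, -3.0)]
labels = [1, 1, 1, -1, -1, -1]

w, b = train_linear_svm(points, labels)
predictions = [svm_predict(w, b, x) for x in points]
```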
Bayesian Regression
Bayesian linear regression approaches linear regression through Bayesian inference. Where ordinary linear regression produces point estimates of the coefficients, Bayesian regression produces a probability distribution over them.
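The difference between a point estimate and a distribution can be seen in the simplest possible case: a single slope with no intercept, Gaussian noise of known variance, and a zero-mean Gaussian prior on the slope. Under those assumptions the posterior is also Gaussian and has a closed form. The function name, the data, and the values of `noise_var` and `prior_var` are hypothetical.

```python
def posterior_slope(xs, ys, noise_var=1.0, prior_var=10.0):
    """Posterior mean and variance of w in y = w * x + noise.

    Assumes noise ~ N(0, noise_var) and prior w ~ N(0, prior_var).
    """
    precision = 1.0 / prior_var + sum(x * x for x in xs) / noise_var
    mean = (sum(x * y for x, y in zip(xs, ys)) / noise_var) / precision
    return mean, 1.0 / precision


# Hypothetical data, roughly y = 2 * x.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.1, 3.9, 6.2, 7.8]

post_mean, post_var = posterior_slope(xs, ys)

# Ordinary least squares (through the origin) gives only a single number.
ols_slope = sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)
```

The posterior mean is pulled slightly toward the prior mean of zero compared with `ols_slope`, and `post_var` says how uncertain the estimate still is.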
Gradient Descent
Gradient descent is a first-order iterative optimization algorithm for finding a local minimum of a differentiable function. It takes repeated steps in the direction opposite to the gradient at the current point, since that is the direction of steepest descent. For a convex function, any local minimum it finds is also the global minimum. It is widely used to train neural networks, where backpropagation computes the gradient of the cost function being minimized.
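The update rule can be shown on a one-dimensional convex function. This sketch minimizes f(x) = (x - 3)^2, whose gradient is 2(x - 3). The function name, the step size, and the number of steps are arbitrary choices for illustration.

```python
def gradient_descent(gradient, start, learning_rate=0.1, steps=100):
    """Repeatedly step against the gradient, starting from `start`."""
    x = start
    for _ in range(steps):
        x -= learning_rate * gradient(x)
    return x


# f(x) = (x - 3)^2 has its minimum at x = 3, and its gradient is 2 * (x - 3).
minimum = gradient_descent(lambda x: 2 * (x - 3), start=0.0)
```

With this step size, each iteration shrinks the distance to the minimum by a constant factor, so `minimum` ends up very close to 3. A step size that is too large would make the iterates diverge instead.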
LGBM
Light Gradient Boosted Machine (LightGBM, or LGBM) is a high-performance gradient boosting framework based on decision trees, widely used for classification, ranking, and many other machine learning tasks. It is fast because it uses histogram-based splitting, gradient-based one-side sampling (GOSS), and exclusive feature bundling (EFB).
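Of those three techniques, histogram-based splitting is the easiest to sketch. Instead of testing every distinct feature value as a split point, values are grouped into a few bins and only the bin boundaries are tested. The code below is an illustrative simplification for squared loss with equal-width bins, not LightGBM's actual implementation, and every name and number in it is an assumption.

```python
def best_histogram_split(values, gradients, num_bins=4):
    """Pick the best bin boundary to split on, using per-bin gradient sums."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / num_bins
    if width == 0:
        return None, 0.0  # all values identical: nothing to split

    # Build the histogram: gradient sum and sample count per bin.
    grad_sum = [0.0] * num_bins
    count = [0] * num_bins
    for v, g in zip(values, gradients):
        b = min(int((v - lo) / width), num_bins - 1)
        grad_sum[b] += g
        count[b] += 1

    total_grad, total_count = sum(grad_sum), len(values)
    best_threshold, best_gain = None, 0.0
    left_grad, left_count = 0.0, 0
    # Scan the bin boundaries only, rather than every distinct value.
    for b in range(num_bins - 1):
        left_grad += grad_sum[b]
        left_count += count[b]
        right_count = total_count - left_count
        if left_count == 0 or right_count == 0:
            continue
        right_grad = total_grad - left_grad
        gain = (left_grad ** 2 / left_count
                + right_grad ** 2 / right_count
                - total_grad ** 2 / total_count)
        if gain > best_gain:
            best_threshold, best_gain = lo + (b + 1) * width, gain
    return best_threshold, best_gain


# Hypothetical feature values and gradients (prediction minus target).
values = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
gradients = [-1.0, -1.0, -1.0, -1.0, 1.0, 1.0, 1.0, 1.0]

threshold, gain = best_histogram_split(values, gradients)
```

The cost of the scan depends on the number of bins rather than the number of distinct values, which is where the speed-up comes from on large datasets.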
Depending on the parameters chosen, the trained model then produces the final predicted value. Some of these parameters are:
- Criterion
- Loss
- Min Samples Split
- Min Samples Leaf
- Others
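Tuning means trying different combinations of such parameters. A small sketch of expanding a search space into every combination is shown below. The parameter names follow the list above, but the specific values are placeholders, and the helper `expand_grid` is made up for this example.

```python
from itertools import product

# Hypothetical search space: each parameter maps to the values to try.
search_space = {
    "criterion": ["squared_error", "absolute_error"],
    "min_samples_split": [2, 5, 10],
    "min_samples_leaf": [1, 2],
}


def expand_grid(space):
    """Return one dict per combination of parameter values."""
    names = list(space)
    combos = product(*(space[name] for name in names))
    return [dict(zip(names, combo)) for combo in combos]


configurations = expand_grid(search_space)
```

Each entry in `configurations` describes one model to train, so the number of candidates grows multiplicatively with every parameter added. That growth is one reason automating the search is attractive.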
For the average developer or data scientist, model creation is typically a time-consuming task. Automated ML runs many models simultaneously within a set time budget and chooses the most accurate one. This can be done in three steps.
Step 1 - Input
- Enter Data
- Define Goals
- Apply Constraints
Step 2 - Intelligently Test Multiple Models in Parallel
Step 3 - Output
- Optimized Model
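The three steps above can be sketched as a toy model-selection loop: take data and a goal (lowest validation error), fit several candidate models in parallel, and return the best one. This is only a conceptual sketch with three deliberately simple candidates. It does not use or represent the Azure API, and all names and data are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def fit_mean(xs, ys):
    mean = sum(ys) / len(ys)
    return lambda x: mean


def fit_line(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return lambda x: my + slope * (x - mx)


def fit_nearest(xs, ys):
    pairs = list(zip(xs, ys))
    return lambda x: min(pairs, key=lambda p: abs(p[0] - x))[1]


def validation_mse(fit, train, valid):
    """Fit on the training split and score on the validation split."""
    model = fit(*train)
    xs, ys = valid
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def auto_select(candidates, train, valid):
    # Step 2: evaluate every candidate in parallel.
    with ThreadPoolExecutor() as pool:
        futures = {name: pool.submit(validation_mse, fit, train, valid)
                   for name, fit in candidates.items()}
        scores = {name: future.result() for name, future in futures.items()}
    # Step 3: the output is the candidate with the lowest validation error.
    return min(scores, key=scores.get), scores


# Step 1: input data (hypothetical) and the candidate models to consider.
train = ([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], [2.0, 4.0, 6.0, 8.0, 10.0, 12.0])
valid = ([7.0, 8.0], [14.0, 16.0])
candidates = {"mean": fit_mean, "linear": fit_line, "nearest": fit_nearest}

best_name, scores = auto_select(candidates, train, valid)
```

Because the made-up data is exactly linear, the linear candidate wins here. A constraint such as a time budget or a minimum score could be added as a check before a model is accepted.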
How to use Azure Auto ML?
Automated Machine Learning in Azure Machine Learning Studio is offered only in the Enterprise edition, not in the Basic edition. You can use the Azure Pricing Calculator to estimate the cost of your monthly bill.
To read the full article, check it out at: https://bit.ly/3fwF7KS