Building a Machine Learning Pipeline – Modeling
Ankush Seth
CTO @ Mi Analyst | Helping businesses accelerate growth and efficiency with Gen AI
Welcome back, everyone. Let’s dive into the Modeling aspect of the machine learning workflow. For those who missed Part 1 on Exploration and Data Processing, it can be found here. The main goal of this phase is to determine what kind of prediction model best suits the data at hand and the problem statement one is trying to solve.
Determining this is not an exact science and requires some experimentation. However, I’ve found that starting the journey by answering a couple of questions (listed below) helps narrow down the options fairly quickly. Each of the potential models that emerges then needs to be evaluated through a training and validation cycle.
Here are the two questions I recommend answering to kick off the decision-making process:
- Is the output we are seeking easily achieved by models based on statistical classification or regression analysis? If so, will a relatively simple model such as a Support Vector Machine (SVM) suffice, or do we need to move toward neural networks such as CNNs, RNNs, or GANs? (A baseline SVM sketch follows this list.)
- Do we want to use models that are already tuned and trained, or do we want to build our own and train it on our data? For example, an NLP (natural language processing) model from AWS that is pre-trained on vast amounts of data will probably achieve a lower error rate than a custom model trained locally (assuming the data available is not at the same scale). On the other hand, if you are trying to solve a very specific problem, going custom by building your own algorithm or modifying an existing one might be the way to go. A happy medium is transfer learning: build on top of a pre-trained model and fine-tune it with your own data (see the second sketch below).
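To make the first question concrete, here is a minimal sketch of a classical baseline, assuming scikit-learn; the dataset is a stand-in for your own features and labels. If something this simple already performs well, you may not need a neural network at all.

```python
# A minimal SVM classification baseline (scikit-learn assumed;
# the dataset here is illustrative, not from the article).
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# SVMs are sensitive to feature scale, so scaling lives inside the pipeline.
model = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0))
model.fit(X_train, y_train)
print(f"Test accuracy: {model.score(X_test, y_test):.3f}")
```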
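And for the happy medium, a sketch of transfer learning, assuming Keras and an ImageNet-pre-trained MobileNetV2; the class count and the train_ds / val_ds datasets are placeholders for your own problem. The pre-trained base is frozen, so only a small task-specific head is trained on your data.

```python
# Transfer learning sketch (Keras assumed): reuse a network pre-trained
# on ImageNet and train only a small classification head on top.
import tensorflow as tf

base = tf.keras.applications.MobileNetV2(
    input_shape=(224, 224, 3), include_top=False, weights="imagenet"
)
base.trainable = False  # freeze the pre-trained weights

model = tf.keras.Sequential([
    base,
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(10, activation="softmax"),  # 10 classes, illustrative
])
model.compile(
    optimizer="adam",
    loss="sparse_categorical_crossentropy",
    metrics=["accuracy"],
)
# model.fit(train_ds, validation_data=val_ds, epochs=5)  # your own datasets
```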
Once we have answers to the above questions, we can draw up a list of candidate models to train and validate. Be mindful of common gotchas like overfitting and underfitting during training (a simple check is sketched below). Validate each model by running it on the test dataset and analyzing the results to determine how well it performs. If the decision is to go with a custom model, designing the model itself becomes a major precursor task.
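One quick way to spot over- and underfitting, again assuming scikit-learn: compare training scores against cross-validated scores. The unconstrained decision tree here is just an illustration of a model prone to memorizing its training data.

```python
# Over/underfitting check via cross-validation (scikit-learn assumed;
# the dataset and model choice are illustrative).
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import cross_validate
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)
scores = cross_validate(
    DecisionTreeClassifier(random_state=0), X, y,
    cv=5, return_train_score=True,
)
print(f"Train accuracy:      {scores['train_score'].mean():.3f}")
print(f"Validation accuracy: {scores['test_score'].mean():.3f}")
# A large train/validation gap signals overfitting; low scores on
# both sides signal underfitting and point back to model or feature choices.
```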
Depending upon the test results, one may be ready to move to the next stage, deployment. But, as often happens, the test results are not impressive, and one has to try tuning the hyperparameters or potentially go back to the drawing board. Since this whole process is very iterative in nature, I recommend taking a lean approach to assessing models: for example, capping the size of the training and test datasets, and defining a few key metrics that give a good indication of how performant each candidate is (a tuning sketch follows below). However, this should not come at the expense of thinking long term. The opportunity cost of switching from one model to another can be especially high if the training dataset is particularly large.
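As an illustration of hyperparameter tuning, here is a grid search over the SVM baseline from earlier, assuming scikit-learn; the grid values are illustrative, not recommendations. Randomized or Bayesian search are common alternatives once the grid gets large.

```python
# Hyperparameter tuning via grid search (scikit-learn assumed;
# grid values are illustrative).
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

pipe = make_pipeline(StandardScaler(), SVC())
grid = GridSearchCV(
    pipe,
    param_grid={
        "svc__C": [0.1, 1, 10],
        "svc__gamma": ["scale", 0.01, 0.001],
    },
    cv=5,
    scoring="accuracy",
)
grid.fit(X_train, y_train)
print("Best params:", grid.best_params_)
# Score the best candidate on data it never saw during tuning.
print(f"Held-out test accuracy: {grid.score(X_test, y_test):.3f}")
```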
That’s it for now. Happy modeling. Next week we will go over the deployment phase.