Utilizing the Model Builder and AutoML for creating Lead Decision and Lead Scoring model in Microsoft ML.NET

Recently, I wrote an article explaining the utilization of the ONNX format for integrating a Scikit-learn lead scoring machine learning model into the .NET ecosystem. I described one possible way of deploying the Python-based regression model as a Microsoft Azure Function. That procedure is applicable for integrating the trained model as part of a Web API or Console Application as well. What I mentioned there was the opportunity to use this approach for bridging the technical differences between different data science and application development platforms, in this case targeting the .NET cross-platform framework. Since Microsoft .NET is the technology I work with professionally on a daily basis, in this article I want to uncover the native machine learning potential of the framework, more specifically ML.NET.

Since I have already presented the lead scoring idea in .NET, I will proceed with the implementation of the lead decision solution as a continuation of the one designed and implemented using the KNIME Platform. As mentioned in that article, it conceptually follows the same approach and supervised ML idea, differing only in the classification-based prediction strategy. Besides this, I will also cover and leverage the idea of Lead Scoring as part of the created model's prediction evaluation.

* Note: This article's solution design and source code are simplified to emphasize the core concepts and overall integration strategy. Still, it is a fully functional approach for training, building, evaluating, and implementing predictive decision-based/supervised models within real-life testing or production-deployed prototypes and application environments.

** Note: I will design and build the solution utilizing the ML.NET Model Builder powered by automated machine learning (AutoML), using the intuitive and user-friendly graphical Visual Studio extension. More detailed techniques for building data processing and transformation pipelines, customizing the train/test data split, applying cross-validation, and interpreting model performance and evaluation are beyond the scope of this article and can be referenced via the ML.NET API.

What is ML.NET?

ML.NET is an open-source and cross-platform machine learning framework that can incorporate and use machine learning algorithms in .NET-related applications. Providing support for various popular use cases, it is equipped for building different models in different business domains using the application's already stored data. Its central concept is designed around the Model Builder, a tooling mechanism specifying the steps needed to transform the input data into a prediction. Complementarily, it also uses Automated ML, a concept known as AutoML, which wraps and simplifies the interface, providing automatically generated code projects for describing the input data and consuming the models. In addition, the integration of machine learning is supported by another fundamental module, the ML.NET CLI. ML.NET, in general, supports training, building, and evaluating machine learning models using the C# and F# programming languages.

Environment Preparation

Following the newest Microsoft trends and announcements, Visual Studio 2022 and the final (so far) version of .NET 6 will be officially released in November 2021. Traditionally, Microsoft releases preview versions for community use in the context of shaping the major release at its finest. With this in mind, I am excited to proceed with the practical presentation using the Microsoft Visual Studio Community Preview edition (version 17.0.0, Preview 4.1) and the current .NET 6 Preview version of the framework.

* Note: The prerequisites for using the ML.NET Model Builder are Visual Studio 2019 16.10.4 or later and the .NET Core 3.1 SDK or later.

Creating the Solution

I will begin the demonstration from scratch by creating a new solution, “LeadGeneration”, consisting of a single console application named “LeadDecision”. During this process it is important to select .NET 6 (Preview) as the target framework. Completing this setup results in an empty C# console application following the new .NET 6 template.

ML.NET Extension

Installing and enabling the ML.NET Model Builder in Visual Studio can be configured using the Visual Studio Installer and modifying the current version installation accordingly. In general, the ML.NET component is placed under the Individual components, more precisely in the .NET section.

Additionally, the ML.NET Model Builder UI extension tool should be installed in the Extensions management area available on the main menu bar.

Dataset

Since this article follows the design and implementation of the lead decision predictive ML model, I will take advantage of the identical analyzed, processed, and scaled dataset used in my previous article for building the model on the KNIME platform. The initial Lead Scoring raw dataset is publicly available on Kaggle.

ML.NET Model Builder Setup

As mentioned before, the Model Builder is a very user-friendly graphical tool extension for managing the machine learning process in Visual Studio. Its main characteristic is the mbconfig configuration file, which manages the current session and keeps track of the changes specific to each phase of ML model building. The Model Builder, providing the complete ML experience, can be added to the created project following the procedure presented below.

Adding the Machine Learning support opens the Model Builder details, where I will walk through all the steps needed to build and evaluate the lead decision model.

Scenario Selection

Model Builder comes equipped with many different built-in scenarios for machine learning applications. In fact, each scenario is mapped to a different learning approach, depending on the specific business domain use case. Since I am building a lead decision predictive model, I will select the Data classification scenario, based on classification-related algorithms.

Training Environment

As displayed, the Data classification scenario is only locally supported, meaning that the model training will be executed on my local machine. So, considering that the Azure and GPU training modes are not currently supported, the only valid selection here remains the Local environment with the power of the CPU.

Import the Data

There are two options for importing the dataset: from a file or through a data source from a SQL Server instance. Considering that I have already exported the CSV file from the Jupyter Notebook, I will browse the local system path to import it.

As presented in the screenshot, the result of successful importing is a dataset preview followed by additional data configuration options. The next step is to select ‘Converted’ as the label, or target, column, and also to open the advanced data options in order to check and configure the other features. The configuration consists of setting all other columns as single numerical features, excluding 'column1', which is not relevant for the model building (it holds the row number).

* Note: Class 0 represents the ‘Not Converted’ leads, while Class 1 represents the ‘Converted’ leads. The ‘Converted’ column can also be treated as categorical during data processing, so that the categorical descriptions are used instead of integers (converted into strings).
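The class-to-description mapping from the note can be sketched as a small helper. This is illustrative code of my own, not part of the Model Builder output; the label strings simply mirror the note above.

```csharp
using System;

// Hypothetical helper mapping the integer classes of the 'Converted' label
// to their categorical descriptions (0 = Not Converted, 1 = Converted).
public static class LeadLabels
{
    public static string Describe(int convertedClass) => convertedClass switch
    {
        0 => "Not Converted",
        1 => "Converted",
        _ => throw new ArgumentOutOfRangeException(nameof(convertedClass))
    };
}
```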

Train Model

The next step is dedicated to the training configuration, where I will set the time for training the model. In fact, this is an automated process in which the Model Builder leverages the benefits and flexibility of AutoML to investigate and apply many different models with a wide range of parameters. Overriding the default interval of 10 seconds with more training time opens the possibility of exploring more models, maximizing the chances of retrieving a more accurate final model. Accordingly, I will set the time interval to 900 seconds (15 minutes). It is worth mentioning that there is an official Microsoft guideline for the recommended time interval according to the dataset size.
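That guideline can be sketched as a simple size-to-time lookup. The thresholds below are reproduced from memory and should be verified against the official Model Builder documentation; the helper itself is illustrative and not part of any generated code.

```csharp
using System;

// Illustrative lookup of the recommended Model Builder training time by dataset size.
// NOTE: the thresholds are quoted from memory -- verify them against the official
// ML.NET Model Builder guideline before relying on them.
public static class TrainingTimeGuideline
{
    public static TimeSpan Recommended(double datasetSizeMb) => datasetSizeMb switch
    {
        <= 10   => TimeSpan.FromSeconds(10),  // 0 - 10 MB
        <= 100  => TimeSpan.FromMinutes(10),  // 10 - 100 MB
        <= 500  => TimeSpan.FromMinutes(30),  // 100 - 500 MB
        <= 1024 => TimeSpan.FromMinutes(60),  // 500 MB - 1 GB
        _       => TimeSpan.FromHours(3)      // 1 GB+: three hours or more
    };
}
```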

The training process and model selection can be followed in the Output window, where each explored model and its iteration accuracy are presented.

In the end, when the training process is successfully completed, the Model Builder generates a complete Experiment Results summary in the format presented in the screenshot below.

Also, I can review the output of the process in the Train area, where the best-performing model is presented.

In this particular scenario, LightGbmMulti (the LightGbmMulticlassTrainer class) was evaluated as the best algorithm fit, providing the best model accuracy.

Evaluate Model

Going further, I have the opportunity to evaluate the best model's accuracy. It is also worth emphasizing that I can interact with the model, meaning that I can provide some previously unseen data and immediately review the prediction.

Consume Model

After finishing the evaluation process, I proceed to the model consumption screen. As presented in the screenshot below, there is a code snippet for explicit model integration and consumption in the end application of interest. The end application can be the console application I already created, but I will also present how to integrate the model within the generated Web API (using the Add Web API solution option).

I will generate the Web API application as a separate project within the solution, named LeadDecisionModel_API.

Next Steps

There are two additional possibilities as a wrap-up of the Model Builder journey: the 'Deploy your model' and 'Improve the model' sections. They are currently implemented as redirection buttons addressing the official documentation, where more details related to model improvement and deployment can be found.

Console Application touch

Taking into consideration the generated code snippet for integrating and consuming the model within the Model Builder's Consume step, I will copy it into the Program.cs file.

* Note: The mbconfig configuration file is accessible even after the UI tool is closed.

After commenting out the predefined console application template code and pasting the generated code from the Model Builder, I only need to reference the LeadDecision namespace where the actual model input class was generated.

In the context of this, utilizing the Visual Studio built-in debugger, I will start the application and review the type as well as the content of the result output object.

So, as can be observed, the result output is a ModelOutput object, including the prediction and the scores, or probabilities, of successful or unsuccessful lead conversion. This explicitly means that we can use this approach for creating and analyzing machine learning models for a predictive Lead Decision and Lead Scoring system.
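To illustrate how such a ModelOutput could be post-processed, here is a minimal sketch, assuming a two-class score vector ordered as [Not Converted, Converted]: the arg-max of the scores yields the lead decision, while the probability of class 1 doubles as the lead score. The helper is mine, not code generated by Model Builder.

```csharp
using System;
using System.Linq;

// Illustrative post-processing of a ModelOutput-style score vector,
// assuming scores[0] = P(Not Converted) and scores[1] = P(Converted).
public static class LeadScoring
{
    // Lead decision: the predicted class index (arg-max of the score vector).
    public static int Decide(float[] scores) =>
        Array.IndexOf(scores, scores.Max());

    // Lead score: the probability of conversion (class 1) as a percentage.
    public static double LeadScorePercent(float[] scores) =>
        Math.Round(scores[1] * 100.0, 1);
}
```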

Before integrating the model into the previously generated Web API, I want to present and explain the LeadDecisionModel.mbconfig structure.

The LeadDecisionModel.training.cs file consists of the generated machine learning model, including the selected algorithm and its configuration parameters, the input feature set, and the configuration for the predicted label. As presented in the screenshot below, everything is wrapped in a pipeline transformation referenced in the method for executing the pipeline and fitting the model.

On the other side, the LeadDecisionModel.consumption.cs file consists of the ModelInput class describing the input features, the ModelOutput class containing the prediction output result (prediction and scores), as well as the prediction engine covering the Predict functionality of the model. The latter is generated using the MLContext class and the ITransformer interface, which can be utilized for more advanced custom ML solutions.

Finally, the LeadDecisionModel.zip archive contains all files needed for creating and training the machine learning model. As it is visible from the previous screenshot, the path of the zip archive is referenced in the prediction engine construction, from where the supervised-based predictions are made.

Web API touch

Let's recall that the Web API project was automatically generated using the Add to solution option within the Model Builder UI tool. So, it is created as a separate project within the LeadGeneration solution, automatically incorporating the identical files for supporting the integration of the created model. In general, the ML model for Lead Decision and Lead Scoring is now an integral part of the API project. Taking into consideration the .NET target framework, the Web API follows the new and lightweight minimal design, generated by the ML.NET Model Builder as presented in the screenshot below.

I will start the API project and try to call the predict endpoint, since the MapPost functionality is defining and exposing its route and handler. In terms of this, I will use the Postman API Platform for preparing the endpoint URL and the POST request body.

* Note: The binding url and port are part of the iisSettings configuration section within the launchSettings.json configuration file.
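For reference, the POST body is a JSON object whose properties mirror the feature columns of the generated ModelInput class. The property names and values below are illustrative placeholders only (the real body must include every feature column from the dataset, with its scaled numerical value):

```json
{
  "TotalVisits": 0.32,
  "Total_Time_Spent_on_Website": 0.74,
  "Page_Views_Per_Visit": 0.18
}
```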

The predict API call was successful, returning the lead model's prediction/decision and scores.

Final Words

In this article, I presented a detailed step-by-step process of building a Lead Decision and Lead Scoring predictive machine learning model leveraging the advantages of the Model Builder AutoML within the ML.NET framework. This machine learning framework is part of the .NET ecosystem and can be very easily utilized to build different use case scenarios related to specific business domain tasks and data. As mentioned within the referenced articles, this strategy is part of a broader systematic approach to working with and interpreting marketing and sales data to establish a more insightful and practical lead generation process.

Working natively within the .NET ecosystem, using ML.NET and the automated Model Builder UI tool, can be an advantage for application developers enthusiastic about integrating machine learning within their applications. Moreover, ML.NET is a fully designed framework for software developers experienced in machine learning and artificial intelligence.

I am using ML.NET in combination with the Python-based Scikit-learn ML library, RStudio, and the KNIME platform, trying to maximize the full potential and flexibility of all the different possibilities and built-in algorithms. From my point of view, there is no single best environment or library; everything depends on the concrete business use case and domain.

------------------------

Thank you for reading the article. I believe it is constructive and comprehensive in covering all aspects of building and evaluating machine learning models using the Microsoft ML.NET framework.

Currently, I am working on utilizing the machine learning algorithms in bioinformatics, more precisely in understanding the role of the microbiome in cancer diagnostics and therapeutics. Therefore, I am combining the potential of all previously mentioned platforms to explore the full potential of the data and the knowledge as well as the insights behind it.

Feel free to start a discussion and share your thoughts and experience in this regard.

I would be grateful if you take the time to comment, share the article and connect for further discussions and potential collaboration.

More articles by Miodrag Cekikj, PhD CSE
