
Machine Learning Blog – 9

Mahtab Syed

Data and AI Leader | AI Solutions | Cloud Architecture(Azure, GCP, AWS) | Data Engineering, Generative AI, Artificial Intelligence, Machine Learning and MLOps Programs | Coding and Kaggle

Published: 7 October 2022

Machine Learning using 3 ways - Full code vs. No Code vs. Automated ML

I have been coding my models using open-source technologies (Python, Pandas, NumPy, matplotlib, scikit-learn and TensorFlow) in a Jupyter notebook on Google Colab using CPU / GPU. Now I am trying to build an enterprise-grade application using MLOps (Azure Cloud, Azure DevOps and MLflow).

I had heard of "No Code" and "Auto ML" and I thought, let's give them a try with the same data and compare prediction accuracy against "Full code", where we have full control of the model.

Full code (the stack above, all my own code using known algorithms) vs
No Code (Azure ML Designer) vs
Auto ML (Azure Automated ML)

Data

Bulldozers Regression Kaggle problem
Data has 412,698 rows and 104 columns which is a good size
Many columns had more than 50% of their data missing, the date was bundled into one column, and there were few numerical columns; most were object (string) columns, which had to be converted to categories
After data transformation and feature engineering on this dataset (https://www.kaggle.com/c/bluebook-for-bulldozers/data ), a transformed dataset is created and used as input to training (the GitHub code includes this transformation part)
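
The kind of transformation described above can be sketched roughly as follows. This is a minimal illustration and not the notebook's actual code: the `preprocess` helper, the median fill, the `_is_missing` flag columns and the three-row demo frame are all assumptions made for the example, and the column names are only placeholders in the style of the Kaggle data.

```python
import pandas as pd


def preprocess(df: pd.DataFrame, date_col: str = "saledate") -> pd.DataFrame:
    """Illustrative clean-up: split the date, fill numerics, encode strings."""
    df = df.copy()

    # Break the single date column into separate numeric features.
    dates = pd.to_datetime(df[date_col])
    df["saleYear"] = dates.dt.year
    df["saleMonth"] = dates.dt.month
    df["saleDay"] = dates.dt.day
    df = df.drop(columns=[date_col])

    for col in list(df.columns):
        if pd.api.types.is_numeric_dtype(df[col]):
            if df[col].isna().any():
                # Keep a flag so the model can still see that a value was missing.
                df[col + "_is_missing"] = df[col].isna()
                df[col] = df[col].fillna(df[col].median())
        else:
            # Strings become integer category codes; missing (-1) shifts to 0.
            df[col] = df[col].astype("category").cat.codes + 1
    return df


# Tiny made-up frame standing in for the Kaggle data (assumed values).
raw = pd.DataFrame({
    "saledate": ["2011-11-16", "2012-03-26", "2010-05-01"],
    "MachineHoursCurrentMeter": [68.0, None, 4640.0],
    "UsageBand": ["Low", None, "High"],
})

# Fraction of missing values per column, highest first.
missing_share = raw.isna().mean().sort_values(ascending=False)

clean = preprocess(raw)
print(missing_share)
print(clean)
```

Checking `missing_share` first is a cheap way to spot the columns with more than 50% missing data before deciding how to handle them.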

ML Model - Trained a regression model in 3 ways

Full code (Python, scikit-learn with XGBRegressor)
No Code (Azure ML Designer with 2 models)
Auto ML (Azure ML)

1. Full code (Scikit-learn with XGBRegressor)

GitHub code here (https://github.com/mahtabsyed/Machine-Learning-Full-code-vs-No-Code-vs-Automated-ML/blob/main/Kaggle_Bulldozers_Regression.ipynb)
Python code
XGBRegressor
Using k-fold cross-validation
Hyperparameter tuning using Optuna (the model without hyperparameter tuning proved to be better) - check the code (link above)
Evaluation - # BEST RMSE and MAE SO FAR!
# MAE: 57.002996
# RMSE: 258.555955


2. No Code (Azure ML Designer)

Models cheat sheet https://docs.microsoft.com/en-us/azure/machine-learning/algorithm-cheat-sheet?

Azure Machine Learning Pipeline - Model 1 and 2
Data - same
Model - Boosted Decision Tree Regression
Evaluation - Not good compared to my full code XGBRegressor, which is roughly 100 times better on MAE and roughly 33 times better on RMSE
# MAE: 5731.167302
# RMSE: 8571.87339
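
The size of the gap between the two approaches can be checked with a few lines of arithmetic. The numbers below are the MAE and RMSE values reported in this post; the dictionary names are just illustrative.

```python
# Metrics reported in this post for the two approaches.
full_code = {"MAE": 57.002996, "RMSE": 258.555955}
designer = {"MAE": 5731.167302, "RMSE": 8571.87339}

# How many times larger the Designer error is than the full-code error.
ratios = {metric: designer[metric] / full_code[metric] for metric in full_code}

for metric, ratio in ratios.items():
    print(f"{metric}: Designer error is {ratio:.1f}x the full-code error")
```

The two metrics give quite different ratios, so it is worth quoting them separately when comparing the models.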

3. Automated ML (Azure ML)

This is quite easy to use
Specify the data and the label
Specify the compute cluster and the cross-validation method (k-fold)
It identifies the task as regression
It runs for quite a long time (about 6 hours). Note: I provisioned a CPU compute cluster, not GPU
Evaluation - Both MAE and RMSE are worse than No Code and quite poor compared to Full code

So, for now (using vanilla model training), Full code wins…

Melbourne, 07 Oct 2022

