
Sharing your Machine Learning models?

Varun Lobo

Data Scientist | Automotive Engineering | Analytics | Agile | Python | SQL | Data Science

Published: May 22, 2023

A lot of time and effort goes into cleaning the dataset, selecting the right model, and then fine-tuning the hyperparameters to reach the desired success metrics. I've always wondered: what next? What is the best way to share an ML model once it has been trained and tested? What is the widely accepted standard across industries?

I recently came across the library called 'pickle'. As per the documentation, "The pickle module implements binary protocols for serializing and de-serializing a Python object structure. “Pickling” is the process whereby a Python object hierarchy is converted into a byte stream, and “unpickling” is the inverse operation, whereby a byte stream is converted back into an object hierarchy."
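The definition above can be sketched in memory before any file is involved. This is a minimal illustration, not taken from the original post: the dictionary named params is a hypothetical stand-in for a model, and pickle.dumps / pickle.loads are the in-memory counterparts of the file-based calls used later.

```python
import pickle

# Hypothetical stand-in for a model: any picklable Python object works.
params = {"weights": [0.5, -1.25, 3.0], "bias": 0.1, "name": "demo"}

# Pickling: the object hierarchy becomes a byte stream.
payload = pickle.dumps(params)

# Unpickling: the byte stream becomes a new, equal object hierarchy.
restored = pickle.loads(payload)

print(type(payload).__name__)  # bytes
print(restored == params)      # True
print(restored is params)      # False
```

The restored object is an equal copy, not the same object, which is what makes it possible to rebuild it in another process or on another machine.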

In simpler terms, you can export your model to a file with the '.pkl' extension, which can then be shared with others. pickle is part of the Python standard library, so there is nothing to install. To use it, you first import it.

import pickle

The next step is to open a file called 'model.pkl' in binary write mode (i.e. 'wb') and use pickle.dump to write our model into the file.

with open('model.pkl', 'wb') as f:
    pickle.dump(model, f)

This will create a file called 'model.pkl' in your current working directory, which can be shared with others.
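Because a relative file name is resolved against wherever the script happens to run, it can help to write to an explicit path. The sketch below is an assumption-laden illustration rather than the author's code: a statistics.NormalDist fitted to a few made-up numbers stands in for a trained model, and a temporary directory stands in for a real output folder.

```python
import pickle
import tempfile
from pathlib import Path
from statistics import NormalDist

# Stand-in for a trained model (assumption): a distribution fitted to samples.
model = NormalDist.from_samples([2.0, 4.0, 4.0, 5.0, 7.0])

# Write to an explicit directory instead of relying on the working directory.
out_dir = Path(tempfile.mkdtemp())
model_path = out_dir / "model.pkl"

# The with-block closes the file even if dump() raises.
with open(model_path, "wb") as f:
    pickle.dump(model, f, protocol=pickle.HIGHEST_PROTOCOL)

print(model_path.exists())            # True
print(model_path.stat().st_size > 0)  # True
```

Passing protocol=pickle.HIGHEST_PROTOCOL is optional; it picks the newest format the running interpreter supports, which older Python versions may not be able to read.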

To load your model from a pickle file, open it in binary read mode ('rb') and use pickle.load. You can then pass input data (e.g. X_test) to the un-pickled model to get predictions.

with open('model.pkl', 'rb') as f:
    pickled_model = pickle.load(f)
pickled_model.predict(X_test)
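A quick way to gain confidence in the saved file is to compare the original and un-pickled models on the same inputs. The following is a self-contained sketch under stated assumptions: statistics.NormalDist stands in for a trained model, its cdf method stands in for predict, and x_test is made-up data.

```python
import pickle
import tempfile
from pathlib import Path
from statistics import NormalDist

# Stand-in for a trained model (assumption, not the author's model).
model = NormalDist.from_samples([2.0, 4.0, 4.0, 5.0, 7.0])
model_path = Path(tempfile.mkdtemp()) / "model.pkl"

# Save the model.
with open(model_path, "wb") as f:
    pickle.dump(model, f)

# Load it back, as a recipient of the file would.
with open(model_path, "rb") as f:
    pickled_model = pickle.load(f)

# cdf() plays the role of predict() here; x_test is illustrative data.
x_test = [3.0, 4.4, 6.5]
before = [model.cdf(x) for x in x_test]
after = [pickled_model.cdf(x) for x in x_test]

print(before == after)  # True
```

One caveat from the pickle documentation applies to anyone receiving such a file: unpickling can execute arbitrary code, so only load pickle files from sources you trust.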

For the entire code on how to split the dataset into training and testing sets, create the model, generate predictions and save the model into a pickle file, here is my GitHub repo for reference.

Advait Raje

Getting more people on bikes using Data, Northwestern Alumni

1 year ago

Pickle and Un-Pickle. Gives me a Chuckle!!

