Day 24: Hands-on Practice - MLOps


Deploy and Monitor a Model Using TensorFlow Serving, and Set Up a Basic Monitoring Dashboard




Introduction

Deploying machine learning models is a critical step in the ML lifecycle. Beyond just building and training a model, deploying it into a production environment allows applications to utilize it in real-time or batch processes. TensorFlow Serving is a popular tool designed for serving TensorFlow models. It offers a flexible, high-performance serving system that allows seamless deployment and model management. Alongside deployment, monitoring ensures that the model operates effectively and provides the expected performance under various conditions.

This article will walk through deploying a TensorFlow model using TensorFlow Serving, setting up a basic monitoring dashboard, and integrating metrics to keep track of the deployed model's performance.




Part 1: TensorFlow Serving Overview

What is TensorFlow Serving?

TensorFlow Serving is a serving system specifically built for production ML use cases. Its key features include:

  1. Model Management: Automatically handles multiple versions of models, enabling smooth upgrades or rollbacks (a sketch of the expected directory layout follows this list).
  2. High Performance: Designed for low-latency and high-throughput serving, suitable for real-time applications.
  3. Flexibility: Supports models built with TensorFlow and other machine learning frameworks using custom plugins.
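
To make version management concrete, TensorFlow Serving watches a model base path and treats each numbered subdirectory as a model version, serving the newest version by default. A sketch of the expected on-disk layout (paths are illustrative):

/models/my_model/
    1/                  # first exported version (SavedModel contents)
        saved_model.pb
        variables/
    2/                  # newer version; Serving picks it up automatically
        saved_model.pb
        variables/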




Part 2: Deploying a TensorFlow Model

Step 1: Train and Save a TensorFlow Model

Before deploying a model, you need a trained TensorFlow model saved in the SavedModel format. For demonstration, let’s use a basic example:

import tensorflow as tf
import numpy as np
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Dense

# Create a simple model
model = Sequential([
    Dense(64, activation='relu', input_shape=(4,)),
    Dense(32, activation='relu'),
    Dense(3, activation='softmax')  # For multi-class classification
])

# Compile the model
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])

# Generate some dummy data
X_train = np.random.rand(1000, 4)
y_train = np.random.randint(0, 3, 1000)

# Train the model
model.fit(X_train, y_train, epochs=5, batch_size=32)

# Save the model in the TensorFlow SavedModel format.
# TensorFlow Serving expects a numbered version subdirectory (e.g., my_model/1/).
model.save('my_model/1/')

The SavedModel format stores the model’s architecture, weights, and optimizer configuration, making it easy to deploy.
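
Before deploying, it can help to confirm the exported serving signature, i.e., the input and output tensor structure the REST API will accept. A minimal sketch, assuming the model was saved to my_model/1/ as above:

import tensorflow as tf

# Reload the SavedModel and inspect its default serving signature.
loaded = tf.saved_model.load('my_model/1/')
serving_fn = loaded.signatures['serving_default']

print("Inputs:", serving_fn.structured_input_signature)
print("Outputs:", serving_fn.structured_outputs)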




Step 2: Install and Set Up TensorFlow Serving

TensorFlow Serving can be installed using Docker for simplicity:

  1. Pull the TensorFlow Serving Docker image:

docker pull tensorflow/serving

  2. Run TensorFlow Serving with the SavedModel. Assuming your model is saved at /path/to/my_model/ (with a version subdirectory such as 1/), map this directory into the container and start TensorFlow Serving:

docker run -p 8501:8501 --name=tf_serving \
  --mount type=bind,source=/path/to/my_model/,target=/models/my_model \
  -e MODEL_NAME=my_model -t tensorflow/serving

This command exposes the REST API on port 8501, mounts the local model directory at /models/my_model inside the container, and tells TensorFlow Serving which model to load via the MODEL_NAME environment variable.

  3. Verify the deployment: TensorFlow Serving provides RESTful endpoints. You can check whether the model is being served by sending a request to the model status endpoint:

curl http://localhost:8501/v1/models/my_model
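
The same check can be done from Python; a minimal sketch, assuming the container above is running locally on port 8501:

import requests

# Query TensorFlow Serving's model status endpoint.
status = requests.get("http://localhost:8501/v1/models/my_model")
status.raise_for_status()

# The response lists the loaded model versions and their state (e.g., "AVAILABLE").
print(status.json())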




Step 3: Make Predictions

You can send inference requests using tools like curl or Python libraries. Here's an example in Python:

import requests

# Create dummy input data
input_data = {
    "signature_name": "serving_default",  # Default signature
    "instances": [[0.1, 0.2, 0.3, 0.4]]   # Example input
}

# Send a POST request to TensorFlow Serving's REST predict endpoint
url = "http://localhost:8501/v1/models/my_model:predict"
response = requests.post(url, json=input_data)

# Print the response
print("Predictions:", response.json())




Part 3: Monitoring the Deployed Model

Once the model is deployed, monitoring it is crucial to ensure smooth operation, detect issues, and track metrics like latency, throughput, and errors.




Step 1: Integrating Monitoring Metrics

TensorFlow Serving provides built-in support for monitoring via Prometheus; the metrics are exposed once a monitoring configuration is enabled (see the sketch after this list). Useful metrics include:

  • Request count: Number of inference requests.
  • Latency: Time taken to process requests.
  • Errors: Rate of errors in serving requests.
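
These metrics come from TensorFlow Serving's Prometheus endpoint, which is enabled through a monitoring configuration file passed at startup. A minimal sketch, assuming the file is saved as monitoring_config.txt (an arbitrary name):

# monitoring_config.txt (protobuf text format read by TensorFlow Serving)
prometheus_config {
  enable: true,
  path: "/monitoring/prometheus/metrics"
}

The file is mounted into the container and referenced with the --monitoring_config_file flag, for example by extending the docker run command from Part 2:

docker run -p 8501:8501 --name=tf_serving \
  --mount type=bind,source=/path/to/my_model/,target=/models/my_model \
  --mount type=bind,source=/path/to/monitoring_config.txt,target=/models/monitoring_config.txt \
  -e MODEL_NAME=my_model -t tensorflow/serving \
  --monitoring_config_file=/models/monitoring_config.txt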




Step 2: Setting Up Prometheus

  1. Install Prometheus: Download and install Prometheus from the official site.

  2. Configure Prometheus: Create a configuration file (e.g., prometheus.yml) to scrape metrics from TensorFlow Serving:

global:
  scrape_interval: 15s  # How often to scrape targets by default.

scrape_configs:
  - job_name: 'tensorflow_serving'
    metrics_path: '/monitoring/prometheus/metrics'  # Path enabled via the monitoring config above
    static_configs:
      - targets: ['localhost:8501']  # TensorFlow Serving REST endpoint

  3. Run Prometheus: Start Prometheus using the configuration file:

prometheus --config.file=prometheus.yml

  4. Access the Prometheus dashboard: Prometheus runs on port 9090 by default. Open http://localhost:9090 in your browser.
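
To confirm that scraping works end to end, you can also query Prometheus over its HTTP API from Python; the built-in up metric reports 1 for every target Prometheus can reach:

import requests

# Ask Prometheus (default port 9090) to evaluate an instant query.
resp = requests.get(
    "http://localhost:9090/api/v1/query",
    params={"query": "up{job='tensorflow_serving'}"},
)
print(resp.json()["data"]["result"])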




Step 3: Visualize Metrics with Grafana

Prometheus metrics can be visualized using Grafana for better insights.

  1. Install Grafana: Download and install Grafana from the official site.
  2. Set Up a Prometheus Data Source: In Grafana, add a data source of type Prometheus and point it at http://localhost:9090.
  3. Create a Dashboard: Add panels that chart Prometheus queries for request rate, latency, and error rate.

Example Prometheus query for 95th-percentile request latency (the exact histogram metric name depends on what your serving setup exports):

histogram_quantile(0.95, sum(rate(http_server_requests_duration_seconds_bucket[1m])) by (le))
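
If you prefer to configure the data source as code, Grafana also supports provisioning; a minimal sketch of a provisioning file placed in Grafana's provisioning/datasources/ directory (the file name is arbitrary):

# prometheus-datasource.yml (assumes Prometheus at localhost:9090)
apiVersion: 1

datasources:
  - name: Prometheus
    type: prometheus
    access: proxy
    url: http://localhost:9090
    isDefault: true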




Part 4: Automating Monitoring Alerts

  1. Set Up Alert Rules in Prometheus: Reference an alert rules file and an Alertmanager target in prometheus.yml. For example:

alerting:
  alertmanagers:
    - static_configs:
        - targets: ['localhost:9093']

rule_files:
  - "alerts.yml"

Example alert rule for high latency in alerts.yml (adjust the histogram metric name to whatever your setup exports):

groups:
  - name: tensorflow_serving_alerts
    rules:
      - alert: HighLatency
        expr: histogram_quantile(0.95, rate(http_server_requests_duration_seconds_bucket[5m])) > 1
        for: 2m
        labels:
          severity: warning
        annotations:
          summary: "High latency detected"
          description: "95th percentile latency is greater than 1s for more than 2 minutes."

  2. Integrate Alertmanager: Configure Alertmanager to send notifications (e.g., email, Slack) when alerts are triggered.
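
A minimal Alertmanager configuration sketch for Slack notifications; the webhook URL and channel below are placeholders to replace with your own:

# alertmanager.yml
route:
  receiver: 'slack-notifications'

receivers:
  - name: 'slack-notifications'
    slack_configs:
      - api_url: 'https://hooks.slack.com/services/REPLACE/WITH/WEBHOOK'
        channel: '#ml-alerts'

Start it with alertmanager --config.file=alertmanager.yml so that Prometheus can forward alerts to localhost:9093 as configured above.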




Conclusion

By following these steps, you’ve learned to deploy a TensorFlow model using TensorFlow Serving, monitor its performance with Prometheus, and visualize metrics in Grafana. This end-to-end approach ensures that your model deployment is not just functional but also robust, reliable, and scalable for production workloads.

Key takeaways:

  • TensorFlow Serving simplifies model deployment and management.
  • Monitoring with Prometheus provides essential insights into model behavior.
  • Dashboards like Grafana enhance visibility and help diagnose issues quickly.

With this foundation, you can further explore advanced topics like scaling TensorFlow Serving with Kubernetes, integrating A/B testing, or implementing real-time feedback loops to improve model performance.
