A BEGINNER GUIDE ON FASTAPI, DOCKER AND HUGGINGFACE FOR SEAMLESS MACHINE LEARNING DEPLOYMENT
UWAYO Valentin
Data Analyst | MSc in a Quantitative Discipline (Financial Engineering)
Introduction
Machine learning models are a potent instrument in the current data-centric era. Nevertheless, deploying these models for practical applications can pose difficulties, particularly for novices in the domain. FastAPI, a contemporary, swift, and highly productive web framework for crafting APIs with Python, stands out as an optimal choice for deploying machine learning models. In this introductory guide, we'll delve into the fundamentals of FastAPI and walk you step by step through the deployment of a machine learning model.
FastAPI
FastAPI is a Python web framework specifically designed for building APIs. It combines the best features of modern web frameworks, making it both user-friendly and high-performance.
Some key benefits that make FastAPI suitable for model deployment compared to other frameworks like Streamlit and Gradio include:
Asynchronous Support
FastAPI’s asynchronous nature allows it to handle multiple requests simultaneously without blocking the main thread. This feature is valuable in situations where you need to respond to many requests quickly, such as real-time applications in the transportation industry (e.g., ride-hailing services) and beyond.
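To make "asynchronous" concrete, here is a minimal standard-library sketch (not FastAPI itself) of the underlying idea: when handlers await non-blocking I/O, several requests can be in flight at once, so total latency tracks the slowest single request rather than the sum of all of them.

```python
import asyncio
import time

async def handle_request(request_id: str, io_delay: float) -> str:
    # Simulate non-blocking I/O work (e.g., a database read or model call)
    await asyncio.sleep(io_delay)
    return f"response-{request_id}"

async def serve_three() -> list:
    # Three requests awaited concurrently: total wall time is roughly the
    # longest single delay (~0.1s), not the 0.3s a sequential server would need.
    start = time.perf_counter()
    results = await asyncio.gather(
        handle_request("a", 0.1),
        handle_request("b", 0.1),
        handle_request("c", 0.1),
    )
    elapsed = time.perf_counter() - start
    print(f"{len(results)} responses in ~{elapsed:.2f}s")
    return results

responses = asyncio.run(serve_three())
```

FastAPI builds this model into the framework itself: declaring an endpoint with `async def` lets the event loop serve other requests while one is waiting on I/O.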
Scalability
FastAPI’s asynchronous capabilities enable it to handle a large number of concurrent requests without affecting performance. Whether it’s healthcare, finance, or e-commerce, FastAPI can seamlessly scale to meet the demands of various industries. Here are some examples to illustrate the scalability of FastAPI:
Healthcare:
In the healthcare industry, FastAPI’s scalability is crucial. Imagine a telemedicine platform integrated with machine learning, allowing doctors to predict patient health risks based on their medical history. For example, the platform could forecast the likelihood of a patient developing sepsis or other critical conditions. FastAPI, with its asynchronous support, ensures that real-time data input and prediction are possible even during high demand, offering instant health risk assessments across many hospitals and clinics without causing delays. FastAPI’s asynchronous nature allows it to handle multiple requests concurrently, ensuring that doctors and patients receive timely responses and predictions.
Finance
Imagine a dynamic stock trading platform where users actively buy and sell stocks throughout the trading day. FastAPI is pivotal in processing these transactions in real-time, ensuring smooth operations even during market volatility without compromising performance. In this stock trading scenario, machine learning steps in to predict stock price movements. Traders can seamlessly receive real-time predictions regarding whether a stock is likely to rise or fall. FastAPI’s exceptional concurrency handling capabilities guarantee that many users can access stock predictions without delays. This facilitates timely trading decisions and enhances the overall trading experience.
E-commerce:
In e-commerce, FastAPI can enhance e-commerce platforms by using machine learning to offer personalized product recommendations. It can analyze user activities, like browsing history and past purchases, to suggest products that match their preferences, creating a better shopping experience and boosting sales. FastAPI’s ability to work asynchronously means it can process data quickly and provide instant recommendations to numerous users simultaneously. This is especially valuable for e-commerce platforms with many customers shopping simultaneously, especially during peak seasons like Black Friday.
Transportation (Ride-Hailing):
During peak hours, when many users simultaneously use a ride-hailing app, FastAPI’s asynchronous support becomes even more crucial. As ride demand surges during rush hours, the app experiences a high volume of users requesting and taking trips. FastAPI excels in handling this increased traffic efficiently. It continuously updates and delivers real-time information, such as expected time of arrival (ETA) for drivers to passengers and dynamic ETAs for trip completion.
FastAPI’s asynchronous capabilities ensure the system can simultaneously process and respond to numerous user requests without noticeable lag. For example, it can provide thousands of passengers with individualized ETAs for their respective rides and drivers’ locations, all in real time. FastAPI’s quick response time and scalability make it suitable for addressing the high user volumes typical of peak hours in ride-hailing services. This results in a smooth, responsive, and user-friendly experience for everyone, even during the busiest times of the day.
Performance
FastAPI is one of the fastest web frameworks available. Its speed makes it suitable for real-time machine learning applications, as previously mentioned, providing instant responses to user requests.
Data Validation
FastAPI provides a simplified and automatic data validation and serialization approach using the Pydantic library, ensuring the data sent to the model is correct.
Data validation is the process of checking data to ensure it meets certain criteria, e.g., accuracy, completeness, and correct data types.
It helps in defining the structure of data and automatically validates incoming requests.
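FastAPI performs this validation automatically via Pydantic (shown in detail later). As a plain-Python illustration of the underlying idea — checking required fields and expected types against a schema — consider this hypothetical sketch (the field names here are invented for the example):

```python
def validate_record(record: dict) -> list:
    """Return a list of validation errors; an empty list means the record is valid."""
    # Hypothetical schema: required field name -> expected type
    schema = {"patient_id": int, "temperature": float, "age": int}
    errors = []
    for field, expected_type in schema.items():
        if field not in record:
            errors.append(f"{field}: field required")
        elif not isinstance(record[field], expected_type):
            errors.append(f"{field}: expected {expected_type.__name__}")
    return errors

print(validate_record({"patient_id": 7, "temperature": 38.2, "age": 61}))  # []
print(validate_record({"patient_id": "7", "age": 61}))  # two errors reported
```

Pydantic does all of this (and much more, such as type coercion and nested models) from a simple class declaration, so you never write validation loops like this by hand.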
Beginner-Friendly and Pythonic
FastAPI is beginner-friendly, especially for those already familiar with Python, and it embraces Pythonic conventions, facilitating a smooth transition into web development for Python developers of all levels. If you’re new to the field, don’t worry — we’ll guide you through the basics.
FastAPI and HTTP
App Instantiation
In FastAPI, it all begins with creating an instance of the FastAPI class. This instance serves as the core of the application. You can customize it with various options, such as specifying the title, description, and version. Here's how you create the FastAPI application:
from fastapi import FastAPI
# Instantiate the FastAPI application
app = FastAPI()
Defining Endpoints
FastAPI relies on HTTP (Hypertext Transfer Protocol) for communication. Within FastAPI, these communication points are referred to as “endpoints.” These endpoints act as HTTP request handlers, dictating how the API responds to various types of requests. The primary HTTP request types are GET (read data), POST (create data), PUT (update or create data), and DELETE (remove data).
In FastAPI, endpoints are Python functions that you decorate with HTTP operation methods, specifying the path at which they are accessible. These functions are the heart of the API, as they determine how the application responds to different types of HTTP requests.
Let’s explore how to define endpoints for different types of HTTP requests:
Reading Data with GET Requests
The @app.get decorator is used to define an endpoint that handles GET requests. It's perfect for reading data from the server.
@app.get("/items")
async def read_items(): ...
Creating Data with POST Requests
To handle POST requests for creating resources, use the @app.post decorator. Here's an example:
@app.post("/items")
async def create_item(item: dict): ...
Updating Data with PUT Requests
To update an existing resource or create a new one, use the @app.put decorator. Example:
@app.put("/items/{item_id}")
async def update_item(item_id: int, item: dict): ...
Deleting Data with DELETE Requests
For deleting resources, the @app.delete decorator is used. Here's an example:
@app.delete("/items/{item_id}")
async def delete_item(item_id: int): ...
Endpoints are crucial in shaping an API and specifying how it interacts with clients. They define the core functionality and behaviour of the FastAPI application.
Data Models and Pydantic
FastAPI uses Pydantic models to define the data structures of an API, both for request (input) and response (output) data. Pydantic is a Python data validation and parsing library that makes it easy to define data models with validation rules. This not only ensures that the data coming into an API is valid but also automatically generates interactive documentation for the API.
In the context of FastAPI, the Pydantic library does the data validation by allowing one to define the data structure using type hints, ensuring that incoming data meets the API’s defined structure.
Here’s an example code snippet from a sepsis prediction project:
from pydantic import BaseModel
class InputData(BaseModel):
    PRG: int
    PL: float
    PR: float
    SK: float
    TS: int
    M11: float
    BD2: float
    Age: int

# Define the output data structure using Pydantic BaseModel
class OutputData(BaseModel):
    Sepsis: str
In the provided code, two Pydantic models are defined, InputData and OutputData, which specify the expected format of input data and the data structure to be returned in the API response (output data), respectively. These models will be further explained and utilized later in the FastAPI application for Sepsis prediction.
Let’s apply these concepts to a practical example: deploying a sepsis prediction model with FastAPI. We’ll go through each section of code step by step.
FastAPI Deployment for Sepsis Prediction
In the world of healthcare, early detection of life-threatening conditions can make the difference between life and death. Sepsis, a severe infection that can lead to organ failure, is one such condition where timely diagnosis is critical. In this article, I’ll take you on a journey through creating a user-friendly Sepsis Prediction API using FastAPI.
The Problem: Detecting Sepsis
Sepsis is a critical concern in healthcare since it’s a medical emergency where the body’s response to infection can lead to tissue damage, organ failure, and even death if not detected and treated in time.
Sepsis requires a quick and accurate diagnosis. To tackle this problem, we’ll leverage a machine learning model I had earlier trained in Part 1 (Link to Part 1).
However, it’s not just about the model; it’s also about making it accessible to healthcare professionals and those without a technical background.
The Solution: FastAPI
In Part 1, we focused on loading the data, conducting the EDA, feature engineering, training and evaluating the machine learning model, and exporting it with other key components. In Part 2, we will dive into setting up the FastAPI, Docker containerization, and deployment, making the sepsis prediction API accessible to the public.
Project Structure
Start by setting up a directory structure for the project. Here’s the structure I used:
sepsis_prediction (Project's Root Directory)
├── src
│   ├── app.py (your FastAPI application)
│   ├── Dockerfile (your Dockerfile for containerizing the FastAPI app with Docker)
│   ├── build.sh (file with instructions for building the container and running it)
│   └── model_and_key_components.pkl
├── venv (your virtual environment)
├── .gitignore
└── requirements.txt
Virtual Environment Activation
Before running the FastAPI application, create a virtual environment and install the necessary dependencies. From your project directory, run the following in your terminal:
Create and activate your virtual environment, then install the dependencies:
# On Windows:
python -m venv venv; venv\Scripts\activate; python -m pip install -q --upgrade pip; python -m pip install -qr requirements.txt
# On macOS and Linux:
python3 -m venv venv; source venv/bin/activate; python -m pip install -q --upgrade pip; python -m pip install -qr requirements.txt
This is how the code provided above creates your virtual environment:
# Create a virtual environment
python -m venv venv
# Activate the virtual environment
# On Windows:
venv\Scripts\activate
# On macOS and Linux:
source venv/bin/activate
# Install dependencies
pip install -r requirements.txt
Building the FastAPI Application
Create a FastAPI application in a file called app.py in the src directory, then follow these steps:
Importing Necessary Libraries
Our journey begins by importing the tools we’ll use in our project.
# Import the relevant libraries
from fastapi import FastAPI, HTTPException
from pydantic import BaseModel
import pickle
import pandas as pd
These libraries include FastAPI for creating the API, Pydantic for defining data structures, and others for data processing.
Creating a FastAPI Application
A FastAPI application is like a canvas for your project. Think of it as the home for your prediction tool. In this step, we create it and define some basic information, such as the title, description, and version of our API.
app = FastAPI(
    title="Sepsis Prediction API",
    description="This FastAPI application provides sepsis predictions using a machine learning model.",
    version="1.0"
)
The title and description provide an introduction to our API.
Loading the Model and Key Components
Our superhero in this story is the trained machine learning model for predicting sepsis. We must load this model and its key components saved in Part 1. These had been saved as follows using Pickle:
# Create a dictionary to store the components
saved_components = {
    'model': tuned_gb,
    'encoder': label_encoder,
    'scaler': scaler
}

# Save all components in a single pickle file
with open('model_and_key_components.pkl', 'wb') as file:
    pickle.dump(saved_components, file)
Therefore, we open the treasure chest and retrieve the model, encoder, and scaler using Pickle again.
# Load the model and key components
with open('model_and_key_components.pkl', 'rb') as file:
    loaded_components = pickle.load(file)
loaded_model = loaded_components['model']
loaded_encoder = loaded_components['encoder']
loaded_scaler = loaded_components['scaler']
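As a quick self-contained illustration of this save-and-load round trip, the same pattern works with any picklable objects. The dictionary below uses stand-in placeholders, not the real model, encoder, and scaler:

```python
import io
import pickle

# Stand-ins for the real trained objects (hypothetical placeholders)
components = {
    "model": {"name": "tuned_gb", "n_estimators": 100},
    "encoder": ["Negative", "Positive"],
    "scaler": {"mean": 0.0, "std": 1.0},
}

# Serialize to an in-memory buffer (a file opened with 'wb' works the same way)
buffer = io.BytesIO()
pickle.dump(components, buffer)

# Deserialize (a file opened with 'rb' works the same way)
buffer.seek(0)
loaded = pickle.load(buffer)

print(loaded["model"]["name"])  # the components come back intact
```

Bundling all three components in one dictionary keeps them in sync: the model can only ever be loaded together with the exact encoder and scaler it was trained with.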
Defining Input and Output Structures
We now create structures for our data. Think of this as creating a form to enter data and specifying the format of the result.
class InputData(BaseModel):
    PRG: int
    PL: float
    PR: float
    SK: float
    TS: int
    M11: float
    BD2: float
    Age: int

class OutputData(BaseModel):
    Sepsis: str
The first class, InputData, is a Pydantic model representing the expected format of input data, which includes fields such as 'PRG,' 'PL,' 'PR,' 'SK,' 'TS,' 'M11,' 'BD2,' and 'Age.' Each field has a defined data type (e.g., int or float).
The second class, OutputData, defines the data structure that the API will return as a response. In this case, it has a single field, 'Sepsis,' with a string (str) data type.
Using Pydantic models ensures that the incoming data matches the expected format and that the outgoing data adheres to the specified structure, helping ensure data consistency and correctness in the FastAPI application.
Preprocessing Input Data
To make our predictions, we need to prepare the input data. We first check if any data needs encoding (in this case, not needed), then scale the numerical data and convert it back to a pandas DataFrame.
def preprocess_input_data(input_data: InputData):
    # Encode categorical variables (if needed)
    # All columns are numerical, so no encoding is required.
    # Apply scaling to the numerical data
    numerical_cols = ['PRG', 'PL', 'PR', 'SK', 'TS', 'M11', 'BD2', 'Age']
    input_data_scaled = loaded_scaler.transform([list(input_data.dict().values())])
    return pd.DataFrame(input_data_scaled, columns=numerical_cols)
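Under the hood, the loaded scaler (assumed here to be a standard scaler fitted in Part 1) applies a simple feature-wise transformation. A plain-Python sketch of that idea, with illustrative rather than real fitted statistics:

```python
def standard_scale(values, means, stds):
    """Feature-wise z-score scaling: z = (x - mean) / std.

    This mirrors what a fitted StandardScaler's transform() applies;
    the means and stds would come from the training data in Part 1.
    """
    return [(x - m) / s for x, m, s in zip(values, means, stds)]

# Hypothetical fitted statistics for two features
scaled = standard_scale([148.0, 50.0], [120.0, 40.0], [14.0, 10.0])
print(scaled)  # [2.0, 1.0]
```

Applying the same scaling at prediction time that was used at training time is essential: feeding the model raw, unscaled values would silently produce wrong predictions.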
Making Predictions
This is where the magic happens. We use our loaded model to predict sepsis. It’s like the climax of our story.
def make_predictions(input_data_scaled_df: pd.DataFrame):
    y_pred = loaded_model.predict(input_data_scaled_df)
    sepsis_mapping = {0: 'Negative', 1: 'Positive'}
    return sepsis_mapping[y_pred[0]]
Defining Endpoints
Endpoints in FastAPI are like doors that define how clients can interact with your application.
@app.get("/")
async def root():
    # Endpoint at the root URL ("/") returns a welcome message with a clickable link
    message = "Welcome to your Sepsis Classification API! Click [here](/docs) to access the API documentation."
    return {"message": message}

@app.post("/predict/", response_model=OutputData)
async def predict_sepsis(input_data: InputData):
    try:
        input_data_scaled_df = preprocess_input_data(input_data)
        sepsis_status = make_predictions(input_data_scaled_df)
        return {"Sepsis": sepsis_status}
    except Exception as e:
        # Handle exceptions and return an error response
        raise HTTPException(status_code=500, detail=str(e))
In our Sepsis Classification API, we have two primary endpoints: the “welcome” door/endpoint and the “prediction” door/endpoint. Here’s what they do:
Welcome Door (GET Request)
@app.get("/")
async def root():
    # Endpoint at the root URL ("/") returns a welcome message with a clickable link
    message = "Welcome to your Sepsis Classification API! Click [here](/docs) to access the API documentation."
    return {"message": message}
The @app.get("/") decorator defines an endpoint that handles GET requests to the root URL ("/"). When a client, such as a web browser or another application, sends a GET request to the root URL, our FastAPI application responds with a welcome message. In this case, the message provides a clickable link that directs users to the API documentation. This is a read-only operation since it doesn't change the server's state.
Prediction Door (POST Request)
@app.post("/predict/", response_model=OutputData)
async def predict_sepsis(input_data: InputData):
    try:
        input_data_scaled_df = preprocess_input_data(input_data)
        sepsis_status = make_predictions(input_data_scaled_df)
        return {"Sepsis": sepsis_status}
    except Exception as e:
        # Handle exceptions and return an error response
        raise HTTPException(status_code=500, detail=str(e))
The @app.post("/predict/") decorator defines an endpoint that handles POST requests to the "/predict/" URL. This endpoint is responsible for predicting sepsis based on the input data provided by the client.
Here’s what happens in this endpoint:
i. It receives the input data — the set of features needed for sepsis prediction — in the body of a POST request, validated against the InputData model.
ii. It preprocesses the input data by scaling the numerical features via preprocess_input_data.
iii. It passes the scaled data to the loaded model via make_predictions, which maps the numeric result to 'Negative' or 'Positive'.
iv. It returns the prediction as a JSON response matching the OutputData model.
However, if any errors occur during this process, such as invalid input data or issues with the prediction, the endpoint raises an HTTPException with a status code of 500, indicating an internal server error, and provides details about the error.
These endpoints are the core functionality of your Sepsis Classification API, defining how it responds to different types of requests, and enabling clients to access the welcome message and obtain sepsis predictions.
Running Your FastAPI Application
This step is like turning on a light switch, and our application becomes alive.
if __name__ == "__main__":
    import uvicorn
    # Run the FastAPI application on the local host and port 8000
    uvicorn.run(app, host="127.0.0.1", port=8000)
You can now access your Sepsis Prediction API at http://127.0.0.1:8000.
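With the server running locally, you could exercise the /predict/ endpoint from Python using only the standard library. The feature values below are illustrative sample inputs, not real patient data:

```python
import json
from urllib import request

API_URL = "http://127.0.0.1:8000/predict/"  # the local dev server started above

def predict_sepsis(payload: dict) -> dict:
    """POST a JSON payload to the prediction endpoint and parse the JSON reply."""
    req = request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with request.urlopen(req) as response:
        return json.loads(response.read())

# Illustrative feature values matching the InputData model's fields
sample = {"PRG": 6, "PL": 148.0, "PR": 72.0, "SK": 35.0,
          "TS": 0, "M11": 33.6, "BD2": 0.627, "Age": 50}
# result = predict_sepsis(sample)  # returns a dict like {"Sepsis": ...} once the server is up
```

Alternatively, the interactive /docs page generated by FastAPI lets you send the same request from the browser without writing any client code.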
This FastAPI-powered tool has created a bridge between cutting-edge machine learning and real-world healthcare. It’s like having a guardian angel for early sepsis detection. And the best part? You don’t need to be a tech wizard to use it.
Let’s proceed to containerize our FastAPI with Docker.
Docker Containerization and Deployment
Docker allows us to package our application and its dependencies into a standardized unit known as a container. This makes it incredibly easy to deploy our application consistently across different environments. Before we jump into the code, let’s break down Docker for beginners.
Understanding Docker
Docker is a platform for developing, shipping, and running applications in containers.
Containers are lightweight and portable units that include everything your application needs to run, such as code, runtime, libraries, and system tools. This eliminates the classic “it works on my machine” problem, making deployments more reliable and efficient.
Components of a Docker Container
Before we examine our Dockerfile, let’s understand the essential components of a Docker container:
Dockerfile: This is like a recipe for building a Docker image. It specifies the base image, sets up the environment, and defines how your application is configured and run.
Docker Image: An image is a lightweight, stand-alone, executable package that includes everything needed to run a piece of software, including the code, a runtime, libraries, and environment variables.
Docker Container: A container is a running instance of an image. You can think of it as a lightweight, isolated environment where your application runs.
Now, let’s break down the Dockerfile for our FastAPI application.
Create a Dockerfile
In this step, I created a Dockerfile to define how to build a Docker image for the FastAPI application. Each line in the Dockerfile represents a specific command or an instruction.
# Use the official Python image as a parent image
FROM python:3.11.3-slim
# Set the working directory within the container
WORKDIR /app
# Copy your FastAPI application code and the saved model components into the container
COPY src/app.py /app
COPY src/model_and_key_components.pkl /app
# Copy the requirements.txt file into the container
COPY requirements.txt /app
# Install the Python dependencies
RUN pip install -r /app/requirements.txt
# Expose port 7860 for the FastAPI application
EXPOSE 7860
# Define the command to run your FastAPI application
CMD ["uvicorn", "app:app", "--host", "0.0.0.0", "--port", "7860", "--reload"]
Dockerfile Code Explanation:
# Use the official Python image as a parent image
FROM python:3.11.3-slim
Using the “slim” variant of the base image in our Dockerfile is a strategic choice for several reasons:
Reduced Container Size: The “slim” variant is a stripped-down version of the base image, which means it contains only the essential components needed to run Python. This results in a much smaller container size. Smaller containers are quicker to build, transfer, and deploy. This is particularly important in scenarios where fast deployment and efficient resource usage are crucial.
Improved Security: By excluding unnecessary components, the “slim” variant reduces the potential attack surface in our container. This means fewer parts could be vulnerable to security threats. While using the official Python image provides a secure base, the “slim” variant takes an extra step to minimize potential risks.
Efficient Resource Usage: The lightweight nature of the “slim” variant ensures that your container consumes fewer resources, which is especially beneficial in resource-constrained environments. It’s a more efficient choice for running your FastAPI application when resources like memory and CPU are limited.
Faster Building and Deployment: Smaller containers can be built and deployed more quickly. Whether you’re building containers frequently during development or deploying them to a production environment, the reduced container size leads to faster processes overall.
# Set the working directory within the container
WORKDIR /app
# Copy your FastAPI application code and the saved model components into the container
COPY src/app.py /app
COPY src/model_and_key_components.pkl /app
# Copy the requirements.txt file into the container
COPY requirements.txt /app
# Install the Python dependencies
RUN pip install -r /app/requirements.txt
# Expose port 7860 for the FastAPI application
EXPOSE 7860
# Define the command to run your FastAPI application
CMD ["uvicorn", "app:app", "--host", "0.0.0.0", "--port", "7860", "--reload"]
Let’s break down this specific CMD instruction: uvicorn is the ASGI server that runs the application; app:app is the module name (app.py) followed by the FastAPI instance name within it; --host 0.0.0.0 binds the server to all network interfaces so it is reachable from outside the container; --port 7860 matches the EXPOSE instruction above; and --reload restarts the server on code changes, which is handy in development and usually omitted in production.
Assuming your FastAPI application file were named "main.py" instead of "app.py," you would adjust this part of the CMD instruction to "main:app" to indicate the module name and the FastAPI instance within the "main.py" file.
Now that we’ve dissected our Dockerfile, we’re ready to proceed to the next steps of building and running the Docker container.
Building and Running the Docker Container
Now that we’ve defined our Dockerfile, it’s time to use it to build a Docker image and run a container from that image. Containerization offers tremendous advantages in ensuring your application runs consistently across different environments.
Building the Docker Container
We’ll use a script to simplify the build process. Here’s a breakdown of the commands within the script:
# Build the Docker container for the FastAPI application
docker build -t sepsis_fastapi -f src/Dockerfile .
Listing Docker Images
After the image is successfully built, you can list all the Docker images to verify that the image was created correctly:
# List all Docker images
docker images
Running the Docker Container Locally
Once you’ve built the Docker image, you can run it as a Docker container using the following command:
# Run the Docker container locally
docker run -p 7860:7860 --name sepsis_fastapi sepsis_fastapi
Listing Running Docker Containers
To check if your container is running, you can list all running Docker containers using the following command:
# List running Docker containers
docker ps
Now, the FastAPI application is up and running within a Docker container, ready for deployment. This containerization step is a significant milestone in ensuring your application runs consistently across different environments. Now, let’s test the FastAPI.
Test The FastAPI
(Screenshots: the interactive /docs page showing the /predict request before execution, and the Sepsis prediction returned after execution.)
Based on the inputs I provided the API, the Sepsis prediction is Positive, indicating that the patient whose details were fed into the API has a high risk of developing Sepsis.
Conclusion
Right from the outset, we’ve set off on an illuminating voyage in the realm of machine learning implementation using FastAPI, Docker, and Hugging Face. Leveraging the ease and potency of FastAPI, we’ve constructed a sturdy API for instantaneous Sepsis forecasts. The containerization provided by Docker guarantees our application’s seamless operation, even in the most intricate environments. Furthermore, Hugging Face unlocks a universe of opportunities by providing a platform to disseminate and utilize what we’ve developed. As we bid adieu, bear in mind that the strength of this triumvirate can transform your approach to deploying machine learning models, democratizing AI for all, irrespective of their technical proficiency.
Enjoy your coding journey, and may your deployment ventures be prompt and prosperous!
Note: This article provides a high-level overview of the project. For detailed code and implementation, refer to the provided code snippets and the associated GitHub repository.
GitHub Repository:
Hugging Face: