Mastering Observability with OpenTelemetry and Grafana for FastAPI Applications
Introduction

In the world of distributed systems, ensuring smooth application performance and troubleshooting issues is a complex but essential task. Over the past year, I delved into two powerful open-source tools—OpenTelemetry and Grafana—and explored how they simplify monitoring, debugging, and gaining insights into applications. This article outlines the key takeaways, implementation details, and benefits of using these tools, specifically with FastAPI applications, supported by real-world outputs from my setup.


What is OpenTelemetry?

OpenTelemetry is an open-source observability framework that provides standardized tools for collecting, processing, and exporting telemetry data. It focuses on three key pillars of observability:

  • Traces: Capture the execution path of a request across services, offering end-to-end visibility and helping to pinpoint latency issues.
  • Metrics: Quantitative measurements like CPU usage, memory consumption, request rates, and error rates. These facilitate monitoring the health and performance of applications.
  • Logs: Timestamped records of discrete events within applications that are crucial for diagnosing issues and understanding behavior.


OpenTelemetry Workflow

OpenTelemetry simplifies observability through its structured workflow:

  1. Instrumentation: Use OpenTelemetry APIs/SDKs to collect data from applications. It supports automatic instrumentation for many libraries and frameworks.
  2. Collection: The OpenTelemetry Collector receives telemetry data from applications.
  3. Processing: The collector processes the data, applying transformations, filtering, and aggregation.
  4. Exporting: Processed data is exported to backends like Prometheus, Zipkin, Jaeger, or cloud-based platforms for storage and visualization.


Setting Up OpenTelemetry and Grafana for FastAPI

Three pillars of observability on Grafana

Step 1: Install Required Python Modules

To set up OpenTelemetry with FastAPI, install the following Python modules:

  • opentelemetry-api
  • opentelemetry-sdk
  • opentelemetry-exporter-otlp
  • opentelemetry-instrumentation-fastapi
  • opentelemetry-instrumentation-requests
  • prometheus_client
  • opentelemetry-exporter-zipkin
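
After installing the modules with pip, it can help to confirm that all of them are present in the active environment. The following is a small sketch that uses only the standard library; the helper name installed_versions is my own and not part of any OpenTelemetry package.

```python
from importlib import metadata

# The distributions listed in Step 1.
REQUIRED = [
    "opentelemetry-api",
    "opentelemetry-sdk",
    "opentelemetry-exporter-otlp",
    "opentelemetry-instrumentation-fastapi",
    "opentelemetry-instrumentation-requests",
    "prometheus_client",
    "opentelemetry-exporter-zipkin",
]


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions(REQUIRED).items():
        print(f"{name}: {version or 'MISSING'}")
```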

Step 2: Configure Data Sources

  • Prometheus for Metrics: Download the latest Prometheus tar file (Prometheus Installation), extract it, create a configuration file named ‘prometheus.yml’, and run Prometheus with it. Navigate to ‘http://localhost:9090’ to confirm Prometheus is running.
  • Zipkin for Traces: Run Zipkin using Docker (Zipkin). Navigate to ‘http://localhost:9411’ to confirm Zipkin is running.
  • Loki and Promtail for Logs: Download Loki (Loki Installation) and Promtail (Promtail Installation). Configure them using the loki-config.yml and promtail-config.yml files. Verify Loki at ‘http://localhost:3100/metrics’.
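
Instead of opening each address in a browser, the three checks can be scripted. This is a minimal sketch that assumes the services listen on plain HTTP at their default local ports, which is what an out-of-the-box local install does; adjust the URLs if your setup differs. The helper name is_up is hypothetical.

```python
import urllib.error
import urllib.request

# Assumed default local addresses for the three data sources.
ENDPOINTS = {
    "Prometheus": "http://localhost:9090",
    "Zipkin": "http://localhost:9411",
    "Loki": "http://localhost:3100/metrics",
}


def is_up(url, timeout=2.0):
    """Return True if the URL answers with an HTTP status below 400."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status < 400
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or an HTTP error status.
        return False


if __name__ == "__main__":
    for name, url in ENDPOINTS.items():
        state = "up" if is_up(url) else "DOWN"
        print(f"{name:<10} {url:<32} {state}")
```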

Step 3: Set Up the OpenTelemetry Collector

  • Download the OpenTelemetry Collector binary (Otel Collector).
  • Create a configuration file (otelcol-config.yml) that lists the endpoints of the data sources above, so the Collector knows where to send each type of telemetry.
  • Run the Collector executable with this configuration file.
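
The exact file is not reproduced in this article, but the general shape of a Collector configuration can be sketched. The Python dictionary below mirrors the nesting of the YAML file (receivers, processors, exporters, and the pipelines that connect them). Every endpoint and port in it is a placeholder assumption, not a value from my setup, and the helper undefined_components is hypothetical. Logs are left out here because, in the setup described above, they travel through Promtail to Loki.

```python
import json

# Placeholder values throughout; only the section names follow the Collector's layout.
config = {
    "receivers": {
        "otlp": {
            "protocols": {
                "grpc": {"endpoint": "0.0.0.0:4317"},
                "http": {"endpoint": "0.0.0.0:4318"},
            }
        }
    },
    "processors": {"batch": {}},
    "exporters": {
        "zipkin": {"endpoint": "http://localhost:9411/api/v2/spans"},
        "prometheus": {"endpoint": "0.0.0.0:8889"},  # scraped by Prometheus
    },
    "service": {
        "pipelines": {
            "traces": {"receivers": ["otlp"], "processors": ["batch"], "exporters": ["zipkin"]},
            "metrics": {"receivers": ["otlp"], "processors": ["batch"], "exporters": ["prometheus"]},
        }
    },
}


def undefined_components(cfg):
    """List (pipeline, section, name) for pipeline entries with no matching definition."""
    missing = []
    for pipeline, stages in cfg["service"]["pipelines"].items():
        for section in ("receivers", "processors", "exporters"):
            for name in stages.get(section, []):
                if name not in cfg.get(section, {}):
                    missing.append((pipeline, section, name))
    return missing


if __name__ == "__main__":
    print(json.dumps(config, indent=2))  # printed as JSON only to show the nesting
    print("undefined components:", undefined_components(config))
```

A pipeline that names a component which is never defined is a common reason for the Collector to refuse to start, which is why the sketch checks for it.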

Step 4: Instrument Your FastAPI Application

  • Modify your FastAPI app to emit metrics and traces using OpenTelemetry.
  • Write the code needed to collect metrics and traces for your APIs.
  • Since my FastAPI application runs on a Linux system and is written in Python, I used the OpenTelemetry Python modules listed in Step 1. For Angular-based front ends, OpenTelemetry's JavaScript packages for the browser can be integrated in a similar way.
  • These OpenTelemetry web packages can be installed for an Angular project:

npm install @opentelemetry/api @opentelemetry/sdk-trace-web @opentelemetry/instrumentation-xml-http-request @opentelemetry/instrumentation-fetch
npm install @opentelemetry/instrumentation-document-load @opentelemetry/instrumentation-user-interaction        

Step 5: Set Up Grafana for Visualization

  • Download and install the latest open-source (OSS) version of Grafana and run it as a service.
  • The default username and password are both ‘admin’.
  • Navigate to ‘http://localhost:3000’ to confirm Grafana is running.
  • Install additional plugins such as ‘Infinity-apis’ if you want to query the FastAPI app endpoints directly.
  • Set up data sources by entering their respective URLs and verifying that each connection is active. This can be done entirely through the Grafana user interface. The same data sources can also be described in a YAML provisioning file (e.g., grafana-datasources.yml), which documents all the connected data sources and makes the setup repeatable.
  • Create dashboards.
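
The data sources can also be registered through Grafana's HTTP API instead of the user interface. The sketch below builds one request per data source; it assumes a local Grafana on plain HTTP with the default admin credentials and the default local URLs of Prometheus, Loki, and Zipkin, so treat every value as a placeholder. The helper build_request is hypothetical.

```python
import base64
import json
import urllib.error
import urllib.request

GRAFANA_URL = "http://localhost:3000"  # assumed local Grafana

# Assumed local URLs; "access": "proxy" lets the Grafana server make the calls.
DATA_SOURCES = [
    {"name": "Prometheus", "type": "prometheus", "url": "http://localhost:9090", "access": "proxy"},
    {"name": "Loki", "type": "loki", "url": "http://localhost:3100", "access": "proxy"},
    {"name": "Zipkin", "type": "zipkin", "url": "http://localhost:9411", "access": "proxy"},
]


def build_request(source, user="admin", password="admin"):
    """Build a POST /api/datasources request using HTTP basic authentication."""
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    return urllib.request.Request(
        f"{GRAFANA_URL}/api/datasources",
        data=json.dumps(source).encode(),
        headers={"Content-Type": "application/json", "Authorization": f"Basic {token}"},
        method="POST",
    )


if __name__ == "__main__":
    for source in DATA_SOURCES:
        try:
            with urllib.request.urlopen(build_request(source), timeout=5) as response:
                print(source["name"], "created, status", response.status)
        except (urllib.error.URLError, OSError) as exc:
            print(source["name"], "failed:", exc)
```

Change the default password before exposing Grafana beyond localhost, and prefer a service account token over basic authentication in shared environments.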

YAML Configuration for Observability Setup:

These YAML configurations are vital to connecting the various components of the observability stack, ensuring that logs, metrics, and traces are collected, processed, and visualized effectively. While I cannot share the exact configurations due to sensitive IP addresses and port information, the process involves structuring each file according to the requirements of the respective tool. To ensure seamless observability, I carefully configured the following YAML files for my setup:

  1. Loki Configuration (loki-config.yml)
  2. Promtail Configuration (promtail-config.yml)
  3. Prometheus Configuration (prometheus.yml)
  4. Otel Collector Configuration (otelcol-config.yml)

Below is a glimpse of how I configured Loki, Promtail, and Otel collector YAML files:

otelcol-config.yml
loki-config.yml
promtail-config.yml

Real-World Outputs from My Observability Setup

1. System Health and Resource Utilization

This dashboard monitors system health with metrics like CPU usage, resident memory size (RSS), uptime, and request counts. It also shows the average request duration and the scrape duration for the FastAPI endpoints and the OpenTelemetry Collector. Request count, duration, and performance metrics are tracked per API, for example /api/events and /api/tags_info. The visualizations help pinpoint latency issues and optimize API efficiency.

Comprehensive Metrics for FastAPI
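
Per-endpoint request counts and durations like the ones on this dashboard can be produced with prometheus_client. The following is a minimal sketch, not the code behind the dashboard: the metric names and the handle helper are assumptions, and a private registry is used only to keep the printed output short.

```python
import time

from prometheus_client import CollectorRegistry, Counter, Histogram, generate_latest

# A private registry keeps the output limited to the two metrics defined here.
registry = CollectorRegistry()

REQUESTS = Counter(
    "api_requests", "Total API requests", ["endpoint"], registry=registry
)
DURATION = Histogram(
    "api_request_duration_seconds", "Request duration in seconds", ["endpoint"], registry=registry
)


def handle(endpoint, work):
    """Run `work`, counting the call and timing it under the endpoint label."""
    REQUESTS.labels(endpoint=endpoint).inc()
    with DURATION.labels(endpoint=endpoint).time():
        return work()


handle("/api/events", lambda: time.sleep(0.01))
handle("/api/tags_info", lambda: None)

# The text exposition format is what Prometheus reads on each scrape.
exposition = generate_latest(registry).decode()
print(exposition)
```

Dividing the histogram's _sum series by its _count series gives the average request duration. On Linux, prometheus_client's default registry additionally exposes process metrics such as CPU seconds and resident memory, which is one way figures like RSS can reach a dashboard.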

2. API Performance and Logs Overview

This Grafana dashboard visualizes API performance with metrics like request duration for various endpoints. Logs are categorized by type and level (info, error) to ensure quick identification of issues.

API Performance and Logs Overview

3. Trace Analysis and Log Rates

Detailed trace analysis highlights how requests propagate through the system. The log panel provides detailed insights into the ROS Logger API, displaying trace IDs and other contextual information for effective debugging. The dashboard provides a consolidated view of trace spans and log rates.

Trace Analysis and Log Rates

Benefits Observed

With OpenTelemetry and Grafana, here are the key improvements in monitoring my FastAPI applications:

  1. End-to-End Visibility: Monitor API calls, latency, and trace execution paths across services.
  2. Better Log Insights: Centralized and structured logging simplifies issue diagnosis.
  3. Improved Performance Monitoring: Track CPU usage, memory consumption, and request duration to maintain application health.
  4. Rich Visualizations: Grafana dashboards provide intuitive visual representations of telemetry data.


Conclusion

OpenTelemetry and Grafana together cover the collection, processing, and visualization of telemetry with open-source components. By integrating these tools with your FastAPI or other web applications, you can reduce debugging time, gain better insights into performance, and build a more robust monitoring ecosystem.

If you're exploring observability tools, I highly recommend giving OpenTelemetry and Grafana a try. Feel free to reach out or share your thoughts in the comments!


Lisa Jung

Staff Developer Advocate at Grafana | OTel End User SIG & Communications SIG member

1 month ago

Hey Sakshee Singh! Thank you so much for such a fantastic article. I think this would be a perfect topic for GrafanaCon and would love for you to submit a CFP if you are interested in speaking at our event. I just sent you an email with more details!

Arima Ayanambakkam

Sr. Engineering Manager at Nabors Industries

1 month ago

Great article, Sakshee Singh!
