Why you need OpenTelemetry based Observability for your AI apps
LLM Observability x OpenTelemetry = Langtrace

Introduction

With the advent of LLMs, modern software development is going through an important shift: from mostly deterministic systems that can be reasoned about with logic to non-deterministic inference endpoints. While LLMs enable developers to build innovative applications that were not possible before, they also create new challenges for software quality, testing and debugging.

New Challenges

How do you get an LLM to respond with only JSON?

One of the biggest challenges with LLMs is their non-deterministic nature. In simple terms, an LLM's behaviour is governed by variables outside your control, such as model weights and training data, so the shape of its response is never guaranteed. A traditional GET endpoint that reads a database and returns a JSON response will return valid JSON every single time, or fail outright. Asking an LLM to return a structured JSON response carries no such guarantee, and identifying why it failed that one time during the day is close to impossible, because the failure depends on variables you cannot control.
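A common way to cope with this in practice is to validate the model's reply and retry on failure. Below is a minimal, standard-library sketch of that pattern; `flaky_llm` is a hypothetical stand-in for a real LLM call, not an API from any particular provider:

```python
import json

def flaky_llm(prompt):
    # Hypothetical stand-in for a real LLM call. It returns valid JSON here,
    # but a real model may occasionally wrap the JSON in prose or truncate it.
    return '{"name": "Ada", "role": "engineer"}'

def get_json_response(prompt, max_retries=3):
    """Ask the model for JSON, validate the reply, and retry on failure."""
    for _ in range(max_retries):
        raw = flaky_llm(prompt + "\nRespond with only a JSON object.")
        try:
            return json.loads(raw)
        except json.JSONDecodeError:
            continue  # re-prompt; each retry is exactly what you'd want traced
    raise ValueError("model never returned valid JSON")

result = get_json_response("Extract the person's name and role.")
```

Each retry is an event worth capturing in telemetry: without a trace of the raw reply that failed to parse, debugging that one bad response is guesswork.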

This has big consequences for software testing, debugging and quality. When you cannot deterministically reproduce a bug, how can you fix it? One way engineering teams are approaching this problem is by putting effective guardrails in place, using a combination of model iteration, fine-tuning and prompt engineering to control the behaviour of these systems. In reality, this will be an ongoing process of continuous improvement toward the dream land of 100% model accuracy and quality. That means defining metrics, having tools to gain visibility, and diligently measuring and tracking those metrics will be of utmost importance to get the most out of these systems.

OpenTelemetry - an important standard

In order to effectively measure these new metrics for applications built with LLMs, we need the ability to generate and capture telemetry data from the LLM interaction layer, which is typically made up of frameworks like Langchain and LlamaIndex, vectorDBs like Pinecone and PgVector, and LLMs like OpenAI, Cohere and Anthropic.

Luckily, we do not need to reinvent the wheel here: significant strides have been made over the last few years toward a standard data model for telemetry spans and traces, driven primarily by OpenTelemetry, a CNCF-incubated project with wide adoption across the industry.

OpenTelemetry defines a standard data model for spans and traces, the building blocks of any observability system, which guarantees minimal intrusion while maximizing high-cardinality data for effective debugging, visibility and incident response. OpenTelemetry also prevents vendor lock-in: because the data model is fixed and widely adopted by libraries, frameworks and the industry at large, teams can switch between observability tools freely.

Langtrace SDK - the first step

An example span generated by Langtrace's Python SDK

The first step towards this goal is to equip teams with OpenTelemetry tracing capabilities for the LLM software layer. With the Langtrace SDK, we are building open source SDKs for popular languages to capture OpenTelemetry-standard traces from LLM frameworks, vectorDBs and LLM providers. These SDKs are fully compatible with any of the available OpenTelemetry exporters, which can send the traces to any storage system or observability backend.

Langtrace Cloud - the observability layer you need

Langtrace Cloud - Traces view

New metrics demand new capabilities from the observability client. These include, but are not limited to:

  • Running tests and manual/automated evaluations.
  • Uploading reference datasets and downloading captured and annotated datasets.
  • Prompt management and versioning.
  • Token usage tracking.

With Langtrace Cloud, we are building a lightweight client that is optimized to solve for the needs above while serving as an additional observability layer alongside your existing observability solution.

Conclusion

LLMs are a fascinating technology that is supercharging a leap in computing. It is important to pick the right set of tools early in the adoption journey to gain the control, visibility and confidence to build and ship high-quality software with new and innovative capabilities.

Links

[1] https://langtrace.ai/

[2] https://docs.langtrace.ai/introduction

[3] https://github.com/Scale3-Labs/langtrace
