DS Fortune Cookies: LangChain, Agents, and Authentication

“Embrace LangChain's evolution and your spirit will be unbreakable, unlike your code.”

This fortune cookie clears up some confusion around LangChain, agents, and authentication. LangChain is an open-source orchestration framework for building applications on top of large language models. Databricks still leverages LangChain heavily, and the ecosystem is evolving quickly. This post clarifies the Databricks LangChain packages and explains how authentication works.

Databricks & LangChain

LangChain is amazing: a small team that has revolutionized how we use language models. This evolution comes at a price though, and if you are using LangChain with Databricks, there are three key things to remember:

  1. Develop for frequent breaking changes!
  2. Use the MLflow LangChain flavour.
  3. Use the databricks-langchain package.

The langchain-community package is a generic package for third-party provider integrations and components. It used to house a ChatDatabricks interface, along with other providers. But as you can imagine, managing pull requests from hundreds of providers isn't tenable, so the interfaces in langchain-community (e.g. from langchain_community.embeddings import DatabricksEmbeddings) are now deprecated, along with the standalone langchain-databricks package (from langchain_databricks import ChatDatabricks). The pattern moving forward is to use the databricks-langchain package, which bundles the key components: ChatDatabricks, DatabricksEmbeddings, DatabricksVectorSearch, and the Unity Catalog function tools.
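As a quick sketch of the migration, here is a map from the deprecated import paths to their databricks-langchain replacements (check the current package docs for the full component list):

```python
# Migration map: deprecated import path -> supported replacement.
# The deprecated paths are from langchain-community and the standalone
# langchain-databricks package; everything now lives in databricks-langchain.
MIGRATIONS = {
    "langchain_community.embeddings.DatabricksEmbeddings":
        "databricks_langchain.DatabricksEmbeddings",
    "langchain_community.chat_models.ChatDatabricks":
        "databricks_langchain.ChatDatabricks",
    "langchain_databricks.ChatDatabricks":
        "databricks_langchain.ChatDatabricks",
}

for old, new in MIGRATIONS.items():
    print(f"{old}  ->  {new}")
```

In practice this is usually a one-line change: swap the old import for `from databricks_langchain import ChatDatabricks` and the rest of your chain code stays the same.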

Compatibility manifests in several ways: telling MLflow what type of response to expect, enabling tool use within LangChain agents, and defining agent signatures.

It is worth noting here that most LLMs expose either a completions or a chat interface, but the world has pretty much settled on the user, system, and assistant chat-completions format, so it is worth building around that, even if you are just doing a simple query-response framework.
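That chat-completions format boils down to a list of role/content dictionaries. A minimal sketch (the helper name and prompts here are just illustrative):

```python
# Minimal chat-completions payload: the role/content message list that
# system, user, and assistant turns all share.
def build_messages(system_prompt: str, user_query: str, history=None):
    """Assemble a chat-style messages list for a chat-completions endpoint."""
    messages = [{"role": "system", "content": system_prompt}]
    messages.extend(history or [])  # prior user/assistant turns, if any
    messages.append({"role": "user", "content": user_query})
    return messages

msgs = build_messages("You are a terse fortune cookie.", "Will my code break?")
# The endpoint's reply comes back as an assistant turn, which you append
# to the list to continue the conversation.
```

Even a single-shot query-response app benefits from this shape: when you later add history or tools, the payload structure doesn't change.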

Authentication

Several customers were having issues authenticating to the vector store or tools, so let's break this down. First, there are three ways to authenticate in Databricks: OAuth machine-to-machine (M2M), OAuth user-to-machine (U2M), and personal access tokens (PAT).
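The Databricks SDK's unified auth picks each of these methods up from environment variables. A sketch of which variables drive which method (the host and credential values themselves are of course placeholders you supply):

```python
# Environment variables the Databricks SDK reads for each auth method.
AUTH_METHODS = {
    "pat": ["DATABRICKS_HOST", "DATABRICKS_TOKEN"],
    "oauth-m2m": ["DATABRICKS_HOST", "DATABRICKS_CLIENT_ID",
                  "DATABRICKS_CLIENT_SECRET"],
    "oauth-u2m": ["DATABRICKS_HOST"],  # browser-based login, no stored secret
}

for method, env_vars in AUTH_METHODS.items():
    print(f"{method}: {', '.join(env_vars)}")
```

With these set, `WorkspaceClient()` from the databricks-sdk package authenticates without any explicit credential arguments.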

When talking about model serving, there are two ways to authenticate from a Databricks-created serving endpoint to dependent resources (e.g. a vector store): automatic passthrough and manual.

Let’s start with manual authentication. It leverages secrets-based environment variables and can use either PAT or M2M authentication. To use it, you pass a service principal (SP) identity and SP token into the objects making the calls (LangChain or the Databricks SDK). Both should be stored as secrets and passed programmatically. This can be a pain, so Databricks recently shipped automatic authentication passthrough for dependent resources. It is beautiful: under the hood it generates short-lived M2M credentials that ‘just work’. I’d recommend trying it for all your model serving resource needs!
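For the manual pattern, the secrets are injected into the serving endpoint as environment variables using Databricks' {{secrets/scope/key}} reference syntax. A sketch of a served-entity config (the scope, key, model, and host names here are all placeholders):

```python
# Manual auth sketch: a served-entity config that injects SP credentials
# from a secret scope as environment variables. Databricks resolves the
# {{secrets/...}} references at deploy time, so the raw values never
# appear in the config itself.
served_entity = {
    "name": "my-agent",
    "entity_name": "catalog.schema.my_agent_model",
    "entity_version": "1",
    "environment_vars": {
        "DATABRICKS_CLIENT_ID": "{{secrets/creds/sp-client-id}}",
        "DATABRICKS_CLIENT_SECRET": "{{secrets/creds/sp-client-secret}}",
        "DATABRICKS_HOST": "https://my-workspace.cloud.databricks.com",
    },
}
```

Inside the endpoint, the SDK and LangChain clients then pick up these variables via unified auth. With automatic passthrough you skip this block entirely and just declare the dependent resources when logging the model.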

Other Links

Vector Search Best Practices

Databricks Platform Security Recommendations (TLDR: Use OAuth Tokens)

