DS Fortune Cookies: LangChain, Agents, and Authentication
“Embrace LangChain's evolution and your spirit will be unbreakable, unlike your code.”
This fortune cookie clarifies some things around LangChain, agents, and authentication. LangChain is an open-source framework for building applications on top of large language models. Databricks still leverages LangChain heavily, and the ecosystem is evolving quickly. This post sorts out the Databricks LangChain packages and explains how authentication works.
Databricks & LangChain
LangChain is amazing - they have a small team and have revolutionized the use of language models. This evolution comes with a price though, and if you are using LangChain with Databricks, there are three key things to remember:
The langchain-community package is a catch-all package for third-party providers and components. It used to house a ChatDatabricks interface, along with other provider integrations. But as you can imagine, managing pull requests from hundreds of providers isn't tenable, so the interfaces in langchain-community (e.g. from langchain_community.embeddings import DatabricksEmbeddings) are now deprecated, along with the langchain-databricks package (from langchain_databricks import ChatDatabricks). The pattern to use moving forward is the databricks-langchain package, which bundles the key Databricks components, including ChatDatabricks and DatabricksEmbeddings.
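As a sketch, the import migration looks like this (assuming the databricks-langchain package is installed; the guard just keeps the snippet importable elsewhere):

```python
# Deprecated patterns (these now emit deprecation warnings):
#   from langchain_community.embeddings import DatabricksEmbeddings
#   from langchain_databricks import ChatDatabricks
#
# Current pattern: import the same components from databricks-langchain.
try:
    from databricks_langchain import ChatDatabricks, DatabricksEmbeddings
except ImportError:
    # databricks-langchain is not installed in this environment.
    ChatDatabricks = DatabricksEmbeddings = None

# Once available, usage is unchanged, e.g. (endpoint name is a placeholder):
#   llm = ChatDatabricks(endpoint="my-chat-endpoint")
```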
Compatibility manifests in several ways: telling MLflow what type of response to expect, enabling tool use within LangChain agents, and defining agent signatures.
It is worth noting here that most LLMs expose either a completions or a chat interface, but the ecosystem has largely settled on chat completions with user, system, and assistant roles, so it is worth building around that convention, even for a simple query-response framework.
Authentication
Several customers have had issues authenticating to a vector store or to tools. Let's break this down. First, there are three ways to authenticate in Databricks: OAuth machine-to-machine (M2M), OAuth user-to-machine (U2M), and personal access tokens (PAT).
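A rough sketch of the three with the Databricks Python SDK (host and credential values are placeholders, and only the PAT client is actually constructed):

```python
try:
    from databricks.sdk import WorkspaceClient

    # PAT: pass a personal access token directly (values are placeholders).
    w = WorkspaceClient(
        host="https://example.cloud.databricks.com",
        token="dapi-example-token",
    )

    # OAuth M2M: pass a service principal's OAuth client id and secret
    # instead; the SDK exchanges them for short-lived access tokens:
    #   WorkspaceClient(host=..., client_id=..., client_secret=...)
    #
    # OAuth U2M: with no explicit credentials, the SDK can walk the
    # interactive (browser-based) OAuth flow on behalf of a user.
except ImportError:
    # databricks-sdk is not installed in this environment.
    WorkspaceClient = None
```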
When talking about model serving, there are two ways to authenticate from a Databricks-created serving endpoint to dependent resources (e.g. a vector store): automatic passthrough and manual.
Let's start with manual authentication, which leverages secrets-based environment variables and can use either PAT or M2M credentials. To use it, you pass a service principal (SP) identity and token into the objects making the calls (LangChain or the Databricks SDK). Both the SP identity and token should be stored as secrets and passed programmatically. This can be a pain, so Databricks recently shipped automatic authentication passthrough for dependent resources. This is beautiful: under the hood it generates short-lived M2M credentials that 'just work'. I'd recommend trying it for all your model serving resource needs!
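A minimal sketch of the manual pattern (scope, key, and host values are illustrative placeholders): the serving endpoint injects secrets as environment variables, and the model code reads them at call time instead of hard-coding credentials.

```python
import os

# In the endpoint config, secrets are mapped to environment variables
# with the {{secrets/<scope>/<key>}} syntax, e.g.:
#   "environment_vars": {
#       "DATABRICKS_HOST":  "{{secrets/my_scope/host}}",
#       "DATABRICKS_TOKEN": "{{secrets/my_scope/sp_token}}"
#   }
# (my_scope, host, and sp_token are placeholder names.)

# Stand-ins so the sketch runs outside a serving endpoint:
os.environ.setdefault("DATABRICKS_HOST", "https://example.cloud.databricks.com")
os.environ.setdefault("DATABRICKS_TOKEN", "example-sp-token")

# Inside the model, read the injected values rather than hard-coding them:
host = os.environ["DATABRICKS_HOST"]
headers = {"Authorization": f"Bearer {os.environ['DATABRICKS_TOKEN']}"}
```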
Other Links
Databricks Platform Security Recommendations (TLDR: Use OAuth Tokens)