LangChain: A Primer
Lakshya Prakash Agarwal
Applied AI Engineer | Master's in Analytics @ McGill | ex-Bain & Co. | Databricks Certified Generative AI Engineer | Full-Stack | Machine Learning
We (as humans, collectively) have been building software for decades now, and there are a few established practices that are accepted as the norm. Large Language Models (or LLMs) provide a powerful new avenue to build the next generation of software (sometimes called Software 3.0). However, that does not mean we need to start from scratch. This post is meant to serve as a gentle introduction to LangChain and its components. But before we get there, we need to understand the need for LangChain in the first place.
I expect the reader to have some understanding of how LLM applications work. I will try to avoid jargon as much as possible. In brief,
> you provide a prompt to a model, and it generates a response.
That's the ground-level idea. Every word in the idea above can be tweaked and made more complex, but the core idea remains the same.
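For instance, here's what that ground-level idea looks like with the plain OpenAI client. This is a minimal sketch, not part of the original post: the model name is illustrative, and it assumes the openai package is installed with an API key in your environment.

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# The whole idea in one call: a prompt goes in, a response comes out
response = client.chat.completions.create(
    model="gpt-4o-mini",  # illustrative; any chat model works
    messages=[{"role": "user", "content": "What is LLM orchestration?"}],
)
print(response.choices[0].message.content)
```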
I really like this tweet from Hamel Husain that cuts through the overly technical jargon.
What is [LLM] orchestration?
From Databricks,
> Orchestration is the coordination and management of multiple computer systems, applications and/or services, stringing together multiple tasks in order to execute a larger workflow or process. These processes can consist of multiple tasks that are automated and can involve multiple systems.
In simpler terms, an orchestration engine is like the conductor of an orchestra, with the various components of the system as its musicians. In the context of LLMs, this coordination matters even more: a typical application strings together models, prompts, data sources, and external tools, and something has to keep them all in sync.
For a better understanding, look at this image from a16z:
It's a lot! The key takeaway is that there are multiple components with an orchestration block in the middle that connects everything.
But why can't I write an orchestration engine myself?
You absolutely can, and I recommend you do so for a toy project or a proof of concept. However, if your project needs to be reliable, scalable, and maintainable, then you need to provide parallelization, fallbacks, batching, streaming, async execution, and more such concepts that Software 2.0 has perfected over the years.
This is where tools such as LangChain or Microsoft's Semantic Kernel come in.
What is LangChain?
Here, I'm going to give an overview of LangChain.
As I explained above, LangChain is a framework to manage the different components of an LLM application. The entire stack, as shown on their website, is:
Again, it's a lot, but the key takeaway is that the stack comprises interoperable and interconnected components: langchain-core (the base abstractions and LCEL), langchain-community (third-party integrations), and langchain itself (the chains, agents, and retrieval strategies that make up an application's cognitive architecture).
In addition, LangChain also consists of 2 highly desirable add-ons, which I'm not going to cover in detail here: LangServe, for deploying chains as REST APIs, and LangSmith, for observability (which I touch on in the Evaluation section below).
What is LCEL?
In one line, LCEL (the LangChain Expression Language) is a way to declaratively combine LangChain components to create "chains". For example,
lcel_chain = prompt_template | openai_model | str_output_parser
This one-line chain will take a user input, pass it to the OpenAI model, and then parse the output into a string. This is a very simple example, but you can imagine how complex these chains can get. The same chain can also stream tokens as they are generated, which means that your users can see the output in real time, reducing the time-to-first-token.
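To make that concrete, here is a minimal, self-contained version of the chain above. It's a sketch under a few assumptions: the langchain-openai package is installed, OPENAI_API_KEY is set in your environment, and the model name and prompt are purely illustrative.

```python
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate
from langchain_openai import ChatOpenAI

# The three components from the one-liner above
prompt_template = ChatPromptTemplate.from_template("Tell me a fact about {topic}.")
openai_model = ChatOpenAI(model="gpt-4o-mini")  # illustrative model name
str_output_parser = StrOutputParser()

# LCEL: the | operator pipes each component's output into the next
lcel_chain = prompt_template | openai_model | str_output_parser

# Run it once...
print(lcel_chain.invoke({"topic": "orchestras"}))

# ...or stream tokens as they are generated
for chunk in lcel_chain.stream({"topic": "orchestras"}):
    print(chunk, end="", flush=True)
```

The same chain object also exposes .batch, .ainvoke and .astream for async, and .with_fallbacks(...), which is precisely the batching, streaming, async, and fallback machinery mentioned earlier, for free.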
The 3 methods in LangChain
Based on their website, LangChain has 3 main architectures to build applications that work across the stack: Retrieval, Agents, and Evaluation.
Retrieval
1. What is RAG?
I'm sure you've heard the term "RAG" (retrieval-augmented generation) thrown around multiple times. In non-technical language, it is a way to add context to the prompt you provide to the model so that it generates a more relevant response. This context can come from a document, a website, or any other source of information.
Now, how does the model know which document to look at, and specifically which part of the document to look at? This is where we enter into the realm of embeddings, vectorstores, similarity scores, and more such concepts, which I'll cover in a later post.
Let me explain it with a diagram:
The process is broken down into 4 steps:
1. Ingest: load your documents and index them so they can be searched.
2. Retrieve: given a user's query, find the most relevant pieces of those documents.
3. Augment: add the retrieved context to the prompt.
4. Generate: pass the augmented prompt to the model to produce the response.
That's all there is to RAG. As before, each of the 4 steps above can be tweaked and made more complex, leading to more powerful applications.
2. RAG in LangChain
LangChain offers an extensive library of tools, mostly contributed by the community, and an intuitive framework to compose the components of a RAG system. Let's look at the same example from before, but using the stack:
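Here is a minimal sketch of what such a pipeline can look like. The specific choices are my assumptions for illustration, not the only options: FAISS as the vector store and OpenAI for embeddings and generation, assuming langchain-community, faiss-cpu, and langchain-openai are installed with an API key in the environment.

```python
from langchain_community.vectorstores import FAISS
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.runnables import RunnablePassthrough
from langchain_openai import ChatOpenAI, OpenAIEmbeddings

# Ingest: embed a couple of toy documents into a vector store
vectorstore = FAISS.from_texts(
    ["LangChain composes LLM components.", "LCEL chains support streaming."],
    embedding=OpenAIEmbeddings(),
)
retriever = vectorstore.as_retriever()

def format_docs(docs):
    # Flatten the retrieved Document objects into plain text for the prompt
    return "\n\n".join(doc.page_content for doc in docs)

# Augment: the retrieved context gets slotted into the prompt
prompt = ChatPromptTemplate.from_template(
    "Answer using only this context:\n{context}\n\nQuestion: {question}"
)

# Retrieve + generate, wired together with LCEL
rag_chain = (
    {"context": retriever | format_docs, "question": RunnablePassthrough()}
    | prompt
    | ChatOpenAI(model="gpt-4o-mini")
    | StrOutputParser()
)

print(rag_chain.invoke("What does LCEL support?"))
```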
Notice how the core idea is the same: I'm still ingesting and retrieving documents, augmenting the prompt, and generating a response. The difference is that now I'm using LangChain's abstractions at each step: a vector store and embeddings for ingestion, a retriever for retrieval, a prompt template for augmentation, and a chat model with an output parser for generation, all wired together with LCEL.
The benefit I want to highlight is that everything is connected through LCEL, which means that I can swap out any component with another one that adheres to the LCEL protocol, be it a different vectorstore, different model, different prompt template, different retrieval strategy, different document sources, etc. Having an extensible and interoperable stack unlocks a lot of possibilities.
Agents
So now we've made an application that can augment prompts with information from documents. But what if we want to provide this application with reasoning or planning capabilities, or give it the ability to call external functions like getting the weather or sending an email? This is where Agents come in.
From the LangChain docs:
> Agents are systems that use LLMs as reasoning engines to determine which actions to take and the inputs to pass them. After executing actions, the results can be fed back into the LLM to determine whether more actions are needed, or whether it is okay to finish.
Essentially, agents are chains wrapped with conditional statements and for/while loops, something like the sketch below:
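Here is a deliberately naive caricature of that loop, runnable on its own because the "LLM" and the tool are stubs I made up for illustration; real agent frameworks handle parsing, errors, and state far more carefully.

```python
# A caricature of the agent loop: stubbed LLM, real control flow.

def fake_llm(history: list[str]) -> str:
    """Stand-in reasoning engine: decide the next action from the history."""
    return "FINISH" if history[-1] == "42" else "CALL:calculator"

def calculator(question: str) -> str:
    return "42"  # a very confident calculator

tools = {"calculator": calculator}
history = ["What is 6 * 7?"]

while True:                                    # the for/while loop
    decision = fake_llm(history)               # IF: what should happen next?
    if decision == "FINISH":
        break                                  # END: the model says we're done
    _, tool_name = decision.split(":")         # THEN: execute the chosen action
    history.append(tools[tool_name](history[-1]))

print(history[-1])  # -> 42
```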
While the above isn't exactly true, it does have a grain of truth in it. The good news is that we already have an understanding of how we can create the "THEN" blocks. What's left is to understand how we can create the "IF" and "END" blocks.
Agents in LangChain
LangChain provides the functionality to build agents through LangGraph. At the risk of sounding like a broken record, LangGraph is a way to declaratively combine LangChain components and other functions to create "graphs". It achieves this by providing nodes and edges that can be connected and have a common, shared state.
A key point of difference is that a LangGraph graph is not restricted to being a Directed Acyclic Graph (DAG), meaning that you can have loops (cycles) in your graph.
Let's look at an example application made with LangGraph:
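Before the takeaways, here is a minimal, self-contained sketch of the moving parts (shared state, a node, and a conditional edge that loops back). This is my own illustration, not the tutorial's application, and the node logic is a stand-in for real model calls; it assumes the langgraph package is installed.

```python
from typing import TypedDict
from langgraph.graph import StateGraph, END

class State(TypedDict):
    draft: str
    revisions: int

def write(state: State) -> dict:
    # Node: reads the shared state and returns updates to it
    return {"draft": state["draft"] + " (revised)",
            "revisions": state["revisions"] + 1}

def should_continue(state: State) -> str:
    # Conditional edge: loop back for another pass, or finish
    return "done" if state["revisions"] >= 2 else "revise"

graph = StateGraph(State)
graph.add_node("write", write)
graph.set_entry_point("write")
graph.add_conditional_edges("write", should_continue,
                            {"revise": "write", "done": END})

app = graph.compile()
print(app.invoke({"draft": "Intro", "revisions": 0}))
# -> {'draft': 'Intro (revised) (revised)', 'revisions': 2}
```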
I'll spare you the technical details, which you can pick up from the tutorial linked here.
I think LangGraph is extremely powerful, although conditional edges and state management took me some effort to wrap my head around. I'll keep exploring and share more insights in a future post.
Evaluation
All of what I've described above is actually pretty useless if you don't know how well your application is performing. Moreover, LLMs are inherently non-deterministic, meaning that the same input will not always produce the same output. And by application performance, I mean the performance of each and every component, from the prompt templates to the retrieval strategies to the agent behavior.
Having a proper evaluation framework, such as Metrics-Driven Development, is the cornerstone of building reliable and scalable applications. This can be as simple as measuring the effectiveness of different prompts, or become more complex by measuring the impact of different retrieval strategies on the model's performance, latency, throughput, etc.
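As a toy illustration of that idea, here is the shape of a metrics harness for comparing prompt variants. Everything in it is hypothetical: the dataset is made up and call_chain is a stub standing in for a real chain call (e.g. lcel_chain.invoke); a real setup would use richer metrics and far more data.

```python
# Toy metrics-driven development: score prompt variants on a labeled set.

dataset = [
    {"question": "2 + 2 =", "expected": "4"},
    {"question": "Capital of France?", "expected": "Paris"},
]

def call_chain(prompt_prefix: str, question: str) -> str:
    canned = {"2 + 2 =": "4", "Capital of France?": "Paris"}
    return canned[question]  # stubbed output; swap in a real chain call

def exact_match_rate(prompt_prefix: str) -> float:
    hits = sum(
        call_chain(prompt_prefix, row["question"]) == row["expected"]
        for row in dataset
    )
    return hits / len(dataset)

for prefix in ["Answer briefly:", "Think step by step, then answer:"]:
    print(f"{prefix!r} -> exact match {exact_match_rate(prefix):.0%}")
```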
Evaluation with LangSmith
LangSmith is a one-stop observability platform to develop, collaborate on, test, and monitor LLM applications. What's super cool about LangSmith is that it integrates with all LangChain components out of the box, and can even be configured to work with non-LangChain components.
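For example, here is a minimal sketch of tracing a plain Python function, assuming the langsmith package is installed and the LANGCHAIN_API_KEY and LANGCHAIN_TRACING_V2=true environment variables are set; the function itself is just a stand-in.

```python
from langsmith import traceable

@traceable  # each call now shows up as a traced run in LangSmith
def summarize(text: str) -> str:
    return text[:40] + "..."  # stand-in for a real model call

print(summarize("LangSmith can trace non-LangChain code as well."))
```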
I won't go into the details of LangSmith here as I'm still learning it, but I will link this great tutorial series by Lance Martin on how to use LangSmith.
Conclusion
I hope this post has given you a good overview of LangChain and its components. I've tried to keep it as non-technical as possible! Many thanks to the folks over at AI Makerspace for providing a great refresher through their AI Engineering Bootcamp! I'll be writing more posts on how to use all these components to create powerful applications, so stay tuned!
I'm currently about to graduate from my Master of Management in Analytics program at McGill University, and am actively seeking full-time opportunities in data science and artificial intelligence. Please feel free to reach out via LinkedIn, X, or e-mail ([email protected]) if you have any questions or just want to chat!