LangChain – Essential Concepts – my notes
Ajay Taneja
Senior Data Engineer | Generative AI Engineer at Jaguar Land Rover | Ex - Rolls-Royce | Data Engineering, Data Science, Finite Element Methods Development, Stress Analysis, Fatigue and Fracture Mechanics
1. Introduction

I have been writing a series of blogs on the working of Large Language Models, ChatGPT training and the working of Transformers – all my notes can be found in my articles below, all available on LinkedIn.

This blog is about LangChain, a framework for developing applications powered by Large Language Models. This article is organized as follows:
- In section 2, I formally define the problem, highlighting why LangChain is important; the section also goes into the background of LangChain.
- Section 3 goes into detail on the essential components that make up LangChain.
- Section 4 talks of applications/use cases of LangChain.
- Section 5 goes into a use case involving question-answering over documents using LangChain and provides code snippets.
- Section 6 talks of the concept of Agents in LangChain.
- Section 7 discusses two very interesting research papers: one on Chain-of-Thought prompting and one on ReAct: Synergizing Reasoning and Acting in Language Models.
- Section 8 summarizes the discussions in sections 6 and 7.
- Section 9 presents notebook examples on Agents.
- Section 10 points to my public GitHub repository with the notebooks referenced in the above sections.
2. Problem Definition
Before we start talking about LangChain, let us first define the problem we’re solving.
As an end user of a Large Language Model such as ChatGPT, we normally access it through the browser or through the API. However, whether you use the API or connect through your browser, you are connected only to the training data on which the LLM has been trained – that is:
- you're not connected to the world outside of the training data,
- you're not connected to your own data, i.e., your own documents – you cannot make ChatGPT answer questions from your data, your personal documents or any business documents,
- you cannot ask questions relating to scientific calculations,
- besides, if we talk about ChatGPT, its training data comprises data only up to September 2021; it has no knowledge of the world beyond September 2021.
That is where LangChain comes in. LangChain (which you can install with a simple pip install) connects your AI model (ChatGPT, Hugging Face, Cohere, etc.) to outside sources, and you can extract information from those sources just as you would through ChatGPT in your browser.
Formal definition of LangChain:
LangChain is a framework for developing applications powered by language models.
LangChain makes the complicated parts of working and building with AI models easier. It helps to do this in two ways:
- Integration: bring external data, such as files, other applications, etc., to your LLM.
- Agents: allow LLMs to interact with their environment via reasoning and acting (ReAct). Here, we use the LLM to decide what action to take next (to 'reason').
Background of LangChain:
LangChain was launched in October 2022 as an open-source project by Harrison Chase, whilst working at the machine learning startup Robust Intelligence. The project quickly garnered popularity, with improvements from hundreds of contributors on GitHub, discussions on Twitter, many YouTube tutorials, Medium blogs and meetups in San Francisco and London. In April 2023, the new startup raised over $20 million in funding, with more still coming [https://en.wikipedia.org/wiki/LangChain].
More and more integrations are being introduced through LangChain, and it is extremely popular.
Why LangChain?
1. Components
LangChain comprises several tools that make it very easy to work with Large Language Models – the LLM could be ChatGPT, a Hugging Face LLM, etc. These components are discussed in section 3.
2. Customized Chains:
LangChain provides out-of-the-box support for using "customized chains" – a series of actions strung together. See section 3.2.
3. Speed:
On the qualitative side, a reason LangChain has gained so much popularity is its speed of development – new features and integrations are being added daily, so it's important to be on the latest version.
4. Strong community support:
Meetups, webinars, YouTube tutorials, Medium blogs.
3. Components of LangChain

Let us have an introductory understanding of the components of LangChain, as mentioned above. The components of LangChain include the following:
- Model
- Prompts
- Document Loaders and their utilities
- Memory
- Chains
- Agents
Let us discuss the above in a little detail:
3.1 Document Loaders and Utilities

Document loaders are used to load data from a source as a Document – which, in LangChain's language, is a piece of text and associated metadata. LangChain supports the following document loaders:
- HTML
- CSV
- Markdown
- JSON
Along with the document loaders come the following utilities in LangChain:
Text splitters:
Many times, a document might be too long (like a book) for the LLM – it is then required to split the document into chunks. Text splitters help with this. The most common type of text splitter is the RecursiveCharacterTextSplitter – there are many different text splitters depending upon the use case, and they are covered in the LangChain documentation.
Important parameters to know here are chunkSize and chunkOverlap. chunkSize controls the max size (in terms of number of characters) of the final documents. chunkOverlap specifies how much overlap there should be between chunks. This is often helpful to make sure that the text isn't split awkwardly.
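Here is a minimal sketch of chunking, assuming the classic `langchain` Python package and a hypothetical local file (in the Python API the parameters are snake_case: `chunk_size` and `chunk_overlap`):

```python
from langchain.text_splitter import RecursiveCharacterTextSplitter

# chunk_size caps each chunk at 1,000 characters; chunk_overlap repeats the
# last 100 characters of one chunk at the start of the next, so sentences
# straddling a boundary are not lost.
splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=100)

with open("book.txt") as f:  # hypothetical long document
    chunks = splitter.split_text(f.read())

print(f"{len(chunks)} chunks; first chunk has {len(chunks[0])} characters")
```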
Retrievers:
It should be emphasized at this point that documents are converted into chunks, each chunk is converted into embeddings, and the embeddings are stored in a vector store. Retrievers combine the documents with the Language Models: a retriever takes the prompt, converts it into an embedding, goes to the vector store to find the chunks most similar to the prompt, and supplies those chunks as context so the LLM can answer the question. There are many different types of retrievers, but the most common one is the vector store retriever – it is the most widely supported and aids in the similarity search over the embeddings. A short sketch follows the Vector Store subsection below; more is discussed in section 5.
Vector Store:
We briefly talked about the vector store above. Vector stores are databases to store vectors. The most popular ones are:
- Pinecone
- Weaviate
- FAISS
A vector store can be thought of as a table with embeddings and metadata – it stores the embeddings along with the associated metadata and makes them easily searchable. The LangChain retriever documentation contains details of the supported vector databases.
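To make the vector store / retriever relationship concrete, here is a minimal sketch assuming the classic `langchain` package with the `faiss-cpu` and `openai` packages installed and an OpenAI API key configured; the example texts are illustrative:

```python
from langchain.embeddings import OpenAIEmbeddings
from langchain.vectorstores import FAISS

# Hypothetical chunks; in practice these would come from a text splitter.
texts = [
    "LangChain connects LLMs to external data sources.",
    "FAISS is a library for efficient similarity search over vectors.",
    "Agents use an LLM to decide which action to take next.",
]

# Embed the texts and store them, with metadata, in a FAISS index.
db = FAISS.from_texts(
    texts,
    OpenAIEmbeddings(),
    metadatas=[{"source": f"doc-{i}"} for i in range(len(texts))],
)

# Query the store directly...
print(db.similarity_search("How do agents decide what to do?", k=1))

# ...or through the retriever interface that chains consume.
retriever = db.as_retriever()
print(retriever.get_relevant_documents("What is FAISS?"))
```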
3.2 Chains
Chains help in combining different LLM calls and actions automatically. That is: you give one prompt to the language model, and the output of that call is used as the input to another call, and so on.
Some of the Chains in LangChain include:
- Simple Sequential Chain
- Summarization Chain
Here is an example of a Simple Sequential Chain:
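This is a minimal sketch, assuming the classic `langchain` package and an OpenAI API key; the two prompts are illustrative:

```python
from langchain.llms import OpenAI
from langchain.prompts import PromptTemplate
from langchain.chains import LLMChain, SimpleSequentialChain

llm = OpenAI(temperature=0)

# First call: suggest a dish for a given cuisine.
dish_chain = LLMChain(llm=llm, prompt=PromptTemplate(
    input_variables=["cuisine"],
    template="Name one classic dish from {cuisine} cuisine.",
))

# Second call: turn the first call's output into a short recipe.
recipe_chain = LLMChain(llm=llm, prompt=PromptTemplate(
    input_variables=["dish"],
    template="Write a three-step recipe for {dish}.",
))

# The output of dish_chain is automatically fed as the input of recipe_chain.
overall = SimpleSequentialChain(chains=[dish_chain, recipe_chain], verbose=True)
print(overall.run("Italian"))
```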
Summarization Chain:
Easily run through long or numerous documents and get a summary.
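A hedged sketch of a summarization chain over a hypothetical PDF, again assuming the classic `langchain` package (plus `pypdf`):

```python
from langchain.llms import OpenAI
from langchain.document_loaders import PyPDFLoader
from langchain.chains.summarize import load_summarize_chain

# Load and chunk a hypothetical long PDF, then summarize it with map_reduce:
# each chunk is summarized separately and the partial summaries are combined.
docs = PyPDFLoader("report.pdf").load_and_split()
chain = load_summarize_chain(OpenAI(temperature=0), chain_type="map_reduce")
print(chain.run(docs))
```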
3.3 Memory
Memory can be thought of as remembering the information one chatted about in the past. It is often used for building chatbots. Here is an example of how memory/chat history can be used:
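A minimal sketch using ConversationBufferMemory, assuming the classic `langchain` package and an OpenAI API key:

```python
from langchain.llms import OpenAI
from langchain.chains import ConversationChain
from langchain.memory import ConversationBufferMemory

# ConversationBufferMemory keeps the full chat history and replays it into
# the prompt on every turn, so the model can refer back to earlier messages.
conversation = ConversationChain(
    llm=OpenAI(temperature=0),
    memory=ConversationBufferMemory(),
    verbose=True,
)

conversation.predict(input="Hi, my name is Ajay.")
print(conversation.predict(input="What is my name?"))  # answered from memory
```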
4. Applications / Use Cases and Integrations of LLMs with LangChain
The applications/use cases of such a tool are endless. These could be:
1. Question-answering – question-answering over your document data.
2. Tabular question-answering – lots of data is stored in tabular form, whether it's CSVs, Excel sheets or SQL tables.
3. Summarization of documents.
4. Building your own chatbots powered by an LLM.
5. Connecting to the internet / search / scientific calculations.
Integrations: LLMs [https://js.langchain.com/docs/modules/models/llms/integrations#replicate]
LangChain offers a number of LLM integrations. These being:
- OpenAI
- Azure OpenAI
- Cohere
- HuggingFace LLMs
- … and many more
5. Question Answering Over Documents
Question-answering over your own documents is discussed in the following LangChain user guides:
Following are some of the methods with which one can carry out question-answering over your documents:
a. Using load_qa_chain
load_qa_chain loads a chain with which you can do question-answering over your documents. It uses ALL of the text in the documents.
The process that is happening is illustrated below: load_qa_chain wraps the documents and the question in some text instructing the LLM to use the information from the provided context. The prompt being sent to OpenAI looks something like this:
{context} // the text of the document, e.g. a PDF
Question: {query} // the actual query
This method is good when we have only a short amount of information to send in the context. Most LLMs will have a limit on the amount of information that can be sent in a single request.
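A minimal sketch of this method over a hypothetical short PDF, assuming the classic `langchain` package with `pypdf` and an OpenAI API key:

```python
from langchain.llms import OpenAI
from langchain.document_loaders import PyPDFLoader
from langchain.chains.question_answering import load_qa_chain

# "stuff" puts ALL of the loaded text into a single prompt, so this only
# works while the documents fit within the model's context limit.
docs = PyPDFLoader("short_report.pdf").load()
chain = load_qa_chain(OpenAI(temperature=0), chain_type="stuff")
print(chain.run(input_documents=docs,
                question="What is the main conclusion of the report?"))
```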
b. Using embeddings
Next, what we do is:
- Convert the pdf to Document objects of LangChain.
- Split the documents into chunks.
- Convert the chunks to embeddings.
- Store them in a vector database.
- And then use a retriever, which converts the prompt into an embedding, uses similarity search to retrieve the chunk (from all the chunks) closest to the prompt from the vector store, and then uses that chunk as context for the LLM to answer the question corresponding to the prompt.
This is demonstrated through this notebook:
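In the same spirit, here is a minimal end-to-end sketch of the embeddings approach, assuming the classic `langchain` package with `pypdf`, `faiss-cpu` and an OpenAI API key; the file name and question are illustrative:

```python
from langchain.llms import OpenAI
from langchain.embeddings import OpenAIEmbeddings
from langchain.vectorstores import FAISS
from langchain.document_loaders import PyPDFLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.chains import RetrievalQA

# Steps 1-2: load the PDF as LangChain Documents and split into chunks.
pages = PyPDFLoader("my_document.pdf").load()
splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=100)
chunks = splitter.split_documents(pages)

# Steps 3-4: embed the chunks and store them in a vector database.
db = FAISS.from_documents(chunks, OpenAIEmbeddings())

# Step 5: the retriever embeds the question, finds the closest chunks,
# and the chain passes them to the LLM as context for the answer.
qa = RetrievalQA.from_chain_type(
    llm=OpenAI(temperature=0),
    chain_type="stuff",
    retriever=db.as_retriever(),
)
print(qa.run("What are the key findings of the document?"))
```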
6. Agents in LangChain and the ReAct framework

Let us try and understand what "Agents" are in LangChain – as described in the introductory section, LangChain sits in the middle between the LLM and the external tools. Agents can be thought of as 'bots' which take actions on your behalf. They chain together different actions in LangChain.
Formal definition of Agents:
Agents use an LLM to determine which action to take and in what order. An action can be either using a tool and observing its output, or returning the output to the user directly – if the LLM thinks the observed output is the correct answer.
Parameters when creating an Agent:
The following parameters are required when creating an Agent:
- Tool: a function that performs a particular duty. This can be Google search, a database lookup, or other chains. The interface for a tool is currently a function that takes a string as input and returns a string as output.
- LLM: the language model powering the agent.
- Agent: the agent to use. This should be a string that references a supported agent class.
List of tools supported in LangChain (a sketch of creating an agent follows this list):
i) serpapi: a search engine. Useful when you need to answer questions about current events. Input should be a search query.
ii) wolfram-alpha: a Wolfram Alpha search engine. Useful for scientific calculations.
iii) llm-math: useful when you need to answer questions about mathematics.
iv) google-search: a wrapper around Google search. Useful when you want to answer questions about current events. Input should be a search query.
v) news-api: use this when you want to get information about the top headlines of the current news stories.
vi) etc.
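Putting the three parameters together, here is a minimal sketch of creating an agent, assuming the classic `langchain` package plus the `google-search-results` package and the SERPAPI_API_KEY / OPENAI_API_KEY environment variables:

```python
from langchain.llms import OpenAI
from langchain.agents import load_tools, initialize_agent, AgentType

llm = OpenAI(temperature=0)

# serpapi does web search; llm-math builds a calculator chain on top of the LLM.
tools = load_tools(["serpapi", "llm-math"], llm=llm)

# A zero-shot ReAct agent: the LLM picks a tool, observes its output, and
# repeats until it believes it has the final answer.
agent = initialize_agent(tools, llm,
                         agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION,
                         verbose=True)

agent.run("Who is the current UK prime minister, and what is 2023 minus "
          "the year they were born?")
```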
Before going into the notebook examples relating to Agents, it might be worthwhile to discuss some aspects of the theory behind:
- Chain-Of-Thought Prompting
- ReAct Prompting
This will help in understanding the working of Agents better.
7. Chain-of-Thought Prompting and ReAct: Reasoning and Acting
7.1 Chain-of-Thought Prompting

Now, as an alternative to fine-tuning, we also have new options which can save a lot of the money and time that fine-tuning on your company or domain-specific data requires, using something simpler and more intelligent: Chain-of-Thought prompting and ReAct. Let us see the difference between Chain-of-Thought prompting and standard prompting using the example below. This is taken from the paper published by Google Brain in January 2022, titled:
"Chain-of-Thought Prompting Elicits Reasoning in Large Language Models"
The difference between the two is that in the first case, standard prompting, the exemplar gives just the answer, and then the next question is asked – a question along the same lines as the first. The model failed to answer correctly.
In the second case, we have provided the reasoning as part of the input and then asked the second question. Thus, in this case, the model has been provided with a chain of thought with which to evaluate the question. And that is the beauty of auto-regressive systems – being able to predict future values based on past values. Let us look at the improvement in model performance for PaLM (the Pathways Language Model [https://ai.googleblog.com/2022/04/pathways-language-model-palm-scaling-to.html]), which is a 540B-parameter model:
From the above figure, it is clear that large language models offer the exciting prospect of in-context few-shot learning via prompting. That is, instead of fine tuning a separate language model checkpoint for each new task, one can simply prompt the model with a few input-output exemplars demonstrating the task. Remarkably, this has been successful for a range of simple question-answering tasks.
7.2 ReAct: Synergizing Reasoning and Acting in Language Models

This paper [https://arxiv.org/abs/2210.03629], first released in October 2022 and published at ICLR 2023, studied the synergy between reasoning (chain-of-thought prompting) and acting (e.g., action plan generation). The paper explored the use of LLMs to generate reasoning traces and task-specific actions in an interleaved manner, allowing for greater synergy between the two: reasoning traces help the model induce, track and update action plans, while actions allow it to interface with and gather additional information from external sources such as knowledge bases and environments.
ReAct outperforms reinforcement learning methods by an absolute success rate of ~10% while being prompted with only one or two in-context examples.
The figure below shows the comparison of prompting methods:
a) Standard
b) Chain-of-Thought (CoT – Reason only)
c) Act-only
d) ReAct (Reason + Act)
8. Agents: Summary
With the above brief on Chain-of-Thought prompting and the ReAct framework, let us now throw some light on Agents in LangChain. Agents can be thought of as ‘bots’ which take action on your behalf, they are going to chain together different actions in LangChain.
Agents use the LLM to determine which action to take next and in what order; they use the reasoning power of the LLM. The action can be using a tool and returning its output, or returning the output directly – without using a tool.
Points to be understood in relation to Agents in LangChain:
a. We make the LLM do the reasoning.
b. It is able to do the reasoning based on Chain-of-Thought processing (CoT), as explained above.
c. As mentioned above in section 7.1, by providing some contextual exemplars we're able to make the LLM do the reasoning.
d. The action is decided by the LLM, and the action inputs are provided by the LLM to the tool.
e. The tool returns the observation.
f. And the LLM decides the next course of action.
Using Agents is equivalent to giving the LLM a prompt like the one below:
- Answer the following questions as best you can. You have access to the following tools:
Search: Use this to search the internet.
Calculator: Use this to do the math.
More Tools...
- Use the following format:
Question: the input question you must answer
Thought: You must always think of what to do.
Action: The action to take. It should be one of [{tool_names}]
Action Input: The input to the action
Observation: The result of the action
…(this Thought/Action/Action Input/Observation can repeat N times)
Thought: I now know the final answer
Final Answer: The Final answer to the original question
9. Agents – notebook examples

I. Search example
II. Example with different tools – including an LLM tool
It is also possible to have many different 'tools' as part of the Agent's definition – including a 'tool' which will use the LLM itself for your question. This is useful because one would expect some questions to be answered directly by the appropriate tool (e.g. a math or a search tool) and some directly by the language model based on its training data.
This is demonstrated through the notebook below:
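In the same spirit, here is a hedged sketch of such a multi-tool agent, assuming the classic `langchain` package and an OpenAI API key; the tool name and prompts are illustrative:

```python
from langchain.llms import OpenAI
from langchain.prompts import PromptTemplate
from langchain.chains import LLMChain
from langchain.agents import Tool, load_tools, initialize_agent, AgentType

llm = OpenAI(temperature=0)

# Wrap a plain LLM chain as a tool, so the agent can answer general-knowledge
# questions directly from the model instead of calling an external tool.
llm_tool = Tool(
    name="Language Model",
    func=LLMChain(llm=llm, prompt=PromptTemplate(
        input_variables=["query"], template="{query}")).run,
    description="Use for general reasoning and general-knowledge questions.",
)

tools = load_tools(["llm-math"], llm=llm) + [llm_tool]

agent = initialize_agent(tools, llm,
                         agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION,
                         verbose=True)

agent.run("What is 4.5 raised to the power 2.1?")   # routed to llm-math
agent.run("Explain what a transformer model is.")   # routed to the LLM tool
```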
10. GitHub public repo
I have pushed all the examples that were discussed above to my GitHub repository here: