LangChain - Question Answering using Vector Databases and Similarity Search, Evaluation, Agents
Sarthak Pattnaik
Senior Software Engineer at HCLTech | MS Applied Data Analytics at Boston University
The most interesting utility of LangChain is that, once it is integrated with a large language model, one can use it to extract insights from data the model was never trained on. Such use cases bolster the argument in favour of LangChain and provide evidence of its overwhelming utility. LangChain was created in October 2022 by Harrison Chase, and in only a year it has garnered immense popularity and has become one of the most widely used LLM application frameworks today.
Questions and Answers using LangChain
Combining the capabilities of LLMs with documents that contain personal or proprietary information can be of immense help: it lets one collate answers to questions about the information enmeshed in those documents. The issue we face, however, is that LLMs cannot process arbitrarily large documents in a single pass. To make sure that LLMs have the capacity to work over large documents, we use vector embeddings and vector storage. Vector embeddings convert the contents of a document into a format the language model is able to work with, and the similarity between these vectors determines how alike two distinct pieces of content are. Vector databases are repositories of the vector embeddings extracted from documents. Since these documents can be enormous, they are split into chunks and converted to embeddings, after which they are stored in the vector database. There are myriad ways to split a document, and we must choose the appropriate method for our use case. A few commonly used techniques include recursive splitting (splitting based on characters), token splitting (splitting based on token count), and context-aware splitting (a method that keeps related words or sentences together).

With this mechanism in place, when one wants an answer to a question, the question is converted to an embedding and, based on similarity, the 'k' closest chunks are retrieved from the vector store. These chunks are placed into a system prompt, and the system prompt, along with the question, is passed to the large language model to orchestrate an answer. In the default scenario all the retrieved segments are passed in the same context window. When the documents are too large for that, however, we can leverage techniques like Map Reduce, Refine, and Map Rerank, described below.
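The sketch below illustrates this pipeline end to end with the classic LangChain Python API: loading a document, splitting it into chunks, embedding and storing them in a vector database, and answering a question with a RetrievalQA chain. The file name, model name, chunk sizes, and value of k are illustrative assumptions rather than recommendations.

from langchain.document_loaders import PyPDFLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.embeddings import OpenAIEmbeddings
from langchain.vectorstores import Chroma
from langchain.chat_models import ChatOpenAI
from langchain.chains import RetrievalQA

# 1. Load the document and split it into chunks.
docs = PyPDFLoader("proprietary_report.pdf").load()  # hypothetical document
splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=150)
chunks = splitter.split_documents(docs)

# 2. Convert the chunks to embeddings and store them in a vector database.
embeddings = OpenAIEmbeddings()
vectordb = Chroma.from_documents(chunks, embeddings)

# 3. Pure similarity search: retrieve the k chunks closest to the question.
question = "What were the key findings of the report?"
similar_chunks = vectordb.similarity_search(question, k=4)

# 4. Or let a RetrievalQA chain handle retrieval, prompting, and the LLM call.
llm = ChatOpenAI(model_name="gpt-3.5-turbo", temperature=0)
qa_chain = RetrievalQA.from_chain_type(
    llm=llm,
    chain_type="stuff",  # default: all retrieved chunks go into one context window
    retriever=vectordb.as_retriever(search_kwargs={"k": 4}),
)
print(qa_chain.run(question))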
Map Reduce processes multiple chunks in parallel and curates the final answer by combining the results of the individual LLM calls. It requires a lot of calls, and it treats each document independently, which may not be appropriate in every scenario.
Refine builds upon the answer from the previous document and follows an iterative approach to information retrieval, rather than the parallel approach seen in Map Reduce.
Map Rerank processes the documents in parallel, asks the LLM to assign a relevance score to each answer, and returns the highest-scoring answer as the result. We still use a considerable number of calls in this scenario.
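Switching between these strategies is a matter of changing the chain type. A minimal sketch, reusing the llm, vectordb, and question from the example above (the chain-type strings correspond to Map Reduce, Refine, and Map Rerank respectively):

from langchain.chains import RetrievalQA

for chain_type in ("map_reduce", "refine", "map_rerank"):
    qa = RetrievalQA.from_chain_type(
        llm=llm,
        chain_type=chain_type,
        retriever=vectordb.as_retriever(),
    )
    # Each strategy combines the retrieved chunks differently before answering.
    print(chain_type, "->", qa.run(question))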
LangChain Evaluation
One of the simplest ways to evaluate a LangChain application is to probe the results it generates and check whether they are consistent with the details present in the original documents. However, if we have a plethora of documents, then writing query-result pairs for each of them is an assiduous task. This is where QAGenerateChain helps: it curates a question-answer pair for each document so that we do not have to create them ourselves. Once QAGenerateChain has created the query-result pairs, we can run them through our application and observe whether the output for each pair is consistent with the document. Doing this by hand again becomes tedious, so to ameliorate the painstaking endeavour of manually inspecting each LLM response, we use the model at our disposal to perform the evaluation for us: we incorporate the QAEvalChain functionality in conjunction with the large language model to grade the predicted answers against the generated ones.
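A minimal sketch of that evaluation loop, assuming the llm, chunks, and qa_chain from the retrieval example above; the number of generated examples is arbitrary, and the exact shape of the generated output can differ slightly between LangChain versions:

from langchain.evaluation.qa import QAGenerateChain, QAEvalChain

# 1. Auto-generate a question-answer pair from each of the first few chunks.
gen_chain = QAGenerateChain.from_llm(llm)
raw_pairs = gen_chain.apply_and_parse([{"doc": c.page_content} for c in chunks[:5]])
# Depending on the version, each pair may be nested under a "qa_pairs" key.
examples = [item.get("qa_pairs", item) for item in raw_pairs]

# 2. Run the QA chain over the generated questions to obtain predictions.
predictions = qa_chain.apply(examples)

# 3. Let the LLM grade each prediction against the generated reference answer.
eval_chain = QAEvalChain.from_llm(llm)
graded = eval_chain.evaluate(examples, predictions)
for example, grade in zip(examples, graded):
    print(example["query"], "->", grade)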
LangChain Agents
An underdiscussed functionality of LLMs is their utility in reasoning. An agent in LangChain provides the necessary features to integrate search tools like Wikipedia and DuckDuckGo into its framework, so that the model can peruse the content of these sources to find information relevant to the question posed by the user. LangChain also allows users to create their own user-defined tools and agents.
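Below is a minimal sketch of such an agent using the classic initialize_agent API, combining the built-in Wikipedia and DuckDuckGo search tools with a small user-defined tool; the choice of tools and the reused llm are assumptions for illustration (the "wikipedia" and "ddg-search" tools require the wikipedia and duckduckgo-search packages respectively).

from datetime import date
from langchain.agents import AgentType, initialize_agent, load_tools, tool

@tool
def today(text: str) -> str:
    """Returns today's date; useful for any question about the current date."""
    return str(date.today())

# Built-in search tools plus the custom tool defined above.
tools = load_tools(["wikipedia", "ddg-search"], llm=llm) + [today]

agent = initialize_agent(
    tools,
    llm,
    agent=AgentType.CHAT_ZERO_SHOT_REACT_DESCRIPTION,
    verbose=True,  # print the intermediate reasoning steps the agent takes
)
agent.run("Who created LangChain, and what is today's date?")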