Enhancing an Infrastructure Finance Language Model With In-Context Learning: The DSP Framework
This LinkedIn series seeks to spark a conversation among fintech investors, technologists, and infrastructure finance professionals to gauge interest in building and training InfraOptimus, a large language model for infrastructure finance, to benefit sustainable global development. Our vision is to enable greater efficiency and accuracy in data processing and analysis, unlocking new possibilities for natural language processing in infrastructure finance. With your involvement, we can make InfraOptimus a game-changing tool that transforms the way we approach infrastructure finance.
Our article series presents an overview of the proposal to develop and train a large language model called InfraOptimus, specifically tailored for the infrastructure finance industry. The first article offers a high-level overview, while the second article introduces the theoretical InfraOptimus model and emphasizes its significance for the industry. In the third article, we delve into how BloombergGPT impacts the development of InfraOptimus. The fourth article, below, examines how the Demonstrate-Search-Predict (DSP) framework is expected to influence InfraOptimus's progress. The fifth article explores the anticipated use cases of InfraOptimus in addressing common challenges in infrastructure finance, providing hypothetical examples and real-world scenarios.
This article offers three levels of detail to suit your needs: an executive summary, a condensed version, and a deep dive.
EXECUTIVE SUMMARY
In-context learning will empower InfraOptimus, the proposed language model for infrastructure finance, to gain insights from past successes, adapt to diverse projects, and make informed decisions. It will leverage pretrained components, retrieving pertinent information from extensive datasets to remain up-to-date with industry trends. InfraOptimus will integrate existing knowledge, such as historical data and financial models, using retrieval augmentation. This will enhance the accuracy of its recommendations, aligning them with proven strategies. The Demonstrate-Search-Predict (DSP) framework, combined with in-context learning, will simplify development by allowing experts to express task-specific strategies as short programs, reducing overheads and facilitating rapid iterations.
The DSP framework, pioneered by computer scientists at Stanford University, presents a groundbreaking approach to AI system development. It combines in-context learning, retrieval augmentation, and composability to improve the construction of AI systems. In-context learning allows complex systems to be built from natural language instructions, facilitating communication between pretrained components. Retrieval augmentation enhances the system's access to knowledge, improving performance in knowledge-intensive tasks. Composability enables flexible component combinations, promoting rapid prototyping and effective use of pretrained components.
The DSP framework democratizes AI system development by simplifying the process and enabling broader participation. It reduces the barriers to entry by leveraging pretrained models and natural language instructions. This allows a wider range of developers and domain experts to contribute to AI system creation. The framework maximizes the value of pretrained components by providing a structured integration approach, incorporating pretrained language models and retrieval models efficiently.
Implemented as a Python library, the DSP framework provides core data types and functions for building programs that facilitate interactions between the language model and the retrieval model. DSP introduces primitives like DEMONSTRATE, SEARCH, and PREDICT, defining transformations in the program. These transformations can invoke each other and pass text data between them, and they do not involve backpropagation. Developers can define desired behaviors through demonstrations and annotations, guiding the language model in multi-hop tasks and enhancing its performance.
The evaluation section focuses on implementing and assessing DSP programs for three NLP tasks: open-domain QA, multi-hop QA, and conversational QA. DSP programs utilize pretrained modules such as ColBERTv2 and GPT-3.5. Various techniques, including greedy decoding and sampling, are employed for generating predictions. Evaluation consistently shows that DSP programs outperform baselines, achieving significant gains in accuracy. The coordination between the language model and retrieval model, along with unique structures and transformations in each task, demonstrates the effectiveness of DSP in NLP tasks.
Overall, the DSP framework offers a simpler, accessible approach to building complex AI systems. By leveraging in-context learning, retrieval augmentation, and composability, DSP simplifies AI system development, encourages broader participation, and achieves notable performance improvements. It will significantly impact InfraOptimus's development and architecture, providing a powerful tool for creating advanced AI systems through its user-friendly Python library and demonstrated success in various NLP tasks.
If you're interested in learning more about the potential use cases of InfraOptimus and the infrastructure finance DSP program, stay tuned for the next article. Please share your thoughts on this article below or in DM. Follow Infrafintech Intelligence on LinkedIn.
CONDENSED VERSION
The Demonstrate-Search-Predict (DSP) framework, pioneered by computer scientists at Stanford University, introduces a groundbreaking approach to AI system development. By combining in-context learning, retrieval augmentation, and composability, DSP significantly improves the way AI systems are built. In-context learning enables the construction of complex systems using natural language instructions, fostering communication between pretrained components. Retrieval augmentation integrates retrieval models, enabling the retrieval of relevant information from large datasets. Composability allows for the flexible combination of components to create rich and sophisticated pipelines. DSP democratizes AI system development, streamlines prototyping, and maximizes the value of specialized pretrained components, leading to significant advancements and state-of-the-art results in knowledge-intensive tasks.
In a DSP program, the language model (LM) and retrieval model (RM) are two crucial components. The frozen LM generates or scores text based on instructions and examples, playing a vital role in answering questions and generating queries. The RM retrieves relevant text sequences for queries, estimating their relevance or similarity. It searches pre-defined passages during the search stage and within demonstrations, helping the LM adapt to the task. The frozen state means both models are pretrained and their parameters remain fixed during program execution. While frozen models provide a strong foundation, they lack adaptability to new domains, potentially resulting in suboptimal performance and outdated information. DSP programs leverage the communication between the LM and RM to achieve accurate answers and adaptability within defined constraints.
In an infrastructure finance DSP program, InfraOptimus and an RM work together to provide investment recommendations for water treatment plants. The RM has an important role: retrieving nearest-neighbor demonstrations from the training data. These demonstrations represent successful investment recommendations in the same domain. During the prediction stage, InfraOptimus generates potential recommendations, while the RM ensures their alignment with past successes. It selects well-grounded sequences by comparing them to the retrieved demonstrations. By leveraging the RM's functions, the program improves the quality and reliability of its recommendations, drawing on proven strategies from similar contexts in infrastructure finance. This example showcases the value of incorporating the RM's additional capabilities in enhancing predictions for investment decisions.
The DSP framework, implemented in Python, provides core data types and functions for building programs that facilitate interactions between the LM and RM. Code snippets illustrate the framework's capabilities. DSP programs operate on Examples, which are like dictionaries containing fields for multi-hop questions and short answers. Training examples are used to learn question-answering strategies, even without explicit labeling of intermediate steps. A complete DSP program takes a question, creates an Example, and assigns the training set. DSP primitives, such as DEMONSTRATE, SEARCH, and PREDICT, define transformations in the program. Transformations can invoke each other and pass text data, and they don't involve backpropagation. The program's transformations are categorized into stages: DEMONSTRATE, SEARCH, and PREDICT.
In the DSP framework, including examples of desired LM behavior improves performance. Demonstrations and prepared training examples illustrate specific behaviors expected from the LM. The DEMONSTRATE transformation generates demonstrations by selecting training examples and adding new fields. It programmatically adds annotations for intermediate transformations, guiding the LM in multi-hop tasks. The annotate primitive applies user-defined transformations, caching predictions as successful demonstrations for pipeline transformations. DEMONSTRATE enables exploring complex strategies without custom annotations for each step, allowing for building pipelines through the composition of small transformations. The framework offers primitives for selecting subsets of training examples, facilitating the development of larger strategies. Overall, DSP enables complex strategy creation without hand-labeling intermediate transformations, promoting modularity and flexibility in model development.
In infrastructure finance, the DSP framework can improve project feasibility analysis with InfraOptimus. Including examples of desired behavior in prompts enhances InfraOptimus's performance. During DEMONSTRATE, an Example representing a project scenario generates demonstrations, illustrating how InfraOptimus breaks down questions, gathers information, and analyzes feasibility. The annotate primitive refines training examples by caching predictions as successful demonstrations. DEMONSTRATE allows the exploration of complex strategies without annotating every step. DSP's primitives, like sample, knn, and crossval, select subsets of training examples. Combinations of these primitives facilitate larger strategies. DSP's modularity eliminates the need for manual labeling, empowering developers to adapt program strategies and improve efficiency in infrastructure finance analysis with InfraOptimus.
The SEARCH stage in the DSP framework collects relevant passages to support the LM's transformations. It leverages a large knowledge corpus divided into text passages, enabling factual responses, updatable knowledge, and transparency. Simple scenarios use direct retrieval from a model for top-k passages. Complex tasks require advanced SEARCH strategies for multi-hop reasoning and conversational challenges. DSP's automatic annotations from DEMONSTRATE enable the exploration of strategies by exchanging queries, passages, and demonstrations between the RM and LM. DSP offers modularity, automatic prompt updates, and better control flow compared to techniques like self-ask. It incorporates fusion techniques for improved recall and robustness and allows compositions and extensions for various NLP tasks, such as conversational multi-hop search or spell correction. Overall, DSP enriches the SEARCH stage with sophisticated strategies, fusion, and flexibility for enhanced NLP tasks.
In the context of a DSP program for infrastructure finance, the SEARCH stage plays a crucial role in gathering relevant passages from a large knowledge corpus, enabling the retrieval of specific information to support the transformations conducted by the LM. It allows for factual responses, facilitates knowledge updating without retraining, and ensures transparency in information sources. In simple scenarios, the SEARCH stage directly queries the RM for top-k passages that match a given question. For more complex tasks involving multi-hop reasoning or conversational challenges, advanced strategies are employed to retrieve passages that address the query in a multi-step manner. Fusion techniques improve recall and robustness by combining retrieval results, while compositions and extensions offer customization for specific NLP tasks. Overall, the SEARCH stage in the DSP framework enables effective information retrieval and transformation for infrastructure finance and other domains.
The PREDICT stage in DSP focuses on generating reliable system output by utilizing demonstrations, passages, and candidate predictions. The generate primitive leverages the LM to produce one or more predictions for the end-task, while the rank primitive uses the RM to determine relevance scores. DSP offers strategies for selecting and aggregating predictions, such as choosing the most popular prediction among candidates and allowing multiple pipelines of transformations. For a small number of demonstrations or passages, they can be concatenated into the prompt, while larger sets can be processed in parallel or sequentially with aggregation methods. DSP's aggregation strategies enhance the reliability of the PREDICT stage for generating candidate predictions in various scenarios.
The PREDICT stage in the DSP program for infrastructure finance can focus on determining the financial viability of a toll road project connecting a city to an airport. It utilizes the LM and RM to generate the system output by aggregating information from demonstrations, passages, and candidate predictions. The generate primitive produces multiple predictions by querying the LM based on a template and example, while the rank primitive assesses the relevance scores using the RM. Strategies for selecting predictions based on financial indicators, such as debt service coverage ratio (DSCR), project life coverage ratio (PLCR), or loan life cover ratio (LLCR), are explored. Parallel processing and aggregation methods handle a large amount of data, enhancing the reliability of predictions for the toll road project's financial viability.
The evaluation section focuses on implementing and assessing DSP programs for three NLP tasks: open-domain QA, multi-hop QA, and conversational QA. The authors compared the effectiveness of DSP programs to other methods using specific datasets. DSP programs utilized pretrained modules, including ColBERTv2 as the retriever module (RM) and GPT-3.5 as the language model module (LM). Various techniques were employed for generating predictions. The evaluation results consistently showed that DSP programs outperformed the baselines across all tasks, achieving significant gains in accuracy. The coordination between the LM and RM, along with the unique structures and transformations in each task, demonstrated the effectiveness of DSP in these NLP tasks.
The paper introduces the DEMONSTRATE–SEARCH–PREDICT (DSP) framework, which facilitates retrieval augmented in-context learning and represents a departure from traditional AI model construction. The framework offers a simpler, more accessible approach to building complex AI systems through natural language instructions and pretrained models. It enables rapid prototyping, leverages specialized pretrained components, and promotes broader participation in system development. Implemented as a Python library, DSP has shown significant performance improvements in various tasks. Beyond performance, DSP's contribution lies in uncovering new conceptual possibilities for in-context learning, which will have a profound impact on InfraOptimus's development and architecture, revolutionizing AI system design.
If you're interested in learning more about the potential use cases of InfraOptimus and the infrastructure finance DSP program, stay tuned for the next article. Please share your thoughts on this article below or in DM. Follow Infrafintech Intelligence on LinkedIn.
DEEP DIVE
Infrastructure finance involves complex decision-making that necessitates a deep understanding of factors like project feasibility, financial viability, and risk assessment. In-context learning allows InfraOptimus, the proposed large language model for infrastructure finance, to learn from past successful investment recommendations and apply that knowledge to generate accurate insights for new scenarios. By leveraging in-context learning, InfraOptimus can grasp the context-specific nuances of infrastructure finance and make informed decisions.
Infrastructure projects can vary significantly in location, type, and financing structures. In-context learning enables InfraOptimus to quickly adapt and learn in new domains without extensive manual reprogramming or retraining. It leverages pretrained components and retrieves relevant information from large datasets to stay updated with industry trends, regulations, and best practices.
There is a wealth of existing knowledge in infrastructure finance, including historical project data, financial models, and industry reports. In-context learning integrates this knowledge into the AI system by leveraging retrieval augmentation. InfraOptimus can access and incorporate relevant information from large knowledge corpora, improving the accuracy and reliability of its recommendations. It learns from past successful investment strategies, aligning its decisions with proven approaches in the field.
The Demonstrate-Search-Predict (DSP) framework, combined with in-context learning, streamlines the prototyping and development process for the AI system. Machine learning experts and domain specialists can collaborate and build grounded AI systems at a high level of abstraction. Task-aware strategies can be expressed as short programs using composable operators, reducing InfraOptimus's development and deployment overheads and enabling faster iterations and improvements.
The traditional approach to building AI models has primarily focused on multiplying tensor representations, especially during the era of deep learning. Tensors are multi-dimensional arrays used in deep learning to represent data, such as images or text. This approach has resulted in modular designs that enable rapid development and exploration. However, these design paradigms require extensive domain expertise, and even experts face challenges when integrating different pretrained components into larger systems.
In-context learning presents a new opportunity by allowing the construction of complex systems solely based on natural language instructions, facilitating communication between pretrained components. Pretrained models act as building blocks, and natural language instructions and operations on text form the core operations. Realizing the potential of in-context learning will democratize AI system development, streamline prototyping for new domains, and maximize the value of specialized pretrained components.
Computer scientists from Stanford University have introduced the DSP framework to enhance in-context learning. It incorporates retrieval augmentation into the in-context learning framework by integrating retrieval models, which retrieve relevant information from large datasets. DSP consists of simple and composable functions that enable the implementation of deliberate programs for solving knowledge-intensive tasks, rather than relying solely on end-task prompts. An example of an end-task prompt from infrastructure finance could be: "Predict the credit rating of a proposed transportation project company based on its financial indicators and risk factors."
Language models are AI models that can understand and generate human language, while retrieval models are designed to retrieve relevant information from large datasets. In this framework, the language model (LM) and retrieval model (RM) work together to retrieve specific passages of information and generate meaningful outputs, without requiring hand-labeled examples of intermediate steps.
By combining techniques from retrieval-augmented natural language processing (NLP) and in-context learning, the DSP framework enables machine learning experts and domain specialists to develop AI systems with reduced deployment overheads (the resources needed to implement and use the system) and annotation costs (the effort and time required to label data for training).
Overall, the contributions of this work include advocating for task-aware strategies in in-context learning, introducing the DSP framework as a means to express these strategies, demonstrating the power of composability in building complex pipelines, and achieving state-of-the-art results for knowledge-intensive tasks through rich DSP programs.
DSP program: The language model generates text based on instructions, and the retrieval model retrieves relevant sequences. Pretrained models aid accuracy, but adaptability is limited.
In a DSP program, there are two main components: the language model (LM) and the retrieval model (RM). The LM is a frozen language model that generates or scores text based on specific instructions and examples. It is used to answer questions or generate queries by adapting the prompt given to it. The LM, in our case, InfraOptimus, the proposed large language model for infrastructure finance, plays a crucial role in the program by generating the final answer to a question, intermediate queries to find useful information, and exemplar queries to illustrate how to produce queries for training purposes.
On the other hand, the RM is a frozen retrieval model that retrieves the most relevant text sequences for a given query. It can search through a large collection of pre-defined passages and estimate the relevance or similarity of a text sequence to a query. The RM is responsible for retrieving passages during the search stage and within demonstrations. Its purpose in demonstrations is to help the LM adapt to the domain and task rather than providing directly relevant information to the input question.
Although not used in this particular example, the RM has additional functions in DSP programs. It can retrieve nearest-neighbor demonstrations from training data and select well-grounded generated sequences from the LM during the prediction stage.
A frozen LM means that the language model is pretrained and its parameters are fixed or "frozen" during the execution of the DSP program. Pretraining involves training the LM on a large dataset to learn language patterns, grammar, and general knowledge. Once the LM is trained, its parameters are saved and remain unchanged when using it in a DSP program. The frozen LM is used to generate or score text based on the provided prompts and instructions.
Similarly, a frozen RM means that the retrieval model is also pretrained and its parameters are fixed or "frozen" during the execution of the DSP program. The retrieval model is trained to index and retrieve relevant passages or text sequences based on queries. Once trained, the parameters of the RM are saved and remain unchanged during the DSP program. The frozen RM is responsible for retrieving the most relevant text sequences for a given query.
By using frozen LM and frozen RM, the DSP program benefits from the knowledge and expertise learned during the pretraining phase. These pretrained models provide a strong foundation for the program to generate accurate answers, retrieve relevant information, and adapt to the given task.
The downside of using isolated frozen language and retrieval models is that they are limited by the knowledge and expertise learned during the pretraining phase. These standalone models may not be able to adapt well to new or specific domains or tasks. They rely solely on the information and patterns they learned from the pretraining data and cannot be fine-tuned or updated during the execution of a program. This lack of adaptability can result in suboptimal performance or inaccuracies when dealing with novel or specialized contexts. Additionally, frozen models may not capture the most up-to-date information or trends, as they are fixed at a certain point in time and do not incorporate real-time updates.
In summary, a DSP program defines how the LM and RM communicate with each other. The LM generates text based on instructions and examples, while the RM retrieves relevant text sequences for queries. Together, they enable the DSP program to generate accurate answers, find useful information, and adapt to the domain and task at hand.
Example of a DSP program investigating a portfolio of water treatment plants
Suppose you are working on a DSP program in infrastructure finance that aims to provide investment recommendations for a portfolio of water treatment plants. The program utilizes InfraOptimus and an RM.
In this scenario, the RM has an additional function of retrieving nearest-neighbor demonstrations from the training data. These demonstrations consist of successful examples of investment recommendations from previous water treatment assets in the infrastructure finance domain. The RM analyzes the queries or prompts provided by the LM and retrieves the most similar or relevant demonstrations from the training dataset.
During the prediction stage, InfraOptimus generates potential investment recommendations based on the given inputs and instructions. However, the RM comes into play to ensure the generated recommendations are well-grounded and aligned with successful past examples. It selects well-grounded generated sequences from InfraOptimus by comparing them to the nearest-neighbor demonstrations retrieved earlier.
By leveraging the RM's ability to retrieve nearest-neighbor demonstrations and selecting well-grounded generated sequences, the DSP program can enhance the quality and reliability of its investment recommendations. The RM helps validate and refine the recommendations by comparing them to real-world successful examples of water treatment portfolios, ensuring that the program suggests investment strategies that have proven effective in similar contexts.
This hypothetical example demonstrates how the RM's additional functions in a DSP program can contribute to making more informed and reliable predictions in infrastructure finance. By utilizing nearest-neighbor demonstrations and selecting well-grounded sequences from the LM, the program benefits from the wisdom and success of past investment recommendations, improving the overall accuracy and confidence of its predictions.
DSP framework in Python enables complex interactions between the LM and RM. Program learns strategies from training examples to answer questions, using DEMONSTRATE, SEARCH, and PREDICT transformations.
The researchers have implemented the DSP framework in Python. They explain the core data types and functions provided by the framework and showcase code snippets to demonstrate its capabilities. The DSP framework allows for complex interactions between the LM and RM in simple programs.
To perform a task, a DSP program works with instances of the Example datatype. An Example is similar to a Python dictionary with multiple fields. The program is given a few training examples, each containing a multi-hop question and its short answer. The program uses these examples to learn strategies for answering questions by following intermediate steps, even if those steps are not explicitly labeled in the training data.
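To make this concrete, here is a minimal, self-contained sketch in Python of what such an Example datatype could look like. This is not the DSP library's actual class; the field names (question, answer, demos, context) and the training questions are hypothetical, chosen only to illustrate the idea of a dictionary-like record that later stages enrich.

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class Example:
    question: str                                     # the (multi-hop) question
    answer: Optional[str] = None                      # short answer, known for training examples
    demos: list = field(default_factory=list)         # demonstrations added by DEMONSTRATE
    context: List[str] = field(default_factory=list)  # passages added by SEARCH

# A tiny training set: each record pairs a question with its short answer.
# The wording is purely illustrative.
train = [
    Example(question="Which sponsor refinanced the toll bridge concession in 2019?",
            answer="Sponsor A"),
    Example(question="Which contract structure backs the desalination plant's revenue?",
            answer="a take-or-pay offtake agreement"),
]
```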
A complete DSP program takes an input question and produces a short answer. The program creates an Example for the question and assigns the training set to it. The program uses built-in functions called DSP primitives to define the DEMONSTRATE, SEARCH, and PREDICT transformations. These primitives are used to specify the steps involved in a DSP program.
Transformations are functions that take an Example as input and modify it by adding or modifying fields. The program uses three developer-defined transformations: multihop_demonstrate, multihop_search, and multihop_predict. Transformations can invoke other transformations, similar to layers in deep neural network programming. They pass text data between each other and do not involve backpropagation.
The transformations in DSP are categorized into three stages: DEMONSTRATE, SEARCH, and PREDICT. Each stage serves a specific purpose in the program. However, it is also possible to create functions that blend these stages together. The next sections will discuss each of the three stages in more detail.
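The skeleton below sketches how such a three-stage program could be wired together, reusing the illustrative Example record from above. The transformation names mirror those mentioned in the text, but the bodies are simplified stand-ins: lm() and rm() are hypothetical callables representing the frozen language model and retrieval model, not the DSP library's actual API.

```python
def multihop_demonstrate(x: Example, train) -> Example:
    # DEMONSTRATE: attach a handful of training examples as demonstrations.
    # A fuller program would bootstrap their intermediate steps rather than
    # copying them verbatim (see the annotate discussion below).
    x.demos = train[:3]
    return x

def multihop_search(x: Example, rm, hops: int = 2, k: int = 2) -> Example:
    # SEARCH: collect passages over several hops. rm(query, k) stands in for
    # the frozen retrieval model and returns a list of passage strings.
    query = x.question
    for _ in range(hops):
        x.context += rm(query, k)
        # Naive follow-up query; a real strategy would let the LM write it.
        query = x.question + " " + " ".join(x.context[-k:])
    return x

def multihop_predict(x: Example, lm) -> str:
    # PREDICT: prompt the frozen LM with demonstrations, passages, and the question.
    demo_lines = "\n".join(f"Q: {d.question}\nA: {d.answer}" for d in x.demos)
    prompt = (demo_lines + "\nContext: " + " | ".join(x.context)
              + f"\nQ: {x.question}\nA:")
    return lm(prompt)

def multihop_qa(question: str, train, lm, rm) -> str:
    x = Example(question=question)
    x = multihop_demonstrate(x, train)
    x = multihop_search(x, rm)
    return multihop_predict(x, lm)
```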
Behavior examples in prompt improve LM. DEMONSTRATE stage adds annotations, enabling complex strategies without labeling intermediate steps.
Including examples of desired behavior from the LM in its prompt improves performance. In the DSP framework, a demonstration is a prepared training example that illustrates specific behaviors expected from the LM. The DEMONSTRATE transformation takes an Example as input and generates a list of demonstrations, often by selecting a subset of training examples and adding new fields to them.
The DEMONSTRATE stage enhances training examples by programmatically adding annotations for intermediate transformations. In the "multi-hop" example, demonstrations show how to break down a question, ask follow-up questions, and use gathered information to answer complex questions.
The annotate primitive applies a user-defined transformation to a list of training examples, caching the intermediate predictions. These predictions serve as successful demonstrations for pipeline transformations. The annotate primitive is similar to a specialized map function, leveraging the LM and RM to bootstrap annotations for the full pipeline from end-task labels.
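A rough sketch of that bootstrapping idea, under the assumption that the pipeline can be run end to end on a labeled training example: keep an example as a demonstration only when the pipeline reproduces its known answer, so its intermediate steps can be trusted. The helper names and the exact matching rule are hypothetical simplifications, not the annotate primitive's real implementation.

```python
def bootstrap_demonstrations(train, run_pipeline, max_demos: int = 4):
    """Keep a training example as a demonstration only when the pipeline
    reproduces its known short answer. run_pipeline(example) is assumed to
    return (annotated_example, predicted_answer), where the annotated copy
    carries the intermediate queries and passages it used."""
    demos = []
    for ex in train:
        annotated, predicted = run_pipeline(ex)
        if predicted.strip().lower() == ex.answer.strip().lower():
            demos.append(annotated)          # its intermediate steps are trusted
        if len(demos) >= max_demos:
            break
    return demos
```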
DEMONSTRATE allows us to explore complex strategies in SEARCH and PREDICT without creating examples for every transformation. This is different from traditional methods that require custom annotations for each step. DSP enables building complex pipelines without labels for intermediate steps by composing small transformations.
The DSP framework provides three primitives for selecting subsets of training examples: sample, knn, and crossval. Sample randomly selects k demonstrations from the training set. Knn selects the k nearest neighbors to the input text based on the RM's representations. Crossval selects from multiple sampled sets of demonstrations.
By combining these selection and bootstrapping primitives, larger strategies can be developed. For very large training sets, knn can be used to incrementally learn from the nearest examples in real-time. For moderately large sets, cross-validation can evaluate prompts on each training example and inform the system's adaptiveness at test time.
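The snippet below sketches what sample and a knn-style selector could look like over the illustrative training records used earlier. It assumes an embed() callable standing in for the RM's encoder; the plain-Python cosine-similarity scoring is a simplification rather than the DSP library's own knn primitive, and crossval is omitted for brevity.

```python
import math
import random

def sample(train, k: int):
    # Randomly pick k demonstrations from the training set.
    return random.sample(train, min(k, len(train)))

def knn(train, question: str, embed, k: int):
    # Pick the k training examples whose questions are closest to the input
    # question under embed(), a placeholder for the RM's encoder.
    def cosine(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
        return dot / norm if norm else 0.0
    q_vec = embed(question)
    scored = [(cosine(q_vec, embed(ex.question)), ex) for ex in train]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [ex for _, ex in scored[:k]]
```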
Overall, the DSP framework allows for the creation of complex strategies without hand-labeling intermediate transformations. It introduces modularity, and developers can swap training domains, update examples, and modify program strategies while automatically populating intermediate fields for demonstrations.
The DEMONSTRATE stage using project feasibility as an example
In the context of infrastructure finance, let's consider an example of using the DSP framework to improve performance in analyzing project feasibility. To enhance InfraOptimus's performance, we can include examples of desired behavior in its prompt. In the DSP framework, a demonstration refers to a prepared training example that illustrates specific behaviors we expect from InfraOptimus.
During the DEMONSTRATE transformation stage, we take an Example, which represents a project scenario, as input and generate a list of demonstrations. These demonstrations showcase how InfraOptimus should break down a complex financial question, ask follow-up questions to gather relevant information, and ultimately provide an accurate analysis of the project's feasibility.
To further improve the training examples, we can apply the annotate primitive, which uses a user-defined transformation to the list of training examples. This transformation caches intermediate predictions and serves as a successful demonstration for subsequent pipeline transformations. The annotate primitive leverages both InfraOptimus and a retrieval model to generate annotations for the full pipeline based on end-task labels.
One advantage of using the DEMONSTRATE stage is that it allows us to explore complex strategies in the SEARCH and PREDICT stages without creating examples for every transformation. This differs from traditional methods that require custom annotations for each step. By composing small transformations, the DSP framework enables the construction of complex pipelines without the need for labeling intermediate steps.
The DSP framework provides three primitives for selecting subsets of training examples: sample, knn, and crossval. The sample primitive randomly selects a subset of demonstrations from the training set. The knn primitive selects the k nearest neighbors to the input text based on the retrieval model's representations. Cross-validation selects from multiple sampled sets of demonstrations, allowing for a more comprehensive evaluation.
By combining these selection and bootstrapping primitives, developers can develop larger strategies. For very large training sets, the knn primitive can be used to incrementally learn from the nearest examples in real-time. For moderately large sets, cross-validation can evaluate prompts on each training example and inform the system's adaptiveness during testing.
Overall, the DSP framework enables the creation of complex strategies in infrastructure finance without the need for manual labeling of intermediate transformations. It introduces modularity, allowing developers to swap training domains, update examples, and modify program strategies while automatically populating intermediate fields for demonstrations. This enhances the efficiency and effectiveness of analyzing project feasibility in infrastructure finance using InfraOptimus and the DSP framework.
SEARCH in DSP gathers passages to support LM transformations. It enables complex strategies, fusion techniques, and compositions.
The SEARCH stage in the DSP framework gathers passages to support the transformations conducted by the LM. It assumes a large knowledge corpus, like web snippets, Wikipedia, or arXiv, divided into text passages. This facilitates factual responses, allows for updating the knowledge store without retraining, and provides transparency. In simple scenarios, SEARCH can directly query a retrieval model (RM) for the top-k passages that match a question.
In more complex tasks, sophisticated SEARCH strategies are necessary to empower the RM in finding relevant passages. Examples that require multi-hop reasoning or conversational challenges demand advanced strategies. Previous research in retrieval-augmented natural language processing (NLP) has explored multi-hop and conversational search pipelines, often relying on hand-labeled query rewrites, decompositions, or target hops. With automatic annotations from DEMONSTRATE, SEARCH in DSP can simulate and explore various strategies by passing queries, passages, and demonstrations between the RM and LM.
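One way to picture such a multi-hop strategy is the loop below: at each hop, the language model writes a follow-up query conditioned on what has been retrieved so far, and the retrieval model answers it. This is an illustrative sketch only; lm() and rm() are hypothetical placeholders, and the prompt wording is invented for the example.

```python
def multihop_search_with_lm(question: str, lm, rm, hops: int = 2, k: int = 3):
    # At each hop: retrieve passages for the current query, then ask the LM
    # for the next query it needs, conditioned on everything gathered so far.
    passages, query = [], question
    for _ in range(hops):
        passages += rm(query, k)
        prompt = ("Context: " + " | ".join(passages)
                  + f"\nOriginal question: {question}"
                  + "\nWrite the next search query needed to answer it:")
        query = lm(prompt)
    return passages
```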
A comparison is made between DSP's multi-hop program and the recent "self-ask" prompting technique. Self-ask is a simplified instantiation of DSP's SEARCH stage, where the LM asks follow-up questions that are sent to a search engine. DSP, as a general framework, can express ideas like self-ask and more sophisticated pipelines but offers advantages like modularity, automatic prompt updates, and better control flow. DSP programs are developed without labeling intermediate transformations and avoid the "self-distraction" problem encountered in self-ask.
To improve recall and robustness, retrieval results can be fused across multiple generated queries. Fusion, a technique in information retrieval, combines multiple retrieval lists into one. DSP includes a fused_retrieval primitive to facilitate this fusion process.
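The sketch below shows one simple way such fusion could work, using reciprocal-rank-style scoring so that passages appearing near the top of several retrieval lists rise in the combined ranking. It is an assumption-laden stand-in for the fused_retrieval primitive, not its actual implementation; rm() again represents the retrieval model.

```python
def fused_retrieval(queries, rm, k: int = 5):
    # Combine the ranked lists returned for several queries: a passage that
    # appears near the top of many lists accumulates a higher fused score.
    scores = {}
    for q in queries:
        for rank, passage in enumerate(rm(q, k)):
            scores[passage] = scores.get(passage, 0.0) + 1.0 / (rank + 1)
    return sorted(scores, key=scores.get, reverse=True)[:k]
```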
Compositions and extensions in DSP allow for combining different steps and transformations. For example, a chatbot can be equipped with conversational multi-hop search by combining query rewriting and multi-hop transformation. Similar approaches can be used for spell correction or implementing pseudo-relevance feedback.
Overall, DSP provides a framework to enhance the SEARCH stage by exploring sophisticated strategies, incorporating fusion techniques, and allowing compositions and extensions for various NLP tasks.
DSP’s SEARCH stage using renewable energy as an example
Imagine you are working on a DSP program for infrastructure finance that aims to provide insights into renewable energy projects. The program utilizes a language model (LM) called InfraOptimus and a retrieval model (RM) for gathering relevant passages to support the transformations conducted by the LM.
In this example, the SEARCH stage plays a crucial role in finding passages from a large knowledge corpus related to renewable energy, such as web snippets, Wikipedia articles, or research papers from arXiv. The knowledge corpus is divided into text passages, making it easier to retrieve specific information. This division into passages facilitates factual responses, allows for updating the knowledge store without retraining the models, and provides transparency in terms of the information sources used.
In simple scenarios, the SEARCH stage can directly query the RM to obtain the top-k passages that match a given question. For instance, if the LM receives a question about the environmental impact of solar energy, the RM can retrieve the most relevant passages from the knowledge corpus that contain information about this topic.
However, in more complex tasks that involve multi-hop reasoning or conversational challenges, sophisticated SEARCH strategies are necessary to empower the RM in finding relevant passages. For example, if the LM receives a question about the economic viability of wind farms in a specific region, the SEARCH stage needs to employ advanced strategies to gather passages that address this query in a multi-step manner, considering factors like cost analysis, government policies, and local market conditions.
Previous research in retrieval-augmented natural language processing (NLP) has explored multi-hop and conversational search pipelines, often relying on manually labeled query rewrites, decompositions, or target hops. In contrast, with the automatic annotations from the DEMONSTRATE stage in DSP, the SEARCH stage can simulate and explore various strategies by exchanging queries, passages, and demonstrations between the RM and LM.
To improve recall and robustness, the retrieval results can be fused across multiple generated queries. Fusion, a technique in information retrieval, combines multiple retrieval lists into one comprehensive list, ensuring that relevant information from various sources is considered. The DSP framework includes a fused_retrieval primitive to facilitate this fusion process, enabling the program to leverage multiple generated queries and aggregate the retrieved passages effectively.
Furthermore, compositions and extensions in DSP allow for combining different steps and transformations to address specific NLP tasks. For instance, by combining query rewriting and multi-hop transformation, a chatbot in the infrastructure finance domain can be equipped with conversational multi-hop search capabilities, enabling it to engage in a dialogue with users and retrieve relevant information in a multi-step manner.
In summary, the SEARCH stage in the DSP framework is responsible for gathering relevant passages to support the LM's transformations. It utilizes strategies to retrieve passages from a large knowledge corpus, employs advanced techniques for complex tasks, and incorporates fusion methods to improve recall and robustness. With compositions and extensions, the SEARCH stage can be customized to address specific NLP tasks in infrastructure finance and other domains.
PREDICT in DSP generates reliable system output by aggregating information from demonstrations, passages, and predictions.
The PREDICT stage in DSP generates the system output using demonstrations and passages. Its main goal is to reliably solve the downstream task by aggregating information from multiple demonstrations, passages, and candidate predictions.
In PREDICT, the generate primitive is used to produce one or more candidate predictions for the end-task. It queries the LM to generate completions based on a template and example. The rank primitive, on the other hand, uses the RM to determine the relevance scores of a query and passages.
Multiple candidate predictions can be generated by sampling from the LM. Selecting the best prediction among them is a topic of decoding research. In DSP, strategies for selecting predictions and aggregating information involve the LM and RM.
One strategy is to extract the most popular prediction among the candidates. This method ensures self-consistency when a chain-of-thought rationale is used to arrive at the answer. DSP extends this by allowing multiple pipelines of transformations (PoT) within the program, which can involve different paths.
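A minimal sketch of that majority-vote idea, assuming lm_sample() is a hypothetical callable that returns one sampled completion per call: draw several candidates and keep the most frequent answer.

```python
from collections import Counter

def predict_by_majority(prompt: str, lm_sample, n: int = 5) -> str:
    # Draw n sampled completions and return the most frequent one.
    candidates = [lm_sample(prompt).strip() for _ in range(n)]
    answer, _ = Counter(candidates).most_common(1)[0]
    return answer
```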
When dealing with a small number of demonstrations or passages, they can be concatenated into the prompt. If there are more demonstrations or passages, the program can branch in parallel to process subsets of them and then aggregate the individual answers using scoring methods. Alternatively, information can be accumulated sequentially across passages, as done in the multi-hop approach.
DSP provides various aggregation strategies to handle different scenarios, considering the number of demonstrations, passages, and the need for sequential or parallel processing. These strategies enhance the reliability of the PREDICT stage in generating candidate predictions for the end-task.
DSP’s PREDICT stage using the financial viability of a toll road project connecting a city to an airport as an example
Imagine you are developing a DSP program for infrastructure finance that aims to predict the financial viability of a toll road project connecting a city to an airport. The program utilizes InfraOptimus and a RM to generate the system output.
In the PREDICT stage of the DSP program, the goal is to reliably determine the financial viability of the toll road project by aggregating information from multiple demonstrations, passages, and candidate predictions. The generate primitive is used to produce one or more candidate predictions by querying the LM based on a template and example. The rank primitive, on the other hand, utilizes the RM to determine the relevance scores of queries and passages.
To generate multiple candidate predictions, the program can sample from the LM using different strategies. Selecting the best prediction among these candidates is an important task. In the case of predicting the financial viability of the toll road project, the program can consider factors such as expected traffic volume, toll rates, construction costs, maintenance expenses, and projected revenue.
One strategy in the PREDICT stage is to extract the most promising prediction among the candidate completions. This can be achieved by considering the financial indicators, such as debt service coverage ratio (DSCR), project life coverage ratio (PLCR), or loan life cover ratio (LLCR) associated with the toll road project. DSP allows for exploring different prediction selection strategies to find the most reliable and financially sound outcome.
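As a purely hypothetical illustration of selecting among candidates on a financial indicator, the sketch below ranks candidate recommendations by DSCR, computed in the usual way as net operating income divided by debt service. The candidate texts, cash-flow figures, and the single-ratio decision rule are all invented for the example; a real program would weigh several indicators alongside retrieved evidence.

```python
def dscr(net_operating_income: float, debt_service: float) -> float:
    # Debt service coverage ratio: cash available for debt service divided by
    # the debt service due in the same period.
    return net_operating_income / debt_service

def select_by_dscr(candidates):
    # Each candidate is assumed to carry the cash-flow figures behind it.
    best = max(candidates,
               key=lambda c: dscr(c["net_operating_income"], c["debt_service"]))
    return best["recommendation"]

# Hypothetical candidate scenarios for the toll road (figures in USD millions).
candidates = [
    {"recommendation": "Proceed with senior debt at 70% gearing",
     "net_operating_income": 48.0, "debt_service": 30.0},   # DSCR = 1.60
    {"recommendation": "Proceed with 80% gearing plus a mezzanine tranche",
     "net_operating_income": 48.0, "debt_service": 41.0},   # DSCR ≈ 1.17
    {"recommendation": "Defer the investment pending an updated traffic study",
     "net_operating_income": 12.0, "debt_service": 10.0},   # DSCR = 1.20
]
print(select_by_dscr(candidates))  # -> "Proceed with senior debt at 70% gearing"
```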
When dealing with a large number of demonstrations or passages related to the toll road project, the program can branch in parallel to process subsets of them and then aggregate the individual answers using scoring methods. This parallel processing allows for the efficient analysis of a significant amount of information, including historical data on similar toll road projects, financial forecasts, and case studies.
DSP provides various aggregation strategies to handle different scenarios based on the number of demonstrations, passages, and the need for sequential or parallel processing. These strategies enhance the reliability of the PREDICT stage in generating candidate predictions for the financial viability of the toll road project.
In summary, the PREDICT stage in the DSP framework focuses on generating the system output by aggregating information from demonstrations, passages, and candidate predictions. It utilizes the generate primitive to produce potential completions and the rank primitive to determine the relevance scores. The program explores strategies for selecting predictions based on financial indicators, such as debt service coverage ratio (DSCR), project life coverage ratio (PLCR), or loan life cover ratio (LLCR), to assess the financial viability of the toll road project. By branching in parallel and employing aggregation methods, DSP enables efficient analysis of a large amount of data, enhancing the accuracy and reliability of the predictions for infrastructure finance, specifically in assessing the financial viability of a toll road to an airport.
DSP programs outperformed baselines in open-domain, multi-hop, and conversational question answering tasks, showcasing their effectiveness.
In the evaluation section, the authors discuss the implementation and assessment of DSP programs for three different NLP tasks: open-domain question answering (QA), multi-hop QA, and conversational QA. The researchers sought to determine the effectiveness of DSP programs compared to other methods.
For each task, the authors utilized specific datasets. The open-domain version of SQuAD was employed for open-domain QA, the multi-hop HotPotQA dataset for multi-hop QA, and the conversational question answering QReCC dataset for conversational QA. The evaluation primarily focused on measuring the accuracy of the DSP programs on a validation set. Each DSP program underwent training using a 16-shot training set, and the authors reported the average quality across five different training runs.
The DSP programs leveraged pretrained modules, with ColBERTv2 serving as the retriever module (RM) and GPT-3.5 (text-davinci-002) as the language model module (LM). ColBERTv2 was chosen for its effective search quality, while GPT-3.5 was selected for its strong performance. The programs employed various techniques for generating predictions, such as greedy decoding for single predictions and sampling with a temperature of 0.7 for multiple predictions.
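For readers unfamiliar with those two decoding modes, the toy function below contrasts them: greedy decoding picks the highest-scoring token, while temperature sampling rescales the scores (here with the 0.7 temperature mentioned above) before drawing at random. The token strings and logits are hypothetical, and this is only a conceptual illustration, not how GPT-3.5 is actually invoked.

```python
import math
import random
from typing import Dict, Optional

def sample_token(logits: Dict[str, float], temperature: Optional[float] = None) -> str:
    if temperature is None:
        # Greedy decoding: always take the highest-scoring token.
        return max(logits, key=logits.get)
    # Temperature sampling: rescale, softmax, then draw one token at random.
    scaled = {tok: score / temperature for tok, score in logits.items()}
    z = sum(math.exp(v) for v in scaled.values())
    weights = [math.exp(v) / z for v in scaled.values()]
    return random.choices(list(scaled.keys()), weights=weights, k=1)[0]

# Hypothetical next-token scores for a short answer about a coverage ratio.
logits = {"1.45": 2.1, "1.60": 1.9, "insufficient": 0.3}
print(sample_token(logits))                    # greedy -> "1.45"
print(sample_token(logits, temperature=0.7))   # stochastic; usually "1.45" or "1.60"
```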
To compare the effectiveness of the DSP approach, the authors employed several baselines: Vanilla LM, Retrieve-then-Read, and Self-ask. The Vanilla LM baselines followed a few-shot learning paradigm, Retrieve-then-Read used the RM for passage retrieval, and Self-ask represented a specific implementation of the SEARCH stage in DSP. These baselines served as reference points for evaluating the performance of the DSP programs.
Each DSP program for the different tasks possessed unique structures and transformations. In open-domain QA, the SEARCH stage focused on retrieving relevant passages, the PREDICT stage generated reasoning chains, and the DEMONSTRATE stage incorporated demonstrations. In multi-hop QA, similar approaches were applied, but with the inclusion of result fusion and multiple hops. In conversational QA, the PREDICT stage generated responses considering previous turns and retrieved passages, the SEARCH stage involved query rewriting, and the DEMONSTRATE stage included sampled conversational turns.
The evaluation results consistently demonstrated that the DSP programs outperformed the baselines. In Open-SQuAD, the DSP program achieved a relative gain of 126% in exact match compared to the vanilla LM baseline and outperformed the retrieve-then-read pipeline. In HotPotQA, the DSP program surpassed all baselines by significant margins, illustrating the effectiveness of coordinating the LM and RM. Similar positive results were observed in QReCC. The DSP programs consistently outperformed the baselines and demonstrated competitive performance compared to other approaches, highlighting their effectiveness in these tasks.
DSP framework transforms AI system development by enabling in-context learning, simplifying building, encouraging participation, and improving performance.
Traditionally, AI models were built using tensor representations and highly modular designs. However, these approaches require extensive domain expertise, and challenges arise when combining different pretrained components. In-context learning offers a new paradigm where complex systems can be built using natural language instructions and pretrained models. This paradigm allows for broader participation in AI system development, rapid prototyping, and leveraging specialized pretrained components.
The DEMONSTRATE–SEARCH–PREDICT (DSP) framework introduced in the paper enables retrieval augmented in-context learning. DSP provides simple and composable functions for implementing in-context learning systems as deliberate programs. The framework was implemented as a Python library and used to write programs for various tasks, resulting in significant performance improvements compared to previous approaches. Beyond performance, the key contribution of DSP lies in uncovering a wide range of conceptual possibilities for in-context learning in general.
The DSP framework will significantly impact InfraOptimus's development and architecture by enabling retrieval augmented in-context learning. It simplifies system building, encourages broader participation, and achieves notable performance improvements. It uncovers new possibilities for in-context learning in AI systems.
If you're interested in learning more about the potential use cases of InfraOptimus and the infrastructure finance DSP program, stay tuned for the next article. Please share your thoughts on this article below or in DM. Follow Infrafintech Intelligence on LinkedIn.
SOURCE
Omar Khattab, Keshav Santhanam, Xiang Lisa Li, David Hall, Percy Liang, Christopher Potts, and Matei Zaharia, "Demonstrate-Search-Predict: Composing retrieval and language models for knowledge-intensive NLP," https://arxiv.org/abs/2212.14024v2
David Doré is the founder of Infrafintech Intelligence, a technology market intelligence platform for the infrafintech industry. Infrafintech refers to the application of advanced technologies, including AI, big data analytics, machine learning, natural language processing, and tokenization, to the provision of debt and equity for energy and infrastructure assets such as roads, power plants, and data centers. Infrafintech Intelligence connects fintech investors, technologists, and finance professionals to overcome humanity's most pressing challenges in infrastructure. Our actionable data, nuanced insights, and in-depth analysis provide individuals in pursuit of innovative solutions with the intelligence tools to make informed decisions, mitigate risks, and capitalize on emerging opportunities in the fast-evolving world of infrafintech.