Navigating AI Complexity with a custom ChainOfThought Class
Chain of Thought multiplied!

Navigating the maze of GPT and LLM applications, this article presents a cost-effective approach to tackling complex problems with higher accuracy. Drawing inspiration from 'Chain of Thought Prompting' and harnessing the popular LLM framework LangChain, we venture beyond the known: the spotlight shines on an extension of LangChain's LLMChain class, our own ChainOfThought class, which promises more accurate problem-solving and streamlined debugging. We will first unpack the concept of 'Chain of Thought Prompting', then delve into LangChain and its Chains, crucial stepping stones to understanding and using the ChainOfThought class.

Chain Of Thought Prompting

In January 2022, the Google Brain team released a groundbreaking research paper on "Chain of Thought Prompting" (Link to the paper). The technique introduced in this paper is a novel approach to enhance the reasoning capabilities of large language models (LLMs), especially in multi-step reasoning tasks.

In contrast to the standard prompting, where models are asked to directly produce the final answer, 'Chain of Thought Prompting' encourages LLMs to generate intermediate reasoning steps before providing the final answer to a problem. The advantage of this technique lies in its ability to break down complex problems into manageable, intermediate steps. By doing this, the model-generated 'chain of thought' can mimic an intuitive human thought process when working through multi-step problems.

An example of one-shot chain of thought prompting:

[Image: Chain of Thought Prompting, the Google Brain example]

Or zero-shot chain of thought prompting:

[Image: zero-shot chain of thought prompting example]
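In text form, a minimal sketch of both styles looks like this. The one-shot prompt paraphrases the canonical tennis-ball word problem from the Google Brain paper; the zero-shot variant uses the "Let's think step by step" trigger popularized by later zero-shot CoT work. The exact wording here is illustrative, not taken from this article's experiments:

```python
# One-shot chain of thought prompting: the worked example demonstrates the
# intermediate reasoning we want the model to imitate for the new question.
one_shot_cot = (
    "Q: Roger has 5 tennis balls. He buys 2 more cans of tennis balls. "
    "Each can has 3 tennis balls. How many tennis balls does he have now?\n"
    "A: Roger started with 5 balls. 2 cans of 3 balls each is 6 balls. "
    "5 + 6 = 11. The answer is 11.\n"
    "Q: The cafeteria had 23 apples. If they used 20 and bought 6 more, "
    "how many apples do they have?\n"
    "A:"
)

# Zero-shot chain of thought prompting: no worked example, just a trigger
# phrase that nudges the model into producing reasoning steps.
zero_shot_cot = (
    "Q: The cafeteria had 23 apples. If they used 20 and bought 6 more, "
    "how many apples do they have?\n"
    "A: Let's think step by step."
)
```

Either prompt would be sent to the model as-is; the model then continues the text after the final "A:".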

The 'Chain of Thought' technique has emerged as a significant breakthrough: it can considerably improve the accuracy of many prompts. Nevertheless, it presents developers with a fresh challenge. The temptation is to use the model's complete prediction output directly in our applications, but that would be imprudent: applications usually care only about the final result, not the steps the LLM took to reach that conclusion.

# Includes reasoning alongside the final result!
result = llm.predict(prompt)

Can we solve this in a neat way? What if we wish to create a whole pipeline of complex chain of thought problems, passing each step's result to the next?
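To make the problem concrete, here is a hedged sketch of what a raw chain-of-thought completion looks like (the output text below is invented for illustration), and the naive string surgery we would otherwise need to recover just the answer:

```python
# A typical raw completion from a chain-of-thought prompt: reasoning steps
# first, then the final answer after a delimiter such as "Conclusion:".
raw_result = (
    "- The size 11 shoe print suggests large feet.\n"
    "- The festival ticket stub suggests an interest in French cinema.\n"
    "Conclusion:\n"
    "1. The suspect has large feet.\n"
    "2. The suspect is interested in French cinema."
)

# Downstream code usually wants only the part after the delimiter,
# not the reasoning that precedes it.
answer = raw_result.split("Conclusion:", 1)[-1].strip()
print(answer)
```

Scattering ad-hoc `split` calls like this across a pipeline of chained prompts gets messy fast, which is exactly the pain point the rest of this article addresses.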

LangChain

LangChain (link), a library initiated by Harrison Chase in late 2022, is a framework built around Large Language Models (LLMs), with numerous applications in chatbots, generative question answering, summarization, and beyond. Supported in both Python and JS, LangChain's fundamental idea is to chain various components together to create advanced LLM use cases. These components include prompt templates, LLMs such as GPT-3 or BLOOM, agents that use an LLM to decide which actions to take, and memory.

LangChain Chains

At the core of LangChain is the 'Chain' interface. A Chain is defined as a sequence of calls to components, which could include other chains.

Consider a simple example where we use an LLM (large language model) like GPT-3 to generate a company name given a certain product type. For this, we can use LangChain's LLMChain class, which is designed to chain an LLM with a prompt template.

Here's a basic example:

from langchain.llms import OpenAI
from langchain.prompts import PromptTemplate
from langchain.chains import LLMChain

llm = OpenAI(temperature=0.9)
prompt = PromptTemplate(
    input_variables=["product"],
    template="What is a good name for a company that makes {product}?",
)
chain = LLMChain(llm=llm, prompt=prompt)

# Run the chain, specifying only the input variable.
print(chain.run("colorful socks"))  # Example output: Colorful Toes Co.

LangChain's SimpleSequentialChain is designed for scenarios where you need to run multiple chains in sequence. The aim is not just to execute these chains one after the other but, more importantly, to feed the output of one chain as the input to the next. This makes SimpleSequentialChain a powerful tool for multi-step problem-solving tasks.

Imagine a relay race, where each runner (chain) runs a stretch (computes a task), then hands the baton (the output) to the next runner. That's what happens in a SimpleSequentialChain. The race starts with an initial input. This input gets processed by the first chain, which creates an output. This output now serves as the input for the next chain in line. This sequence continues until all chains have had their turn, leading to a final output.

For example:

from langchain.chains import SimpleSequentialChain

prompt1 = PromptTemplate(
    input_variables=["product"],
    template="What is a good name for a company that makes {product}?",
)
prompt2 = PromptTemplate(
    input_variables=["company_name"],
    template="What is a catchy slogan for {company_name}?",
)

# Define two chains
chain1 = LLMChain(llm=llm, prompt=prompt1)
chain2 = LLMChain(llm=llm, prompt=prompt2)

# Combine the two chains into a SimpleSequentialChain
seq_chain = SimpleSequentialChain(chains=[chain1, chain2])

# Run the combined chain, specifying only the first chain's input.
print(seq_chain.run("colorful socks"))  # Example output: Colorful Toes Co. - Making Your Feet Happy!

Output Parsing in LangChain: A Double-Edged Sword

LangChain's output parser (link) is a handy tool for converting well-structured string results, such as those ending with a JSON-like output, into a Python dictionary. This provides an easy way to handle and manipulate the LLM's output. However, this convenience can turn into a hurdle when we venture into more complex territory, particularly with the Chain of Thought Prompting technique.

Despite the flexibility to create a custom regex-based OutputParser, the process can become messy, particularly when parsing a part of the result. Furthermore, integrating this with a SimpleSequentialChain introduces additional complexity. The sequential processing of tasks can lead to situations where the OutputParser cannot accurately parse each chain's output, leading to difficulties in obtaining the desired structured output.

In such intricate scenarios, we need a more flexible and robust solution. This brings us to the proposal of a new class, the ChainOfThought class, designed to seamlessly handle sequential processing and chain of thought prompting. Let's dive into how this class can help mitigate these challenges.

Custom ChainOfThought chain class

In response to the challenges of incorporating the 'Chain of Thought Prompting' technique within sequential language model operations, allow me to introduce the custom Chain of Thought (CoT) class. This class aims to ease the parsing of intermediate reasoning steps and the final result from the model's output, thus producing a structured output better suited to specific application requirements. The CoT class simplifies the handling of complex multi-step reasoning tasks, providing a valuable addition to the toolkit for advanced Large Language Model applications.

import re
from typing import Any, Dict, List, Optional

from pydantic import Extra
from langchain.chains import LLMChain
from langchain.schema import LLMResult
from langchain.callbacks.manager import CallbackManagerForChainRun


class CoTChain(LLMChain):
    def __init__(self, **kwargs):
        # 1. Custom delimiter for your specific format
        result_delimiter = kwargs.pop('result_delimiter', None)
        if not result_delimiter:
            raise ValueError("Missing result_delimiter as the final answer delimiter of the chain of thought")
        super().__init__(**kwargs)
        self.result_delimiter = result_delimiter

    class Config:
        # Allow extra fields like 'result_delimiter'
        extra = Extra.allow

    def _call(self, inputs: Dict[str, Any], run_manager: Optional[CallbackManagerForChainRun] = None,) -> Dict[str, str]:
        response = self.generate([inputs], run_manager=run_manager)
        return self.create_outputs(response, run_manager)[0]

    def create_outputs(self, llm_result: LLMResult, run_manager: Optional[CallbackManagerForChainRun] = None,) -> List[Dict[str, Any]]:
        text = llm_result.generations[0][0].text
        # 2. Auto parsing of chain of thought by delimiter and the actual output
        output, chain_of_thought = self.extract_chain_of_thought_and_result(text)
        result = [{self.output_key: output, "full_generation": text}]

        if run_manager and self.verbose:  # Log chain of thought when verbose = True
            run_manager.on_text(f"\n\n\033[1m> Chain of Thought:\n{chain_of_thought}...\033[0m", end="\n")

        return result

    def extract_chain_of_thought_and_result(self, text):
        # 3. Extension possibilities - a more complex regex parser for your own use case
        cot_pattern = f"(.*){self.result_delimiter}"
        result_pattern = f"{self.result_delimiter}(.*)"

        steps_match = re.search(cot_pattern, text, re.DOTALL)
        result_match = re.search(result_pattern, text, re.DOTALL)

        if steps_match and result_match:
            chain_of_thought = steps_match.group(1).strip()
            output = result_match.group(1).strip()
        else:
            chain_of_thought = ""
            output = text

        return output, chain_of_thought

This class offers three major benefits:

  1. Custom delimiter: You can specify a delimiter to separate the chain of thought from the final result.
  2. Automatic parsing: The CoTChain class can automatically parse the model's output into the chain of thought and the final result. In verbose mode, it also logs the chain of thought for debugging purposes.
  3. Extensibility: You can introduce a more complex regex parser for specific use cases, offering flexibility for custom formatting needs.
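The delimiter-based parsing at the heart of the class can be exercised in isolation. Below is a self-contained sketch of the same regex logic, outside of LangChain, so you can sanity-check it against your own prompt format (the sample text is invented for illustration):

```python
import re


def extract_chain_of_thought_and_result(text, result_delimiter="Conclusion:"):
    """Split a completion into (final result, reasoning steps) around a delimiter."""
    steps_match = re.search(f"(.*){result_delimiter}", text, re.DOTALL)
    result_match = re.search(f"{result_delimiter}(.*)", text, re.DOTALL)
    if steps_match and result_match:
        return result_match.group(1).strip(), steps_match.group(1).strip()
    # No delimiter found: treat the whole text as the result.
    return text, ""


text = "- Size 11 print implies large feet.\nConclusion:\n1. The suspect has large feet."
output, cot = extract_chain_of_thought_and_result(text)
print(output)  # 1. The suspect has large feet.
print(cot)     # - Size 11 print implies large feet.
```

Note that the delimiter is interpolated into a regex unescaped, which is fine for a plain keyword like "Conclusion:" but would need `re.escape` if your delimiter contains regex metacharacters.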

After all, nothing beats a good example. So let's imagine we're developing an AI bot designed to assist with a police investigation, where multi-step reasoning is crucial. Our bot will need to play two roles: a 'forensic detective' and an 'investigation coordinator'. Each has its own distinct chain of thought.

[Image: Forensic detective with an Investigation coordinator]

Let's take a look at both of the bot roles, each using a one-shot chain of thought prompt (silly, I know):

Forensic Detective

You are a forensic detective. Given the forensic evidence, it is your job to infer the possible features of the suspect.

Here are some examples of how I infer features from evidence:

Evidence: The suspect left a size 9 shoe print, a rare foreign coin and a receipt from a vegan restaurant.
Inference: 
- From the size 9 shoe print, we can infer that the suspect likely has relatively large feet as size 9 is above average.
- The presence of a rare foreign coin might suggest the suspect has recently travelled abroad or has a keen interest in foreign currencies.
- A receipt from a vegan restaurant could indicate that the suspect follows a vegan diet.
Conclusion: 
1. The suspect has large feet.
2. The suspect may have recently travelled abroad.
3. The suspect is likely vegan.

Now let's analyze a new piece of evidence:
Forensic Evidence: {evidence}
Inference:
->>>>>>>>>> Model's next word prediction starts here

Investigation Coordinator

You are an investigation coordinator. Given the attributes of a suspect, it is your job to suggest possible investigation paths.

Here are some examples of how I suggest investigation paths from forensic leads:

Leads: 
The suspect has large feet, may have recently travelled abroad, and is likely vegan.
Inference: 
- From the large feet, we can contact shoe stores that sell larger sizes and inquire about any recent purchases.
- If the suspect has recently travelled abroad, we can coordinate with border agencies to get travel records.
- Being vegan, we could visit local vegan restaurants or grocery stores and inquire if they've noticed any individual who might fit the description.
Conclusion: 
1. Contact shoe stores selling larger sizes.
2. Request travel records from border agencies.
3. Visit local vegan restaurants and grocery stores for potential leads.

Now let's analyze new leads:
Leads:
{forensic_leads}
Inference:
->>>>>>>>>> Model's next word prediction starts here

Note that in both templates, the keyword "Inference:" initiates the chain of thought the model needs to generate, and the actual result we are interested in appears after the "Conclusion:" keyword.

In action:

from langchain.chains import SequentialChain

forensic_chain = CoTChain(
    llm=llm, 
    prompt=forensic_detective_template, 
    output_key="forensic_leads", 
    verbose=True, 
    result_delimiter="Conclusion:"
)
coordinator_chain = CoTChain(
    llm=llm, 
    prompt=coordinator_prompt, 
    output_key="follow_ups", 
    verbose=True, 
    result_delimiter="Conclusion:"
)

investigation_chain = SequentialChain(
    chains=[forensic_chain, coordinator_chain],
    input_variables=["evidence"],
    output_variables=["forensic_leads", "follow_ups"],
    verbose=True
)

new_evidence = "The suspect left a size 11 shoe print, a ticket stub from a French film festival, and a membership card to a local astronomy club."
result = investigation_chain(new_evidence)        

Results:

***** Evidence *****
The suspect left a size 11 shoe print, a ticket stub from a French film festival, and a membership card to a local astronomy club.
***** Forensic Leads *****
1. The suspect has large feet.
2. The suspect has an interest in French culture or cinema.
3. The suspect has an interest in astronomy or science.
***** Follow Ups *****
1. Contact shoe stores selling larger sizes.
2. Investigate any recent purchases related to French culture or cinema.
3. Obtain any recent purchases related to astronomy or science.

Meanwhile, the chain of thought itself was printed out with verbose=True for debugging purposes:

> Chain of Thought (Forensic Detective)
- From the size 11 shoe print, we can infer that the suspect likely has relatively large feet as size 11 is above average.
- The ticket stub from a French film festival might suggest that the suspect has an interest in French culture or cinema.
- A membership card to a local astronomy club could indicate that the suspect has an interest in astronomy or science.

> Chain of Thought (Investigation Coordinator)
- From the large feet, we can contact shoe stores that sell larger sizes and inquire about any recent purchases.
- We can also try to investigate any recent purchases related to French culture or cinema such as books, movies, or theatre tickets.
- If the suspect has an interest in astronomy or science, we can try to obtain any recent purchases related to these fields such as telescopes or books on astrophysics.

Final words

Indeed, the power of the Chain of Thought approach extends far beyond the playful example of a police investigation bot. It's a technique with substantial practical applications, particularly in the realm of complex Natural Language Processing tasks, such as Text-to-SQL translation.

Consider this, for instance: Text-to-SQL tasks involve converting natural language queries into their SQL counterparts, a task that can be complex and nuanced. In recent research (link), it was shown that by breaking down this task into smaller, more manageable sub-problems, the performance of Large Language Models (LLMs) on challenging Text-to-SQL datasets, such as Spider, can be significantly enhanced.

Cost-effectiveness is another aspect worth highlighting. While the superior performance of GPT-4 is widely acknowledged, it comes with higher costs which might not be feasible for all users or projects. In such cases, the Chain of Thought technique offers a compelling alternative. By enhancing the reasoning capabilities of the more affordable GPT-3.5 model, we can approach complex tasks with a level of effectiveness that might rival that of GPT-4.

The potential for real-world applications is exciting. When combined with the Chain of Thought technique multiple times, we can push the boundaries of what LLMs can accomplish, enabling them to handle complex reasoning tasks with more finesse. Thus, I look forward to further explorations and innovative uses of this technique in various AI applications.

Feel free to reach out if you have any thoughts!

Thanks,

Shai
