Chain of Thought: Verification and Density Lead to Smarter LLM Responses

LLMs are capable of generating human-quality text, code, and other creative content. However, they are also prone to hallucinations: the generation of plausible but factually incorrect information. This is a major obstacle to the widespread use of LLMs in production applications.

Two promising new techniques, Chain-of-Verification (CoVe) and Chain-of-Density (CoD), are poised to address some of the key challenges facing LLMs. CoVe is a structured process that helps LLMs to double-check their own work, ensuring that their responses are accurate and reliable. CoD, on the other hand, is a technique for controlling the density of information in generated text, ensuring that summaries are neither too sparse nor too dense.

Before delving into the solutions provided by CoVe and CoD, it's crucial to understand the challenges that LLMs currently face.

  • Hallucinations: One of the foremost challenges is the generation of plausible yet incorrect factual information, often referred to as "hallucinations." LLMs have a tendency to generate information that sounds reasonable but is, in fact, inaccurate. This issue poses a significant obstacle in achieving reliable AI-generated text.
  • Information Density: In tasks such as text summarization, determining the right balance of information density is a challenge. Summaries that are too sparse may lack essential details, while overly dense summaries can be overwhelming. Striking the right balance is vital for comprehension and usability.
  • Task-Specific Quality: LLMs may excel in some tasks but fall short in others. Achieving consistent quality across diverse tasks remains a challenge. Task-specific nuances often require tailored approaches.

CoVe is a four-step process that helps LLMs generate more reliable responses; a minimal code sketch of the loop follows the list:

  1. Drafting responses: The LLM generates an initial response to the prompt.
  2. Planning verification questions: The LLM identifies key factual claims in its response and generates a set of verification questions.
  3. Answering them independently: The LLM answers the verification questions independently of the original draft, using its knowledge base and external sources, so that errors in the draft do not bias the answers.
  4. Generating a final verified response: The LLM generates a final response based on the initial response and the answers to the verification questions.
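
To make the four steps concrete, here is a minimal Python sketch of the CoVe loop. It is an illustration rather than the exact prompting from the CoVe paper, and it assumes a generic `call_llm(prompt)` placeholder that you would replace with your model provider's API:

```python
def call_llm(prompt: str) -> str:
    """Placeholder: replace with a call to your LLM provider's API."""
    raise NotImplementedError("Wire this up to your model of choice.")


def chain_of_verification(user_query: str) -> str:
    # Step 1: draft an initial response.
    draft = call_llm(f"Answer the following question:\n{user_query}")

    # Step 2: plan verification questions that target the draft's factual claims.
    raw = call_llm(
        "List verification questions, one per line, that check each factual "
        f"claim in this response:\n{draft}"
    )
    questions = [q.strip() for q in raw.splitlines() if q.strip()]

    # Step 3: answer each question independently, without showing the draft,
    # so errors in the draft cannot bias the answers.
    answers = [call_llm(f"Answer concisely and factually:\n{q}") for q in questions]

    # Step 4: regenerate the response, revised against the verified answers.
    qa_block = "\n".join(f"Q: {q}\nA: {a}" for q, a in zip(questions, answers))
    return call_llm(
        f"Question:\n{user_query}\n\nDraft answer:\n{draft}\n\n"
        f"Verification Q&A:\n{qa_block}\n\n"
        "Rewrite the draft so every claim is consistent with the verified answers."
    )
```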

The CoD process is a step-by-step approach to generating summaries with a desired level of information density; a code sketch of the loop follows the steps:

  1. Identify the Text for Summarization: Choose the document, article, or any piece of text that you wish to summarize.
  2. Craft the Initial Prompt: Create an initial summarization prompt tailored to the selected text. The aim here is to guide the LLM towards generating a basic summary.
  3. Analyze the Initial Summary: Review the summary generated from the initial prompt. Identify if the summary is too sparse (missing key details) or too dense (containing unnecessary details).
  4. Design Chained Prompts: Based on the initial summary's density, construct additional prompts to adjust the level of detail in the summary. These are the "chained prompts" and are central to the Chain of Density technique.
  5. Execute Chained Prompts: Feed these chained prompts back to the LLM. These prompts are designed to either increase the density by adding essential details or decrease it by removing non-essential information.
  6. Review the Adjusted Summary: Examine the new summary generated by executing the chained prompts. Ensure that it captures all essential points while avoiding unnecessary details.
  7. Iterate if Necessary: If the summary still doesn't meet the desired criteria for information density, return to step 4 and adjust the chained prompts accordingly.
  8. Finalize the Summary: Once the summary meets the desired level of information density, it is considered finalized and ready for use.
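
A minimal Python sketch of this loop, shown for the densifying direction of step 5 only, again assuming a placeholder `call_llm(prompt)` helper and illustrative prompt wording:

```python
def call_llm(prompt: str) -> str:
    """Placeholder: replace with a call to your LLM provider's API."""
    raise NotImplementedError("Wire this up to your model of choice.")


def chain_of_density(text: str, rounds: int = 3) -> list[str]:
    """Return the summary from each round; the last is the densest."""
    # Steps 1-2: take the chosen text and generate a basic first summary.
    summary = call_llm(f"Write a short, general summary of the following text:\n{text}")
    history = [summary]
    for _ in range(rounds):
        # Steps 4-5: a chained prompt that folds missing details into a
        # rewrite of the same length.
        summary = call_llm(
            f"Source text:\n{text}\n\nPrevious summary:\n{summary}\n\n"
            "Identify 1-3 important details from the source that are missing "
            "from the previous summary, then rewrite the summary at the same "
            "length so it also covers them."
        )
        # Steps 6-7: keep every revision so the density of each round can be
        # reviewed and the loop repeated until the target is met.
        history.append(summary)
    return history
```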

Example

Let's start our exploration with a practical example from the field of banking. Imagine a customer walks into a bank or accesses their online services and poses a common query: "Can you tell me the process for applying for a credit card?" This seemingly straightforward question highlights some of the complexities and challenges that LLMs must overcome in delivering accurate, reliable, and appropriately dense responses.

Customer Query: "Can you tell me the process for applying for a credit card?"

Without Using CoVe or CoD:

Response: "To apply for a credit card, you can visit our website or one of our local branches. The eligibility criteria include a minimum age of 21, a stable source of income, and a good credit score. You will need to fill out an application form and provide necessary documents. After submission, our team will review your application, and if approved, you will receive your credit card in 7-10 business days."

Using CoVe:

CoVe Prompt:

"Explain the process for applying for a credit card."

  • Initial Draft: "To apply for a credit card, you can visit our website or a local branch. You should be at least 21 years old, have a steady income, and a good credit score. Fill out an application and provide required documents. Our team reviews the application, and if approved, you'll receive the card in 7-10 business days."
  • Verification Questions:
      • Is the minimum age requirement stated accurately?
      • Is the requirement for a steady income mentioned?
      • Is there information about the required documents?
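
The sketch below carries this example through the remaining two CoVe steps: the questions are answered independently against a trusted source, and the draft is then revised. The `policy` string is a hypothetical stand-in for the bank's real policy documents, and `call_llm` is again a placeholder for your model API:

```python
def call_llm(prompt: str) -> str:
    """Placeholder: replace with a call to your LLM provider's API."""
    raise NotImplementedError("Wire this up to your model of choice.")


# Step 1: the initial draft shown above.
draft = call_llm("Explain the process for applying for a credit card.")

# Step 2: the verification questions from the example.
questions = [
    "Is the minimum age requirement stated accurately?",
    "Is the requirement for a steady income mentioned?",
    "Is there information about the required documents?",
]

# Step 3: answer each question against a trusted source rather than the
# draft itself, so the draft's own errors cannot bias the answers.
policy = "Minimum age 21; steady income required; ID and proof of income needed."
answers = [
    call_llm(f"Using only this policy text:\n{policy}\n\nAnswer: {q}")
    for q in questions
]

# Step 4: regenerate the response, corrected against the verified answers.
final = call_llm(
    f"Draft:\n{draft}\n\nVerified answers:\n" + "\n".join(answers)
    + "\n\nRewrite the draft so every claim matches the verified answers."
)
```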

While accuracy is a crucial aspect, the challenge of information density cannot be overlooked. In tasks like text summarization, striking the right balance between being too sparse and overly dense is essential for comprehensibility and usability. This is where CoD comes into play.

CoD Prompt:

You will generate increasingly concise, entity-dense summaries of the provided information.

Repeat the following 2 steps 5 times.

Step 1. Identify 1-3 informative Entities ("; " delimited) from the Initial Summary which are missing from the previously generated summary.

Step 2. Write a new, denser summary of identical length which covers every entity and detail from the previous summary plus the Missing Entities.

A Missing Entity is:

  • Relevant: to the credit card application process.
  • Specific: descriptive yet concise (5 words or fewer).
  • Novel: not in the previous summary.
  • Faithful: present in the Initial Summary.
  • Anywhere: located anywhere in the Initial Summary.

Guidelines:

  • The first summary should be long (4-5 sentences, ~80 words) yet highly non-specific, containing little information beyond the entities marked as missing. Use overly verbose language and fillers (e.g., "the application process involves") to reach ~80 words.
  • Make every word count: rewrite the previous summary to improve flow and make space for additional entities.
  • Make space with fusion, compression, and removal of uninformative filler phrases like "the application process involves".
  • The summaries should become highly dense and concise yet self-contained, e.g., easily understood without the Initial Summary.
  • Missing entities can appear anywhere in the new summary.
  • Never drop entities from the previous summary. If space cannot be made, add fewer new entities.

Remember, use the exact same number of words for each summary.
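
In practice, this prompt can be sent to the model in a single call together with the Initial Summary. A minimal sketch, reusing the placeholder `call_llm` helper and the bank's draft response from the example as the Initial Summary:

```python
def call_llm(prompt: str) -> str:
    """Placeholder: replace with a call to your LLM provider's API."""
    raise NotImplementedError("Wire this up to your model of choice.")


# The core CoD instructions from above.
COD_PROMPT = """You will generate increasingly concise, entity-dense summaries \
of the provided information.

Repeat the following 2 steps 5 times.
Step 1. Identify 1-3 informative Entities ("; " delimited) from the Initial \
Summary which are missing from the previously generated summary.
Step 2. Write a new, denser summary of identical length which covers every \
entity and detail from the previous summary plus the Missing Entities."""

# The bank's initial draft serves as the Initial Summary for this example.
initial_summary = (
    "To apply for a credit card, you can visit our website or a local branch. "
    "You should be at least 21 years old, have a steady income, and a good "
    "credit score. Fill out an application and provide required documents. "
    "Our team reviews the application, and if approved, you'll receive the "
    "card in 7-10 business days."
)

response = call_llm(f"{COD_PROMPT}\n\nInitial Summary:\n{initial_summary}")
print(response)
```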

By combining CoVe and CoD (a combined sketch follows the list), we can build LLM applications that:

  • Generate accurate and reliable responses, even to complex and challenging queries.
  • Produce informative and concise summaries of long texts, tailored to the specific needs of the user.
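
One plausible way to compose the two techniques, again assuming the placeholder `call_llm` helper: run a condensed CoVe pass first so that densification starts from verified facts, then apply CoD rounds to the verified summary:

```python
def call_llm(prompt: str) -> str:
    """Placeholder: replace with a call to your LLM provider's API."""
    raise NotImplementedError("Wire this up to your model of choice.")


def verified_dense_summary(text: str, rounds: int = 3) -> str:
    # CoVe pass, condensed into one call here (see the four-step sketch
    # earlier for the full draft/question/answer/revise loop).
    summary = call_llm(
        "Summarize the following text, list and answer verification questions "
        f"for each claim, then output only the corrected summary:\n{text}"
    )
    # CoD pass: densify the verified summary at a fixed length.
    for _ in range(rounds):
        summary = call_llm(
            "Rewrite this summary at the same length, adding 1-3 informative "
            f"details from the source.\nSource:\n{text}\n\nSummary:\n{summary}"
        )
    return summary
```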

This combination of techniques has the potential to transform the way we interact with computers and information.

For example, imagine a future where LLMs are used to generate personalized news summaries, translate languages in real time, and even write creative content such as poems and stories. With CoVe and CoD, these LLMs could be trusted to provide us with accurate and reliable information, tailored to our individual needs and interests.

The journey of LLMs is far from over, and the future holds exciting possibilities:

  • Efficiency and Accessibility

Researchers are actively exploring ways to make LLMs more efficient and less computationally expensive. This effort aims to democratize access to LLM technology and reduce the barriers to entry for innovation.

  • Ethical AI

Ethical considerations will continue to be a central focus. LLMs will undergo rigorous training and fine-tuning with a strong emphasis on ethical principles, reducing biases, promoting fairness, and ensuring responsible AI usage.

  • Task-Specific Quality

Future advancements will lead to LLMs that excel in a wide range of tasks. They will adapt seamlessly to new domains and demonstrate consistent quality in diverse applications.

In the meantime, by using CoVe and CoD, we can help ensure that LLM outputs are more truthful, coherent, and appropriately dense.

