登录查看更多内容

Chatting with Documents: The Future of Search in 2023

Rob Grzywinski

Working on some seriously cool stuff!

发布日期: 2023年3月22日

The big thing that you're going to see in 2023 (other than the advancement of Large Language Models (LLMs) in general) is the ability to chat with documents. Bing's new ChatGPT-powered search is a great example of this. Microsoft has extended ChatGPT's "memory" with all of the documents that Bing has crawled across the internet. If you work with a lot of research PDF's like I do, ChatPDF is invaluable. ChatPDF allows you to upload a PDF and then ask questions of it via ChatGPT. The Bing extension available in Edge combines the features of both ChatGPT + Bing as well as ChatPDF. (Personal aside: I prefer ChatPDF over Edge + Bing + PDF in this case as Bing may also try to search the internet when I want it to only focus on a single document. This interface will absolutely get better over time but for now but in this case ChatPDF as a unitasker is perfect for me!)

(You can try out ChatPDF on this post if you'd like! Do a "Print ..." "to PDF" on this page and upload into ChatPDF. Your milage may vary simply because LinkedIn does not print well to PDF. ChatPDF works quite well with research PDFs.)

No alt text provided for this image — Example use of ChatPDF on this post

Let's dig into how Bing or ChatPDF might work. Remember that ChatGPT only knows what you tell it or what it has been trained on (which cuts off around September 2021). As I covered in my "In-Context Learning" post, ChatGPT has no memory per se. Everything you might want to include in your conversation with ChatGPT must fit within its context window. So how big is this context window? It depends on which model you're using. GPT-3.5 is 4k tokens (around 3k words or 6 pages of text), GPT-4 is currently 8k tokens (around 6k words or 13 pages of text) and OpenAI is working on a 32k token (around 24k words or 50 pages of text!!!) version. (I've started to include the token counts at the bottom of each of my posts so you start to get a feel for token lengths. A page of text is around 500 words single-spaced.)

So is that it? Are we simply relying on OpenAI to increase the token count so we can jam our documents into ChatGPT and reach document-chat nirvana? Nope! While many documents will comfortably fit within the current context windows (including the up-coming 8k token window), remember that everything that you want ChatGPT to know must fit within the window. That means both the document and any chat that you have. If you have a long conversation, large documents, or if you want to include multiple documents then larger context windows simply aren't going to cut it.

Let's take a concrete problem so we can walk through it: we want to take the dozen or so posts that I've written so far (as of mid-March 2023) and ask questions of them. For example, we remember something about how ChatGPT "remembers" conversations from the posts but we don't remember what it was. So we want to ask ChatGPT (in the context of the posts) "How does ChatGPT remember what I said in a conversation?".

领英推荐

SearchGPT: A New Competitor for Google?

Linguaserve 3 周前

natlagram: How We Translated Words to Diagrams With…

itemis 1 年前

Transformer Architectures for Dummies - Part 2…

Multicloud4U? Technologies 9 个月前

How might we do this with the tools we have today?

The easiest way would be to take the text from all of the posts and combine them into one long string of text and give it to ChatGPT as the first turn of the conversation. Unfortunately, all of the posts together combine to around 10k tokens total and it won't fit in either ChatGPT-3.5 or the current 8k token ChatGPT-4.
We could give ChatGPT each post one by one, ask it the question and if it doesn't know then we clear the chat and try with the next document. This approach might work well enough for the first question but if we want to do follow-ups, especially if those follow-ups refer to other posts, then we're going to be doing a lot of cutting and pasting. Not infeasible but if the document doesn't fit or if the conversation is too long then we're out of luck.
We could start to get fancy and first give each post to ChatGPT to summarize and then combine those summaries together for the context. This isn't a bad approach in general but depending on how important the information is within the post, the summary may not contain the desired information. In our case, when I pass the contents of "In-Context Learning" through ChatGPT to summarize, I get the following: "The post discusses Large Language Models (LLMs) and how they are trained to predict the next word in a sentence by learning from a large corpus of text. The post also explores how LLMs generalize and learn to remember new phrases. The post explains that while LLMs can be used in conversations, they do not have a working memory, and their responses are solely based on the conversation at hand. The post also discusses how LLMs are not entirely understood, and their inner workings cannot be controlled. The post concludes by discussing recent developments in in-context learning, which allows LLMs to rewire themselves at runtime to better answer questions." It doesn't look like this would be able to answer our question. We could spend the time tweaking the summary prompt to get it to provide better fidelity but it's always possible that the relevant content isn't present in the summary. (Of course if the post doesn't fit in the context window then we have to get really really fancy and divide the post up in the chunks, feed those into the ChatGPT to summarize and then summarize and/or combine the summaries.)
Assuming that the posts have been indexed by Google, we could enter our question into Google and limit the search results to only the posts. (Angry aside: it seems that Google's indexing of LinkedIn posts is spotty at best! I honestly did not know that until literally this moment. It's 2023 for pity sake! Grrr!) We could then take the results of that search and put them into ChatGPT and have it generate a response as a complete sentence. If we don't get any results from Google based on our exact question, we could ask ChatGPT to rephrase our question and submit that. This approach has pros and cons associated with it. A big warm fuzzy pro is that we're gotten really really good at search over the past two decades! Another pro is that in addition to the sentence(s) that you've searched for, search can give you text around the result as context. This context is great for your Q&A with ChatGPT. But this context can also be a con. Depending on where the context is cut-off, you might miss something important. Let's say that the text around the relevant sentences is ""But when I use ChatGPT it remembers what I said earlier in the conversation!" Everything that it knows and learns about your conversation is solely contained within that conversation." If you look back at the original post the first sentence refers to something that isn't true but you wouldn't know that by looking only at a narrow window. But if that sentence is included (without context) into your ChatGPT conversation then ChatGPT can easily get confused!

(Prompting aside: To get ChatGPT to answer my questions, I would use a prompt along the lines of "I want you to act as a question and answer chatbot. I will ask the questions and you will provide the answers. You may ask only clarifying questions if my question is not clear. You may only use the information presented in this chat. If the question refers to information that is not present in the chat then you will respond 'I don't know'." This is an attempt to force ChatGPT to only consider the post's text rather than what it might have been trained on. "Act as" prompts have been shown to be effective with GPT. You can find lots of examples to help you think of an effective prompt. It will take some trail and error but you can usually dial in something that works.)

We could come up with a few other techniques. Heck! We could even ask ChatGPT to help us come up with a few more techniques along with their pros and cons. (I highly encourage you to get used to using ChatGPT in this way. If you're not finding yourself reaching for ChatGPT a few times an hour then you're definitely going to fall behind the curve with the new way of life.) But the truth of the matter is that we've already covered about 90% of how Bing and ChatPDF work! Toss the last two techniques into a bowl, add a little seasoning and special sauce, give it a good stir to thoroughly incorporate and you've got yourself the casserole that is 2023!

As with all of these posts, my goal is to pull back the curtain a bit and show you what's going on behind the scenes so that it doesn't feel like magic. If you have any questions or comments please drop us a note!

(1,828 tokens)

Jeff Pickhardt

Bringing AI superpowers to your documents

1 年

Docalysis.com works too. It's newer.

1 次回应

查看更多评论

要查看或添加评论，请登录

Rob Grzywinski的更多文章

From Broad Strokes to Fine Lines: Crafting LLMs into Collaborative Thought Partners

2024年3月14日

From Broad Strokes to Fine Lines: Crafting LLMs into Collaborative Thought Partners

The Promise and Pitfall of LLMs In the rapidly evolving landscape of technology, Large Language Models (LLMs) are at…

14 条评论
Beyond Generic Responses: Crafting Custom Interactions with LLMs

2024年3月12日

Beyond Generic Responses: Crafting Custom Interactions with LLMs

Einstein on the Phone Imagine for a moment the opportunity to consult with the greatest minds in history for advice —…
Redefining Competitive Advantage: Generative AI and the Erosion of Traditional Business Moats

2023年5月5日

Redefining Competitive Advantage: Generative AI and the Erosion of Traditional Business Moats

In the whirlwind of change that is now our daily existence, large language models (LLMs) and generative AI have…

3 条评论
Striking the Right Balance: Privacy and Utility in Large Language Models

2023年4月6日

Striking the Right Balance: Privacy and Utility in Large Language Models

Large Language Models (LLMs) are rapidly transforming the business landscape, fueling applications from customer…

3 条评论
Uncovering the OpenAI Model Development Conundrum: Loopholes and Legalities

2023年3月30日

Uncovering the OpenAI Model Development Conundrum: Loopholes and Legalities

I mentioned in my "Decoding ChatGPT: Who Owns Your AI Conversations and How They're Used" post that OpenAI's "Terms of…
When AI Meets Psychology: How GPT-3.5 Handles False-Belief Tasks and Human Perspectives

2023年3月29日

When AI Meets Psychology: How GPT-3.5 Handles False-Belief Tasks and Human Perspectives

I'm going to ask you to participate in a quick psychological exercise that will motivate the topics of this post…
Too Fast and Too Furious: The Breakneck Advancements of ChatGPT and Its World-Changing Plugins

2023年3月24日

Too Fast and Too Furious: The Breakneck Advancements of ChatGPT and Its World-Changing Plugins

If you remember waaaay back to what seems like months ago, but was in fact only last week, OpenAI released GPT4 and…

2 条评论
Alpaca's Game-Changer: Democratizing AI, Unleashing Innovation, and Redefining the Tech Landscape

2023年3月20日

Alpaca's Game-Changer: Democratizing AI, Unleashing Innovation, and Redefining the Tech Landscape

The week surrounding PI day 2023 will be forever remembered in history as the pivotal moment when AI took flight…
Decoding ChatGPT: Who Owns Your AI Conversations and How They're Used

2023年3月16日

Decoding ChatGPT: Who Owns Your AI Conversations and How They're Used

I get questions from time to time from folks about whether or not information passed into ChatGPT (or any of the other…

1 条评论
Multimodal

2023年3月12日

Multimodal

We experience the world through our senses. We can see, hear, feel, touch, taste and smell.

3 条评论

See all articles

Chatting with Documents: The Future of Search in 2023

Rob Grzywinski

Working on some seriously cool stuff!

领英推荐

Rob Grzywinski的更多文章

社区洞察

其他会员也浏览了

Transformer Architectures for Dummies - Part 2 (Decoder Only Architectures)

Semantic Kernel: Unlocking the Mysteries of Machine Language Understanding

Thoughts on LaMDA 2

GPT-3 writes like a writer, programs like a programmer, and can be ... dangerous

Revolutionize Talent Acquisition - Stay ahead of the recruiting game with AI-driven tools.

OpenAI Unveils SearchGPT: A New AI-Powered Search Engine

AgoraGPT: an interview of MarxGPT

LongCite - Enabling LLMs to Generate Fine-grained Citations in Long-context QA

Find it fast: How semantic search means better CX

GPT-4 Is Here and It Is Powerful: Here Is All It Encompasses

领英推荐

Rob Grzywinski的更多文章

From Broad Strokes to Fine Lines: Crafting LLMs into Collaborative Thought Partners

Beyond Generic Responses: Crafting Custom Interactions with LLMs

Redefining Competitive Advantage: Generative AI and the Erosion of Traditional Business Moats

Striking the Right Balance: Privacy and Utility in Large Language Models

Uncovering the OpenAI Model Development Conundrum: Loopholes and Legalities

When AI Meets Psychology: How GPT-3.5 Handles False-Belief Tasks and Human Perspectives

Too Fast and Too Furious: The Breakneck Advancements of ChatGPT and Its World-Changing Plugins

Alpaca's Game-Changer: Democratizing AI, Unleashing Innovation, and Redefining the Tech Landscape

Decoding ChatGPT: Who Owns Your AI Conversations and How They're Used

Multimodal

社区洞察

其他会员也浏览了

Transformer Architectures for Dummies - Part 2 (Decoder Only Architectures)

Semantic Kernel: Unlocking the Mysteries of Machine Language Understanding

Thoughts on LaMDA 2

GPT-3 writes like a writer, programs like a programmer, and can be ... dangerous

Revolutionize Talent Acquisition - Stay ahead of the recruiting game with AI-driven tools.

OpenAI Unveils SearchGPT: A New AI-Powered Search Engine

AgoraGPT: an interview of MarxGPT

LongCite - Enabling LLMs to Generate Fine-grained Citations in Long-context QA

Find it fast: How semantic search means better CX

GPT-4 Is Here and It Is Powerful: Here Is All It Encompasses