Chatting with Documents: The Future of Search in 2023

The big thing that you're going to see in 2023 (other than the advancement of Large Language Models (LLMs) in general) is the ability to chat with documents. Bing's new ChatGPT-powered search is a great example of this. Microsoft has extended ChatGPT's "memory" with all of the documents that Bing has crawled across the internet. If you work with a lot of research PDFs like I do, ChatPDF is invaluable. ChatPDF allows you to upload a PDF and then ask questions of it via ChatGPT. The Bing extension available in Edge combines the features of both ChatGPT + Bing and ChatPDF. (Personal aside: I prefer ChatPDF over Edge + Bing + PDF in this case, as Bing may also try to search the internet when I want it to focus only on a single document. This interface will absolutely get better over time, but for now ChatPDF as a unitasker is perfect for me!)

(You can try out ChatPDF on this post if you'd like! Do a "Print ..." "to PDF" on this page and upload it into ChatPDF. Your mileage may vary simply because LinkedIn does not print well to PDF. ChatPDF works quite well with research PDFs.)

Example use of ChatPDF on this post

Let's dig into how Bing or ChatPDF might work. Remember that ChatGPT only knows what you tell it or what it has been trained on (which cuts off around September 2021). As I covered in my "In-Context Learning" post, ChatGPT has no memory per se. Everything you might want to include in your conversation with ChatGPT must fit within its context window. So how big is this context window? It depends on which model you're using. GPT-3.5's window is 4k tokens (around 3k words, or 6 pages of text), GPT-4's is currently 8k tokens (around 6k words, or 13 pages of text) and OpenAI is working on a 32k-token (around 24k words, or 50 pages of text!!!) version. (I've started to include the token counts at the bottom of each of my posts so you start to get a feel for token lengths. A page of text is around 500 words single-spaced.)
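To build intuition for these sizes, here's a back-of-the-envelope conversion using the post's own rules of thumb (roughly 0.75 words per token, about 500 words per single-spaced page). These ratios are estimates, not real tokenizer output, so the numbers will be close to but not exactly the figures above:

```python
# Rough rules of thumb from the post; an actual tokenizer will differ a bit.
WORDS_PER_TOKEN = 0.75
WORDS_PER_PAGE = 500

def context_window_size(tokens: int) -> tuple[int, int]:
    """Return (approx. words, approx. pages) that fit in a context window."""
    words = int(tokens * WORDS_PER_TOKEN)
    pages = round(words / WORDS_PER_PAGE)
    return words, pages

for model, tokens in [("GPT-3.5", 4_096), ("GPT-4", 8_192), ("GPT-4-32k", 32_768)]:
    words, pages = context_window_size(tokens)
    print(f"{model}: {tokens} tokens is roughly {words} words, or {pages} pages")
```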

So is that it? Are we simply relying on OpenAI to increase the token count so we can jam our documents into ChatGPT and reach document-chat nirvana? Nope! While many documents will comfortably fit within the current context windows (including the upcoming 8k-token window), remember that everything that you want ChatGPT to know must fit within the window. That means both the document and any chat that you have. If you have a long conversation, large documents, or if you want to include multiple documents, then larger context windows simply aren't going to cut it.
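The budget arithmetic here is simple but unforgiving. A minimal sketch (using the 8k window as an example; the token counts are made-up illustrations):

```python
# Everything ChatGPT is to "know" must share one context window:
# the document(s) plus every turn of the conversation so far.
def fits_in_window(doc_tokens: int, chat_tokens: int, window: int = 8_192) -> bool:
    """True if the document and the conversation together fit in the window."""
    return doc_tokens + chat_tokens <= window

# A ~10k-token document overflows an 8k window even with an empty chat...
print(fits_in_window(10_000, 0))     # False
# ...and a document that fits can still be crowded out by a long conversation.
print(fits_in_window(4_000, 5_000))  # False
print(fits_in_window(4_000, 2_000))  # True
```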

Let's take a concrete problem so we can walk through it: we want to take the dozen or so posts that I've written so far (as of mid-March 2023) and ask questions of them. For example, we remember something about how ChatGPT "remembers" conversations from the posts but we don't remember what it was. So we want to ask ChatGPT (in the context of the posts) "How does ChatGPT remember what I said in a conversation?".

How might we do this with the tools we have today?

  • The easiest way would be to take the text from all of the posts, combine them into one long string of text, and give it to ChatGPT as the first turn of the conversation. Unfortunately, all of the posts together come to around 10k tokens total, which won't fit in either GPT-3.5 or the current 8k-token GPT-4.
  • We could give ChatGPT each post one by one, ask it the question, and if it doesn't know then clear the chat and try with the next post. This approach might work well enough for the first question, but if we want to do follow-ups, especially if those follow-ups refer to other posts, then we're going to be doing a lot of cutting and pasting. Not infeasible, but if a document doesn't fit or if the conversation is too long then we're out of luck.
  • We could start to get fancy and first give each post to ChatGPT to summarize and then combine those summaries together for the context. This isn't a bad approach in general but, depending on how important the information is within the post, the summary may not contain the desired information. In our case, when I pass the contents of "In-Context Learning" through ChatGPT to summarize, I get the following: "The post discusses Large Language Models (LLMs) and how they are trained to predict the next word in a sentence by learning from a large corpus of text. The post also explores how LLMs generalize and learn to remember new phrases. The post explains that while LLMs can be used in conversations, they do not have a working memory, and their responses are solely based on the conversation at hand. The post also discusses how LLMs are not entirely understood, and their inner workings cannot be controlled. The post concludes by discussing recent developments in in-context learning, which allows LLMs to rewire themselves at runtime to better answer questions." It doesn't look like this would be able to answer our question. We could spend the time tweaking the summary prompt to get it to provide better fidelity, but it's always possible that the relevant content isn't present in the summary. (Of course, if the post doesn't fit in the context window then we have to get really really fancy: divide the post up into chunks, feed those into ChatGPT to summarize, and then summarize and/or combine those summaries.)
  • Assuming that the posts have been indexed by Google, we could enter our question into Google and limit the search results to only the posts. (Angry aside: it seems that Google's indexing of LinkedIn posts is spotty at best! I honestly did not know that until literally this moment. It's 2023 for pity's sake! Grrr!) We could then take the results of that search, put them into ChatGPT, and have it generate a response as a complete sentence. If we don't get any results from Google based on our exact question, we could ask ChatGPT to rephrase our question and submit that. This approach has pros and cons. A big warm fuzzy pro is that we've gotten really really good at search over the past two decades! Another pro is that in addition to the sentence(s) that you've searched for, search can give you the text around the result as context. This context is great for your Q&A with ChatGPT. But this context can also be a con. Depending on where the context is cut off, you might miss something important. Let's say that the text around the relevant sentences is "'But when I use ChatGPT it remembers what I said earlier in the conversation!' Everything that it knows and learns about your conversation is solely contained within that conversation." If you look back at the original post, the first sentence refers to something that isn't true, but you wouldn't know that by looking only at a narrow window. And if that sentence is included (without context) in your ChatGPT conversation then ChatGPT can easily get confused!
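That last search-then-ask flow can be sketched in a few lines. Everything here is illustrative: the crude keyword-overlap score stands in for a real search engine (or embedding index), and the post snippets are made up, but the shape — retrieve the most relevant chunks, then build a prompt containing only those — is the core idea:

```python
def score(chunk: str, question: str) -> int:
    """Count how many question words appear in the chunk (crude relevance)."""
    q_words = {w.lower().strip("?.,!") for w in question.split()}
    c_words = {w.lower().strip("?.,!") for w in chunk.split()}
    return len(q_words & c_words)

def build_prompt(chunks: list[str], question: str, top_k: int = 2) -> str:
    """Pick the top_k highest-scoring chunks and wrap them in a Q&A prompt."""
    best = sorted(chunks, key=lambda c: score(c, question), reverse=True)[:top_k]
    context = "\n---\n".join(best)
    return (
        "Answer using only the context below. "
        "If the answer is not in the context, say 'I don't know'.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )

# Stand-ins for the posts; only the best-matching ones reach the prompt.
posts = [
    "Everything ChatGPT knows about your conversation lives in its context window.",
    "LLMs are trained to predict the next word in a sentence.",
    "In-context learning lets a model adapt at runtime.",
]
print(build_prompt(posts, "How does ChatGPT remember a conversation?"))
```

The prompt that comes out is what you would hand to ChatGPT; because only the retrieved chunks are included, the whole thing stays comfortably inside the context window no matter how many posts you have.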

(Prompting aside: To get ChatGPT to answer my questions, I would use a prompt along the lines of "I want you to act as a question and answer chatbot. I will ask the questions and you will provide the answers. You may ask only clarifying questions if my question is not clear. You may only use the information presented in this chat. If the question refers to information that is not present in the chat then you will respond 'I don't know'." This is an attempt to force ChatGPT to only consider the post's text rather than what it might have been trained on. "Act as" prompts have been shown to be effective with GPT. You can find lots of examples to help you think of an effective prompt. It will take some trial and error but you can usually dial in something that works.)
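For the curious, here is how that "act as" prompt could be wired up as a system message in the OpenAI chat-completions message format. The message structure is the real 2023-era API shape; the actual model call is omitted, and the document text passed in is just a placeholder:

```python
SYSTEM_PROMPT = (
    "I want you to act as a question and answer chatbot. I will ask the "
    "questions and you will provide the answers. You may ask only clarifying "
    "questions if my question is not clear. You may only use the information "
    "presented in this chat. If the question refers to information that is "
    "not present in the chat then you will respond 'I don't know'."
)

def make_messages(document: str, question: str) -> list[dict]:
    """Assemble a chat message list: instructions, then document, then question."""
    return [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": f"Here is the document:\n\n{document}"},
        {"role": "user", "content": question},
    ]

messages = make_messages(
    "ChatGPT has no memory per se...",
    "How does ChatGPT remember what I said in a conversation?",
)
```

This `messages` list is what you would pass to the chat API; pinning the instructions in the system role makes them harder for the rest of the conversation to override.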

We could come up with a few other techniques. Heck! We could even ask ChatGPT to help us come up with a few more techniques along with their pros and cons. (I highly encourage you to get used to using ChatGPT in this way. If you're not finding yourself reaching for ChatGPT a few times an hour then you're definitely going to fall behind the curve with the new way of life.) But the truth of the matter is that we've already covered about 90% of how Bing and ChatPDF work! Toss the last two techniques into a bowl, add a little seasoning and special sauce, give it a good stir to thoroughly incorporate and you've got yourself the casserole that is 2023!

As with all of these posts, my goal is to pull back the curtain a bit and show you what's going on behind the scenes so that it doesn't feel like magic. If you have any questions or comments please drop me a note!

(1,828 tokens)

