GenAI gets really useful P1: Analyze many documents at once
Most of the news about large language models (LLMs) has focused on the number of parameters in the model, which roughly corresponds to how nuanced its understanding of language is. For my money, though, equally important is the growth in the size of the context window, the amount of text you can feed to the model and have it hold in memory at once. Big context windows open up the ability to have the models do analyses across documents, comparing and contrasting them. My friends in the market research industry are figuring out how to automate analysis of a series of in-depth interviews, but the approach is equally applicable to reviewing scientific papers, legal opinions, and much more.
How it works
Getting computers to deal with the meaning of human language starts with translating words and sentences into a series of tokens. In phonetic languages a token is typically a fragment of a word, more than a single letter but often less than the whole word. A typical English word is about 1.3 tokens, and a single-spaced page of type is about 666 tokens. Note that these are rough averages: a page in a pictographic language takes more tokens than one in a phonetic language, and technical writing is denser than non-technical.
GPT-4 and its competitors can now hold from 128,000 to 2 million tokens in memory at one time. 128,000 tokens is roughly 190 single-spaced pages of text, and the King James Bible is a bit more than 1 million tokens. This means you can feed GPT-4 multiple 10-Qs and ask it to compare them, or do the same with all of the essays from a seminar class, or 20 one-hour interviews, and on and on. There is talk that models will soon have functionally infinite context windows, meaning analysis across whole books and even libraries will be possible, though likely too slow and expensive for all but power users.
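If you want to check the arithmetic yourself, here is a minimal sketch using tiktoken, OpenAI's open-source tokenizer library, to count the tokens in a batch of documents and see whether they fit in a window. The file names and the 128,000-token limit are assumptions for the example, not anyone's specific setup.

```python
# Minimal sketch: count tokens with OpenAI's tiktoken library and check
# whether a batch of documents fits in an assumed 128,000-token window.
import tiktoken

CONTEXT_WINDOW = 128_000  # assumed limit for this example

enc = tiktoken.encoding_for_model("gpt-4")  # tokenizer matching GPT-4

def count_tokens(text: str) -> int:
    """Return the number of tokens GPT-4 would see in this text."""
    return len(enc.encode(text))

# Hypothetical transcript files, purely for illustration.
paths = ["interview_01.txt", "interview_02.txt"]
docs = [open(p, encoding="utf-8").read() for p in paths]
total = sum(count_tokens(d) for d in docs)

print(f"Total tokens: {total:,} "
      f"({'fits in' if total <= CONTEXT_WINDOW else 'exceeds'} a {CONTEXT_WINDOW:,}-token window)")
```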
Bringing documents into the context window is different from RAG, which I described here. RAG is great for getting the model to search over a big corpus of data, or even a large chunk of the internet, to find the few pieces it needs to answer a question. When you put whole documents into the context window, the model has access to the full text and can extract themes and compare and contrast across documents; in short, it can do the work of a human analyst.
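To make the distinction concrete, here is a hedged sketch of the full-context approach using the OpenAI Python SDK: every document goes into the prompt in its entirety, and the model is asked to compare across them. The model name, file names, and prompt wording are illustrative assumptions, not a recommendation.

```python
# Sketch of the full-context approach: paste whole documents into one prompt
# and ask for cross-document analysis (as opposed to RAG, which retrieves
# only the most relevant chunks from a larger corpus).
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Hypothetical documents, purely for illustration.
documents = {
    "10-Q Company A": open("company_a_10q.txt", encoding="utf-8").read(),
    "10-Q Company B": open("company_b_10q.txt", encoding="utf-8").read(),
}

prompt = "Compare and contrast the documents below. Identify common themes and key differences.\n\n"
for name, text in documents.items():
    prompt += f"=== {name} ===\n{text}\n\n"

response = client.chat.completions.create(
    model="gpt-4o",  # any large-context model would do
    messages=[{"role": "user", "content": prompt}],
)
print(response.choices[0].message.content)
```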
GPT-4 in action: Analyze a series of interviews
GenAI is disrupting the market research industry in many ways, but one I often face is how to efficiently analyze a series of in-depth interviews. This week I got to chat with Scott Swigart, SVP, Technology Group & AI Innovation at Shapiro+Raj, a market research firm that I’ve engaged to interview IT Pros and other esoteric audiences. Scott has been leading their push to use GenAI and shared with me what he’s learned.
Model prompts are critical IP
As professional service firms learn to incorporate GenAI into their processes, their prompts (the instructions they give the models) are becoming critical IP. Like others, S+R has built a web portal to give their analysts access to secure versions of advanced models and to store their recommended prompts. They are experimenting with and refining the prompts to be more efficient and effective. Prompts are quickly becoming a big part of consultancies’ secret sauce.
AI transcription
Depending on the client, Shapiro+Raj conducts interviews on either Teams or Zoom and gets the AI-generated transcript. For corporate users of Teams, the transcription algorithm is somewhat attuned to their acronyms and jargon, raising the quality of the transcript and avoiding the need for manual cleaning. The more specialized the language, the harder it is for standard transcription models to capture it accurately. It’s likely that analysts in jargon-heavy fields such as healthcare will have their video calling platform simply record an interview and do the transcription offline with a transcription model specially trained for that field.
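As a sketch of what offline transcription could look like, the snippet below uses OpenAI's open-source Whisper model (installed as the openai-whisper package). A jargon-heavy field would swap in a model fine-tuned on that domain's vocabulary; the file name here is just a placeholder.

```python
# Sketch: transcribe a recorded interview offline with OpenAI's open-source
# Whisper model. A specialized field would substitute a model fine-tuned on
# its own vocabulary; "interview.mp3" is a placeholder file name.
import whisper  # pip install openai-whisper

model = whisper.load_model("medium.en")   # larger models are more accurate but slower
result = model.transcribe("interview.mp3")
print(result["text"])
```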
The Shrinker
For a project with 20 one-hour interviews the raw transcripts will exceed the largest available context window. So Scott has written a compression prompt they call The Shrinker. An analyst uploads a raw transcript to the portal and uses The Shrinker to tell the model to take out non-words, delete the timestamps, and reformat the interview into a succinct Q&A while preserving the respondent’s actual words. With this, a 20+ page raw transcript shrinks to only a few pages. The Shrinker also labels each transcript with a coded identifier for the respondent and removes personally identifying information. Like other firms, Shapiro+Raj has corporate agreements with makers of the major LLMs to keep their client data confidential, but they still don’t feed any personally identifying information about respondents or clients into the models.
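Shapiro+Raj's actual prompt is their IP, but a generic compression prompt in the same spirit might look something like the sketch below. The wording, the model name, and the helper function are my own illustration, not The Shrinker itself.

```python
# Illustrative compression prompt in the spirit of The Shrinker; the actual
# prompt is Shapiro+Raj's IP and is not reproduced here.
from openai import OpenAI

SHRINKER_STYLE_PROMPT = """You will be given a raw interview transcript.
Rewrite it as a succinct Q&A, following these rules:
1. Remove filler words, false starts, and timestamps.
2. Preserve the respondent's actual words in the answers; do not paraphrase.
3. Refer to the respondent only by the code {respondent_id}, never by name.
4. Remove any personally identifying information (names, employers, emails).
Return only the cleaned Q&A transcript."""

def shrink(raw_transcript: str, respondent_id: str) -> str:
    """Send the raw transcript plus the compression instructions to the model."""
    client = OpenAI()
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "system", "content": SHRINKER_STYLE_PROMPT.format(respondent_id=respondent_id)},
            {"role": "user", "content": raw_transcript},
        ],
    )
    return response.choices[0].message.content
```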
Theoretically the analyst has the choice of translating all transcripts to a single language or having the model do analyses in multiple languages, though I haven’t seen that in action yet.
Along with the transcripts, the analyst can upload supporting documents such as the project objectives and the interview guide. Shapiro+Raj also has specialized prompts that can convert charts (or a whole report) to text that the LLM can understand and process. However, as things stand today, GenAI models don’t do math well, so don’t ask for calculations. Math is one of the new functions coming with GenAI Agents, which will be the subject of my next post.
GenAI powered analysis
When the transcripts and other documents are ready for upload, the analyst goes to the prompt library for pre-built prompts that tell the model the context (e.g.: You are analyzing transcripts from phone interviews with IT Pros. All of your answers should be supported by quotes, with each quote attributed to a specific respondent.) and the questions for it to answer. Some typical ones include:
· Pull out the most significant themes across all of the interviews. What are the benefits sought and barriers to success for each of them?
· How are answers from <subgroup 1> vs <subgroup 2> different? How are they the same? Extract quotes to illustrate both.
· Extract the critical jobs to be done for X process.
When the analyst hits go, the web portal uploads the transcripts and the question to the model, and in a minute or so it returns an answer. Scott has even included instructions for the model to suggest follow-up questions.
Since the model doesn’t have memory, each follow-up question requires the model to reread all of the documents, but this takes only moments and costs a few dollars.
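Under the hood, a portal like this can be approximated with a plain chat completion call: the system message carries the context and attribution rules, the shrunken transcripts ride along in the user message, and because the API is stateless, every follow-up resends the whole conversation. The sketch below is an assumption about how such a workflow might be wired up, not Shapiro+Raj's implementation; the file names and prompts are placeholders.

```python
# Sketch of the analysis loop: the system message sets the context and the
# attribution rules, the transcripts travel in the user message, and each
# follow-up resends the entire (growing) message list because the API
# keeps no memory between calls.
from openai import OpenAI

client = OpenAI()

SYSTEM = ("You are analyzing transcripts from phone interviews with IT Pros. "
          "Support every answer with quotes, each attributed to a specific respondent. "
          "After each answer, suggest two follow-up questions.")

# Placeholder shrunken-transcript files.
transcripts = "\n\n".join(open(p, encoding="utf-8").read()
                          for p in ["R01_shrunk.txt", "R02_shrunk.txt"])

messages = [
    {"role": "system", "content": SYSTEM},
    {"role": "user", "content": transcripts + "\n\nQuestion: Pull out the most significant "
     "themes across all of the interviews, with benefits sought and barriers for each."},
]

answer = client.chat.completions.create(model="gpt-4o", messages=messages)
print(answer.choices[0].message.content)

# A follow-up question: append the prior answer and the new question, then
# resend everything, transcripts included, since the model is stateless.
messages.append({"role": "assistant", "content": answer.choices[0].message.content})
messages.append({"role": "user", "content": "How do answers from <subgroup 1> and <subgroup 2> differ?"})
follow_up = client.chat.completions.create(model="gpt-4o", messages=messages)
print(follow_up.choices[0].message.content)
```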
Scott reports that the benefit of GenAI analysis has been more about better than about cheaper or faster. He is observing analysts using the system to have a rich back-and-forth with the model, going deeper into the data than would be feasible with regular transcripts and a deadline looming. They are finding the model to be a more than competent research assistant when a human asks it the right questions in the right ways.
Next newsletter: AI gets really useful P2: Here come agents