Runs on Intel: Enhancing LLM Performance with RAG and ReAct
Did you know? New techniques can give AI language models the power to understand and summarize information that wasn’t originally trained into the model.
State-of-the-art large language models are no longer frozen in time, doomed to forever repeat outdated information. Two new technologies, Retrieval Augmented Generation (RAG) and ReAct Agents, augment language models with the ability to retrieve personalized and specific information from your own documents. The models and documents both exist entirely on your computer, so the analyses or summaries you generate are both personal and private.
Sound interesting? Intel already has these technologies up and running! Let's take a look at how the new capabilities work, using Microsoft's Phi-3 model as an example, then see it in action on an Intel Core Ultra 200V Series processor (codename: Lunar Lake).
Understanding Phi-3, RAG, and ReAct
Phi-3 is a family of small language models developed by Microsoft. The family comes in three sizes, up to 14 billion parameters. Even the smallest, Phi-3-mini, with just 3.8 billion parameters, is astonishingly good at understanding and reasoning in tasks like coding, mathematics, and logic. Phi-3-mini's size, accuracy, and advanced RAG/ReAct capabilities make it a great fit for offline use.
Retrieval Augmented Generation (RAG) allows an application using Phi-3 to access textual information in your local documents. When those documents contain information that is more current or relevant than what was originally trained into the model, the application retrieves the matching passages and the model uses them to augment the reply it generates.
As a living example, a model like Phi-3 could not possibly be trained to know what's inside a PDF you saved last Tuesday. But with a technology like RAG, Phi-3 could easily provide accurate replies about that content. If you expand this idea to the massive volume of information that needs to be accessed and summarized in medicine, law, business, government, or other information professions, you can see how this capability could be mighty powerful.
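To make the retrieve-then-augment idea concrete, here is a minimal sketch in Python. It is not how AI Playground or Phi-3 implements RAG: the function names (`retrieve`, `build_prompt`) and the sample documents are hypothetical, and simple word-overlap scoring stands in for the embedding search a real system would typically use. The sketch stops at building the augmented prompt, which an application would then pass to a local model such as Phi-3-mini.

```python
import math
import re
from collections import Counter

def tokenize(text):
    # Lowercase the text and split it into alphanumeric words.
    return re.findall(r"[a-z0-9]+", text.lower())

def cosine(a, b):
    # Cosine similarity between two word-count vectors.
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def retrieve(question, documents, top_k=1):
    # Retrieval step: rank local documents by similarity to the question.
    # Assumption: word overlap stands in for a real embedding search.
    q = Counter(tokenize(question))
    ranked = sorted(documents, key=lambda d: cosine(q, Counter(tokenize(d))), reverse=True)
    return ranked[:top_k]

def build_prompt(question, documents, top_k=1):
    # Augmentation step: place the retrieved text in front of the question.
    context = "\n".join(retrieve(question, documents, top_k))
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}\nAnswer:"

# Hypothetical local documents, for illustration only.
DOCS = [
    "Quarterly report saved last Tuesday: revenue grew 12 percent.",
    "Recipe notes: the sourdough starter needs feeding every day.",
    "Travel itinerary: the flight to Portland departs at 9 am.",
]

prompt = build_prompt("How much did revenue grow in the quarterly report?", DOCS)
print(prompt)
```

The model never needs to have seen the report during training. The relevant passage simply arrives as part of the prompt.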
ReAct Agents go one step further by turning the local document retrieval of RAG into one of many sources of info the model can draw on. Whereas RAG only accesses documents, a ReAct Agent in the model can be configured to access emails, dictionaries, encyclopedias, documents, images, and more. Just like RAG, a ReAct Agent is designed to identify and incorporate external sources for much more precise answers.
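A ReAct Agent alternates between reasoning and acting: the model writes a thought, picks a tool, reads the result, and repeats until it can answer. The Python sketch below shows that loop under loud assumptions. The tools, the `Action: name[input]` text format, and `scripted_model` (a hard-coded stand-in for a real model such as Phi-3-mini) are all hypothetical and exist only to make the example runnable offline.

```python
import re

def search_documents(query):
    # Hypothetical tool: look up a keyword in the user's local notes.
    notes = {"meeting": "The project meeting moved to Thursday at 2 pm."}
    for key, text in notes.items():
        if key in query.lower():
            return text
    return "No matching document found."

def lookup_dictionary(word):
    # Hypothetical tool: a tiny offline dictionary.
    entries = {"agent": "A program that acts on behalf of a user."}
    return entries.get(word.lower(), "No definition found.")

TOOLS = {"search_documents": search_documents, "lookup_dictionary": lookup_dictionary}
ACTION = re.compile(r"Action: (\w+)\[(.*?)\]")

def scripted_model(transcript):
    # Stand-in for a real language model: it requests one document search,
    # then answers with whatever the search returned.
    if "Observation:" not in transcript:
        return "Thought: I should check the user's documents.\nAction: search_documents[meeting]"
    observation = transcript.rsplit("Observation: ", 1)[1].strip()
    return f"Thought: I have what I need.\nFinal Answer: {observation}"

def react(question, model, max_steps=5):
    # The reason/act/observe loop: run until the model gives a final answer.
    transcript = f"Question: {question}\n"
    for _ in range(max_steps):
        reply = model(transcript)
        transcript += reply + "\n"
        if "Final Answer:" in reply:
            return reply.split("Final Answer:", 1)[1].strip()
        match = ACTION.search(reply)
        if match is None:
            return "Stopped: the model did not choose an action."
        tool_name, tool_input = match.groups()
        tool = TOOLS.get(tool_name)
        observation = tool(tool_input) if tool else f"Unknown tool: {tool_name}"
        transcript += f"Observation: {observation}\n"
    return "Stopped: step limit reached."

answer = react("When is the project meeting?", scripted_model)
print(answer)
```

Adding a new source of information means adding one more entry to `TOOLS`, which is what lets an agent reach beyond documents to email, dictionaries, and other sources.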
All of this raises the question: why do language models need this at all? Hallucinations. Language models can give absurd replies when queried on a topic they do not know, and such a reply is called a "hallucination." In practice, it doesn't feel all that different from a human trying to fake their way through an answer on a topic they don't understand. By augmenting the pre-trained knowledge of the model with access to newer and personalized information, the odds of these hallucinated answers drop tremendously.
As an added benefit, technologies like RAG and ReAct allow models to simultaneously get smaller and more accurate. Rather than training enormous models on gobs of information a majority of users may never touch, model development can instead focus on excellence in core competencies. As an example, Microsoft's previous 2.7-billion-parameter Phi-2 model often outperformed older language models 25x its size in benchmarks. Developments like these reduce the performance requirements of AI accelerators, enabling smaller GPUs and NPUs to handle the same work.
RAG and ReAct in Action
Intel is at the forefront of testing and enabling new AI models on PC hardware. To date, over 500 AI models covering over a dozen disciplines have been tested to run with optimized performance on Intel Core Ultra processors. Phi-3-mini is one of those 500+ models, and it's high time we see it in action!
If you have a new laptop featuring an Intel Core Ultra 200V Series processor, you can also try RAG and ReAct yourself with Intel's AI Playground tool. AI Playground gives you access to language models like Phi-3-mini, high-res image generation, image upscaling, and more. No internet connection is required, it's easy to use, and the tool is completely free!
Laptop vendors like Acer are also taking advantage of Intel R&D by leveraging Phi-3 with RAG in the new AcerSense software, which comes pre-loaded on new notebooks featuring Intel Core Ultra 200V Series processors. Intel's work to enable, optimize, and validate these AI models is very similar to how graphics cards need optimized drivers for the best experience.
As Intel engineering continues to lead the industry in validating and optimizing AI models for offline use, we are clear-eyed about, and motivated by, a future where AI-based features are widely available in almost every application. Before long, we foresee that system performance and power consumption will be deeply entangled with AI feature sets. That makes early and enthusiastic enabling work on technologies like RAG and ReAct convenient for now and vital as a foundation.
About the Authors
Robert Hallock is the General Manager and Vice President of Client AI & Technical Marketing at Intel. Prior to joining Intel, Robert spent 12 years in Client and Graphics at AMD, most recently as the Director of Product and Technology Marketing for consumer Ryzen processors. He has also been a PC hardware reviewer, journalist, and technical writer. He moonlights as a designer of high-performance aftermarket automotive components and is a lifelong PC enthusiast.
Erin Maiorino is the Director of Competitive AI Marketing at Intel. Prior to joining Intel, Erin was the Senior Product Marketing Manager for AMD Ryzen and Threadripper processors and served as the Director of Content Marketing at Lattice Semiconductor. Before working in PC hardware, Erin was in the gaming industry working on titles like Halo 4 and SWTOR. She is a self-proclaimed dog nerd and loves hiking, nosework and agility with her dog Whiskey.