Everybody Likes RAG
Unless you've been living under a rock, you are sure to have heard of RAG. It's a buzzword in the LLM (large language model) landscape and hard to miss. While some regard it as fancier prompt engineering, others see it as a must-have capability for scaling enterprise applications and products.
The reasons for its popularity are plentiful, and it seems that everyone is eager to explore and utilise RAG in their daily tasks and ventures.
RAG, or retrieval-augmented generation, stands out for its unique blend of benefits and cost-effectiveness. Its advantages include dynamic control over the knowledge base, access to current and reliable information, transparent source verification, mitigation of information leakage, domain-specific expertise, and low maintenance costs.
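Under the hood, the core loop is straightforward: embed the user's query, retrieve the most relevant documents from a knowledge store, and hand them to the model as context. Below is a minimal sketch in Python, assuming the openai v1 SDK and an API key in the environment; the toy corpus, the model names and the brute-force cosine-similarity retriever are illustrative stand-ins, not any particular vendor's pipeline.

```python
# Minimal RAG sketch: retrieve relevant context, then generate.
# Assumes the openai v1 Python SDK and OPENAI_API_KEY in the environment;
# the corpus, models and retriever below are illustrative placeholders.
import numpy as np
from openai import OpenAI

client = OpenAI()

corpus = [
    "Our refund policy allows returns within 30 days of purchase.",
    "Support is available Monday to Friday, 9am to 6pm IST.",
    "Enterprise plans include a dedicated account manager.",
]

def embed(texts):
    resp = client.embeddings.create(model="text-embedding-3-small", input=texts)
    return np.array([d.embedding for d in resp.data])

doc_vecs = embed(corpus)  # embed the knowledge base once, up front

def answer(question, k=2):
    q_vec = embed([question])[0]
    # Cosine similarity between the query and every document.
    sims = doc_vecs @ q_vec / (
        np.linalg.norm(doc_vecs, axis=1) * np.linalg.norm(q_vec)
    )
    context = "\n".join(corpus[i] for i in np.argsort(sims)[-k:])
    resp = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[
            {"role": "system", "content": f"Answer using only this context:\n{context}"},
            {"role": "user", "content": question},
        ],
    )
    return resp.choices[0].message.content

print(answer("How long do I have to return a product?"))
```

In production the in-memory cosine search would typically be swapped for a vector database, but the shape of the pipeline stays the same: the model only ever sees the retrieved snippets, which is what gives RAG its controllability and traceability.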
With the launch of GPT-4 Turbo and the Retrieval API, OpenAI has tried to fix the hallucination problem, but seems to have overlooked data privacy. Using little more than clever prompt engineering, a user on X showed how he was able to download the original knowledge files uploaded to someone else's GPT, an app built with the recently released GPT Builder that relies on RAG under the hood. That is a serious security issue for the model.
Google Bard has faced a similar prompt injection problem: a hacker was able to exfiltrate data that other users had shared with the chatbot, including Google Docs, Drive files and YouTube history. Even Google's Bard is not foolproof.
Several users recently took to Reddit to discuss whether LangChain's RAG offering was better than using OpenAI's models directly, and many favoured the former. Currently, GPT Builder caps a single GPT at 20 knowledge files, which makes it less desirable for serious developers and enterprise use cases.
That is why it is important to understand the limitations of these tools and APIs before building your LLM applications – in other words, to know when to use RAG, and when not to.
Making LLMs Reason Better
Microsoft recently released an approach to make LLMs reason better, called 'Everything of Thoughts' (XoT). This new methodology draws inspiration from Google DeepMind's AlphaZero, which showed that compact neural networks, when paired with search, can outperform much larger ones.
Developed in collaboration with the Georgia Institute of Technology and East China Normal University, the new framework uses a blend of reinforcement learning and Monte Carlo Tree Search (MCTS) – techniques renowned for their effectiveness in complex decision-making. The outcome? Read on to find out.
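To make the idea concrete, here is a bare-bones sketch of the MCTS loop that such AlphaZero-style approaches build on, applied to a toy counting game. This is a generic illustration of the technique, not Microsoft's XoT code; the game, the Node class and the random rollout policy are all placeholders of our own.

```python
# Bare-bones Monte Carlo Tree Search (UCT) on a toy counting game:
# reach exactly TARGET by adding 1 or 2 per move. A generic sketch of
# the technique XoT builds on, not Microsoft's code.
import math
import random

TARGET = 10

def actions(state):
    # Legal moves: add 1 or 2 without overshooting the target.
    return [a for a in (1, 2) if state + a <= TARGET]

class Node:
    def __init__(self, state, parent=None):
        self.state, self.parent = state, parent
        self.children, self.visits, self.value = {}, 0, 0.0

def select(node):
    # Descend via the UCT rule until a node with unexpanded actions.
    while actions(node.state) and len(node.children) == len(actions(node.state)):
        node = max(node.children.values(), key=lambda c: c.value / c.visits
                   + math.sqrt(2 * math.log(node.visits) / c.visits))
    return node

def expand(node):
    untried = [a for a in actions(node.state) if a not in node.children]
    if not untried:
        return node  # terminal state, nothing to expand
    a = random.choice(untried)
    child = Node(node.state + a, parent=node)
    node.children[a] = child
    return child

def rollout(state):
    # Random playout; reward 1.0 for landing exactly on TARGET.
    while actions(state):
        state += random.choice(actions(state))
    return 1.0 if state == TARGET else 0.0

def backpropagate(node, reward):
    while node:
        node.visits += 1
        node.value += reward
        node = node.parent

root = Node(0)
for _ in range(1000):  # run the simulation budget
    leaf = expand(select(root))
    backpropagate(leaf, rollout(leaf.state))

best_move = max(root.children.items(), key=lambda kv: kv[1].visits)[0]
print("best first move:", best_move)
```

The reinforcement learning ingredient enters where the random rollout sits: AlphaZero-style systems, and by extension XoT, replace it with small learned policy and value networks, which is precisely where the 'tiny neural nets' come in.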
In Oracle Governments Trust
In August, the Ministry of Education chose Oracle Cloud Infrastructure (OCI) to overhaul India's national education technology platform, DIKSHA. This decision mirrors other major OCI adoptions, such as the Credit Guarantee Fund Trust for Micro and Small Enterprises (CGTMSE) in 2021, which has since aided around 12 million borrowers, of whom 21% are women-led enterprises and 92% are first-time borrowers.
In the same year, the Bangladesh government also selected Oracle Cloud to host all its cloud services. The US government and military have been among OCI's earliest customers. One wonders why governments across the world trust Oracle so much. Chris Chelliah, senior vice president of technology and customer strategy at Oracle JAPAC, unveils the company's DNA and core principles here.
Generative AI Shakes Up Hyperscalers
In a bid to adapt to the demands of generative AI, all major hyperscalers, including Microsoft, AWS and Google Cloud, are busy making changes to their existing infrastructure. At OpenAI DevDay, Microsoft chief Satya Nadella recalled how OpenAI made them change their infrastructure completely, from power and the data centre (DC) to the rack, the accelerators and the network.
Earlier this year, Google Cloud also announced that it is working to integrate AI infrastructure more extensively into its overall fleet. Similarly, AWS stated that it plans to deploy multiple AI-optimised server clusters over the next 12 months. But why? Oracle seems to have answers to this question.