On RAGs and Riches

Back in 2018, when I gave a talk at ThoughtWorks on NLP, there was euphoria about the state of chatbots. There was even hype, with every college graduate I met saying they worked on chatbots, much as they did with Java in the late 1990s. In that talk, I recommended NOT going into chatbot development, as the technology had not matured; even with BERT et al., there were basic errors. Those errors were compounded in India's highly vernacular provinces, where people, when allowed to type, mixed Hindi with English, making it hard for the bots to process the text. I recommended menu-driven selections (as in business rules) rather than AI-driven chatbots.

Now, in 2024, we are looking at LLMs that can fluently generate text. Their ability to chat has increased significantly with Transformer decoders like GPT. Although they suffer from issues like hallucinations, they provide a mostly satisfying experience as a chatbot, a companion to discuss things with, and a way of democratizing general knowledge. C'mon, you've got to give it to them; you can just launch Google Gemini and chat about any topic of your choice!

But testers are a tough breed (as Jason Arbon puts it). We do a 'gradient descent' towards truth and accuracy (beware, it could end up looking like fault-finding!), and in that process, we find that many people use GenAI tools like GPT for purposes they are not meant for. My friend Adam Shostack, a pioneer in threat modeling, for fun asked ChatGPT to count the number of 'b's in the word 'blackberry', and look what he got!
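
The irony is that such a question is a deterministic string operation, not a language task. A couple of lines of ordinary Python answer it correctly every single time, at negligible cost:

    # Counting characters is a plain string operation, not a job for a
    # probabilistic text generator.
    word = "blackberry"
    print(word.count("b"))  # prints 2, every time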

Here comes RAG (Retrieval-Augmented Generation) as a solution to the inaccuracies and hallucinations, as suggested by Patrick Debois, a pioneer in DevOps, in a recent podcast with me for my Software Testing and Quality Talks YouTube channel (to be released soon). RAG augments the chatbot's knowledge and improves the accuracy of its responses, he says. It makes sense to me, as I am of the opinion that GPTs' usage should be restricted to text and image generation, and probably text summarization and machine translation, and that they should not be used for other problems. Even if you do use them for other problems, you need RAG and other data augmentation techniques.
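
For readers who have not worked with RAG, the core idea is simple: fetch relevant passages from a trusted corpus and put them into the prompt, so the model answers from that context rather than from its parametric memory alone. Here is a minimal sketch, assuming a toy in-memory corpus and naive word-overlap scoring; a real pipeline would use embeddings, a vector store, and an actual LLM call (not shown here):

    # Minimal RAG sketch: rank passages by word overlap with the query, then
    # place the best matches into the prompt as grounding context. The corpus
    # is a toy; a real pipeline would use embeddings and a vector store.
    import string

    CORPUS = [
        "Our refund policy allows returns within 30 days of purchase.",
        "Support is available Monday to Friday, 9 am to 6 pm IST.",
        "Premium plans include priority support and a dedicated manager.",
    ]

    def words(text):
        # Lowercase and strip punctuation so 'refund?' matches 'refund'.
        return set(text.lower().translate(str.maketrans("", "", string.punctuation)).split())

    def retrieve(query, k=2):
        ranked = sorted(CORPUS, key=lambda p: len(words(query) & words(p)), reverse=True)
        return ranked[:k]

    def build_prompt(query):
        context = "\n".join(retrieve(query))
        return ("Answer ONLY from the context below.\n\n"
                f"Context:\n{context}\n\nQuestion: {query}")

    print(build_prompt("What is the refund policy?"))

The retrieval step is the 'augmentation'; the model is then instructed to stay within the supplied context, which is precisely what curbs hallucination.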

The issue with RAG and data augmentation techniques is that they are performance-intensive. Many times, an LLM needs to make a call on whether to use its 'intrinsic' knowledge or to retrieve, and that decision consumes compute cycles and energy. In a world that is keen on sustainability and hard on energy-intensive computation, that's a no-no.
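
That decision is often implemented as a retrieval 'gate' in front of the pipeline: answer from the model alone when the query looks generic, and pay for retrieval only when it looks domain-specific. The keyword heuristic below is purely illustrative (a made-up term list, not a production design); real systems might use a classifier or the model's own confidence signal, which is exactly where the extra cycles go:

    # A crude retrieval gate: spend the retrieval round trip only when the
    # query appears to need domain-specific facts. DOMAIN_TERMS and the
    # heuristic are illustrative assumptions.
    DOMAIN_TERMS = {"refund", "invoice", "warranty", "pricing", "sla"}

    def needs_retrieval(query):
        return bool(DOMAIN_TERMS & set(query.lower().split()))

    for q in ("What is the capital of France?", "What is the refund policy?"):
        route = "RAG pipeline" if needs_retrieval(q) else "model-only answer"
        print(f"{q} -> {route}")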

Even if LLMs do not use RAG, they might need to call other applications that do the computation for them, in which case the performance cost and the security of those software dependencies need to be looked at.
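
One pragmatic control for that dependency risk is to route every external call the model wants to make through an explicit allowlist, so only reviewed functions can ever run. A minimal sketch, with a hypothetical 'add' tool standing in for whatever the model is allowed to invoke (argument validation and sandboxing are left out):

    # Dispatch tool calls through an allowlist so the model can only trigger
    # functions that have been reviewed. The 'add' tool is a stand-in; a real
    # system would also validate arguments and sandbox execution.
    def add(a, b):
        return a + b

    ALLOWED_TOOLS = {"add": add}

    def dispatch(tool_name, *args):
        if tool_name not in ALLOWED_TOOLS:
            raise PermissionError(f"Tool '{tool_name}' is not on the allowlist")
        return ALLOWED_TOOLS[tool_name](*args)

    print(dispatch("add", 2, 3))   # 5
    # dispatch("delete_files")     # would raise PermissionError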

That leaves us with the question of whether the LLMs are worth the trouble. That is a very complex question, and it puts the matter back in a consultant's purview, and they would say (like me), 'That depends!'

Choose your armor, your battles, and your wars. As they say,

“If the only tool you have is a hammer, you tend to see every problem as a nail.”

If you would like to explore a comprehensive quality strategy for your LLM implementation, please feel free to get in touch with me.

