The System Testing Of AI
Photo by BoliviaInteligente on Unsplash

When we test systems, we don't stop at testing the functionality of individual modules, or at integration testing of how those modules interact. We also test how the system behaves as a whole when it is put into real-world situations. This goes by several names: system testing, solution testing, interoperability testing, and so on.

We can draw a parallel between this and how we look at an AI system as a whole. An AI system is not just the model, and not just its data. It has to be taken as a whole and tested on its interactions with the other modules in the system and with the users of the system. It's important that we consider how the system would behave during real-time interactions, and several types of testing apply there: fault tolerance, hallucination, performance, security, and so on.
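To make the distinction concrete, here is a minimal sketch of what a system-level check might look like, as opposed to a test of the model alone. Everything in it is hypothetical and for illustration only: the `answer` wrapper, the `FALLBACK` message, the 200-character limit, and the stub models are assumptions, not part of any real framework or of the NIST program mentioned below. The idea is simply that the tests target the behaviour of the whole system around the model, for example whether it stays up when the model fails.

```python
from typing import Callable, Dict

# Hypothetical fallback message the system shows when the model cannot be used.
FALLBACK = "Sorry, I cannot answer that right now."


def answer(question: str, model: Callable[[str], str], max_chars: int = 200) -> str:
    """Hypothetical system wrapper: calls the model and guards its output."""
    try:
        reply = model(question)
    except Exception:
        # Fault tolerance: a failing model must not crash the system.
        return FALLBACK
    if not reply or not reply.strip():
        # An empty reply is treated the same way as a failure.
        return FALLBACK
    # Bound the output so downstream modules get a predictable size.
    return reply.strip()[:max_chars]


# Stub models standing in for a real model backend (assumptions for the sketch).
def failing_model(question: str) -> str:
    raise TimeoutError("model backend unavailable")


def verbose_model(question: str) -> str:
    return "x" * 1000


def healthy_model(question: str) -> str:
    return "  System testing exercises the whole system.  "


def run_system_checks() -> Dict[str, bool]:
    """Run a few system-level checks and report pass/fail per check."""
    return {
        "survives_model_failure": answer("q", failing_model) == FALLBACK,
        "bounds_output_length": len(answer("q", verbose_model)) <= 200,
        "passes_through_good_reply": answer("q", healthy_model)
        == "System testing exercises the whole system.",
    }


if __name__ == "__main__":
    for name, passed in run_system_checks().items():
        print(f"{name}: {'PASS' if passed else 'FAIL'}")
```

None of these checks say anything about how accurate the model is. They only exercise how the surrounding system copes with the model's behaviour, which is the part that module-level testing of the model tends to miss.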

NIST has started a program called ARIA (Assessing Risks and Impacts of AI) to evaluate AI systems through validation and verification. It would be interesting to look at the scope of this program and at how testing is organized within it, because it could serve as a good example for other countries to implement similar programs within their own purview.

As a testing and quality professional, I am curious to watch the proceedings. I am also interested in tracking how this affects other areas, such as the software bill of materials (SBOM). Real-world testing and feedback should push SBOMs to include elements that matter from a system-testing perspective. So far, at least as a starting point, we are only looking at the machine learning models and the data. We need to start looking at AI system parameters as well.

If you are interested in a conversation about how to look at the AI systems in your organisation from a system perspective, or even in a casual chat on the topic, feel free to set up a time with me. Glad to discuss.

We are living in interesting times, for sure!


