Make your multi-modal AI/ML models trustworthy
Study after study indicates that the most significant obstacle to deploying AI models in the enterprise is that AI-powered systems have no awareness of their own trustworthiness. Consider this: when ChatGPT provides an answer, wouldn’t it be helpful if any hallucinated portions were highlighted in red on the screen? Similarly, imagine an autonomous driving system whose computer vision encounters a situation unlike anything it saw during training; shouldn’t it trigger an emergency stop? And when a banking AI denies a customer a loan, it should display its confidence level and explain the decision. This last concept is known as explainability.
One of the primary causes of these issues is flawed training data. But what if, during model training, the AI took the lead, with humans assisting only when necessary? Imagine a collaborative approach in which AI and human intervention complement each other.
Interestingly, a friend introduced me to a technology that addresses precisely these challenges, and you can explore and try it for free: visit the CAPSA product on the themisai.io website.
On the website, you can explore a comprehensive demo designed for engineers. It illustrates how simple it is to address these challenges using the CAPSA library, which seamlessly integrates with any multimodal ML model, whether generative or not. Remarkably, this groundbreaking software originated in the MIT Computer Science and Artificial Intelligence Laboratory (CSAIL), under the guidance of CSAIL director Daniela Rus.
The core concept behind solving this intricate problem lies in allowing CAPSA to analyze your model and replace each neuron with a specialized one. And the best part? Achieving this transformation requires just one line of code:
model = capsa_torch.wrapper(_model)
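To make the idea concrete, here is a minimal sketch of that wrapping step for a plain PyTorch classifier. The capsa_torch import and the wrapper call simply mirror the one-liner above; the exact module layout and signature are assumptions on my part, so check the CAPSA documentation for the real interface.

import torch.nn as nn
import capsa_torch  # import assumed from the one-liner above; see the CAPSA docs

# A plain PyTorch classifier standing in for "your model".
_model = nn.Sequential(
    nn.Flatten(),
    nn.Linear(28 * 28, 128),
    nn.ReLU(),
    nn.Linear(128, 10),
)

# The single CAPSA line: the library analyzes the network and swaps in
# specialized, risk-aware neurons, so the wrapped model can also report
# how confident it is in each prediction.
model = capsa_torch.wrapper(_model)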
These specialized neurons serve two critical roles in enhancing existing models: they continue to produce the original model’s predictions, and at the same time they estimate how trustworthy each prediction is.
For the main advantages of CAPSA, see the overview on the CAPSA site.
In light of the new AI regulations emerging worldwide (including the recent EU AI Act), incorporating solutions like CAPSA becomes essential for responsible AI deployment.
To illustrate CAPSA’s capabilities, let me share three intriguing use cases. I’ve extracted the images from the CAPSA YouTube demo on their site: CAPSA Demo.
Use Case 1 - Handwriting Recognition (Text Models)
This use case uses the MNIST benchmark database for handwriting recognition. An out-of-the-box ML model was run on real test data and achieved an accuracy of 65%.
For each character, the demo shows an accuracy indicator; the errors, i.e., the characters the system got wrong, are marked in red.
After inserting the CAPSA Wrapper, all the errors are highlighted:
After retraining with the wrapped model, accuracy rises to 90%.
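To show what this workflow could look like in code, here is a hedged sketch of flagging low-confidence predictions so a human can review them, the human-in-the-loop pattern mentioned earlier. The (logits, risk) return shape is an assumption for illustration, not the documented CAPSA interface:

import torch

def flag_uncertain(predict_with_risk, images: torch.Tensor, threshold: float = 0.5):
    # predict_with_risk: a callable returning (logits, risk) for a batch.
    # The (logits, risk) signature is assumed for illustration; adapt it
    # to however your wrapped model actually reports uncertainty.
    logits, risk = predict_with_risk(images)
    preds = logits.argmax(dim=1)
    flagged = risk > threshold  # boolean mask of low-confidence samples
    return preds, flagged

Samples where flagged is True are exactly the ones you would route to a human reviewer before retraining.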
Use Case 2 - Pixel Depth Recognition (Image Models)
In autonomous driving, one of the core challenges is accurately estimating the depth of every pixel in an image. This task is crucial because, while driving, the system must avoid collisions with nearby obstacles.
For instance, consider a scenario where a bus or a deer suddenly appears in an image. If the autonomous driving system was not adequately trained on similar images, or if manual labeling was incomplete, confusion may arise, potentially leading to accidents. CAPSA addresses this issue effectively: in the images below, it highlights problematic areas by coloring the affected pixels red.
CAPSA’s ability to identify these critical regions makes autonomous driving systems safer and more reliable.
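As a rough illustration of how such a red overlay could be produced, here is a short sketch, assuming you already have a per-pixel uncertainty map from the wrapped depth model (the array shapes and value ranges below are my assumptions):

import numpy as np

def red_overlay(image, uncertainty, threshold=0.5):
    # image: (H, W, 3) float array in [0, 1]
    # uncertainty: (H, W) float array in [0, 1], e.g. from a wrapped depth model
    out = image.copy()
    mask = uncertainty > threshold
    # Blend flagged pixels toward pure red so risky regions stand out.
    out[mask] = 0.5 * out[mask] + 0.5 * np.array([1.0, 0.0, 0.0])
    return out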
Use Case 3 - Generative AI Responses (Hallucination Detection)
The third use case involves highlighting hallucinated responses from generative large language models (LLMs).
Below, you’ll find examples where CAPSA’s trustworthiness indicators color the hallucinated text in red. These questions are part of a benchmark set used for evaluating LLMs.
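As a simple sketch of what rendering such an indicator could look like in a terminal, assuming the model exposes a per-token risk score (that per-token format is my assumption; the demo does not show the underlying API):

RED, RESET = "\033[91m", "\033[0m"

def highlight_hallucinations(tokens, risks, threshold=0.5):
    # tokens: the generated answer split into tokens.
    # risks: per-token risk scores in [0, 1]; this format is assumed here.
    colored = [f"{RED}{t}{RESET}" if r > threshold else t
               for t, r in zip(tokens, risks)]
    return " ".join(colored)

# Example: the risky token "Germany." would be printed in red.
print(highlight_hallucinations(
    ["Paris", "is", "the", "capital", "of", "Germany."],
    [0.05, 0.02, 0.02, 0.05, 0.05, 0.92]))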
Conclusion
We are on the path toward being able to enforce trustworthiness in AI models through human-in-the-loop AI. CAPSA is both a vision for the future and an existing reality. You can explore this remarkable technology by visiting ThemisAI’s contact page.