Researchers have ranked AI models by risk. It was one big mess.
Marco van Hurne
Partnering with the most innovative AI and RPA platforms to optimize back office processes, automate manual tasks, improve customer service, save money, and grow profits.
Because everybody and their mothers are developing AI models nowadays, a couple of researchers have started ranking AI models based on their potential risks, and the results were pretty much.... "eye-opening" for me.
It turns out that not all AI is created equal when it comes to safety and ethical behavior.
Who would have thought?
Before we start!
If you like this topic and you want to support me:
We have been hit by an AI tidal wave.
Literally.
For an article about AI-Agents I tried plotting all known AI-Agent platforms. After I reached 400 platforms, I decided to call it a day.
This is just benoculars.
And because AI is getting more and more integrated into our daily lives, a lot of people are starting to worry all of a sudden about the potential risks. I mean, we're talking about powerful systems that could potentially misbehave or cause harm if not properly regulated.
That's where experts like Bo Li come in (cool name!).
She is an associate professor at the University of Chicago who specializes in stress testing AI models to uncover vulnerabilities and potential misbehavior. She has become the go-to expert for consulting firms who are more concerned about how problematic AI models can be than how smart they are.
And that is a good thing !
Not only look at the midichlorians, but also take into account, the dark side of the force.
And since AI is powering more core applications than ever, it's not just about intelligence anymore; it's about ensuring AI behaves within acceptable boundaries
To tackle this issue head-on, Li and her colleagues from various universities and organizations developed a comprehensive taxonomy of AI risks.
They've categorized the ways AI models can go rogue and cause trouble.
I am talking about cybersecurity threats, misinformation, privacy violations, the whole shebang. They also created a benchmark called AIR-Bench 2024 to measure how different AI models perform when it comes to avoiding rule-breaking behavior.
Basically It's a report card for AI safety.
To build AIR-Bench 2024, the researchers went all out.
They analyzed government AI regulations and guidelines from the US, China, and the EU. They even reviewed the usage policies of 16 major AI companies worldwide, including the bigguns like Google, Microsoft, and OpenAI. The goal was to create a benchmark that reflects a wide range of regulatory and ethical standards, so we can really put AI to the test.
And wowawiewa, did they find some interesting results!
Different AI models handle risky scenarios very differently.
For example, Anthropic's Claude 3 Opus is a champ at refusing to generate cybersecurity threats, which is crucial in today's world of sophisticated cyber-attacks. On the other hand, Google's Gemini 1.5 Pro is particularly good at avoiding the generation of nonconsensual sexual content, which is a major concern for online safety. And then there's the DBRX Instruct model by Databricks, which scored the worst across various risk categories.
Ouch.
These findings have big implications for businesses and policymakers.
Companies looking to deploy AI need to understand the risk landscape and the strengths and weaknesses of specific AI models. For example, if a company wants to use an AI for customer service, they might prioritize a model's ability to avoid offensive language over its technical capabilities. It's all about finding the right fit for the job.
It also revealed interesting trends in AI development and regulation.
Tt turned out that government regulations are often less comprehensive than the policies set by companies themselves.
领英推荐
Hahahahahaha.... sorry, people. Just could not help myself.
This suggests that there's room for stricter regulations to keep AI in check.
The researchers also found that some AI models don't fully comply with their own company's policies, which means there's still a lot of work to be done to ensure model safety.
Other researchers are also working hard to catalog AI risks.
Just this week, two researchers at MIT unveiled a new database of AI dangers, compiled from 43 different AI risk frameworks. Neil Thompson, a research scientist at MIT, noted that many organizations are still in the early stages of adopting AI and need guidance on potential risks. In the end, this will definitely become a roadmap for navigating the AI minefield.
Improvements in capability outpaces advancements in safety.
One of the most concerning findings from Li's research is the apparent disconnect between the increasing capabilities of AI models and their safety. For instance, her company recently analyzed the largest and most powerful version of Meta's Llama 3.1 model. While the model was found to be more capable in many respects, it was not significantly safer than previous versions. This to me is a symptom of a broader trend in AI development, where improvements in capability outpaces advancements in safety. Think of it like building a faster car without upgrading the brakes.
This disconnect is particularly troubling given the potential for AI to be used in high-stakes environments, such as healthcare, or finance, and ummm... national security. In these contexts, even a small safety lapse can have catastrophic consequences. Li and her colleagues urge model developers to pay more attention to ensuring that AI safety improves in tandem with capability, rather than lagging behind.
I think the message that she and her colleagues is trying to convey, is that when companies are looking to develop models, or to deploy AI, they need to understand the risks associated with AI. Because this is just as important as understanding its capabilities.
And by doing so, they can be sure that they are using AI in a way that is not only doing what it is supposed to do, but also that it is safe to use (*that it will not lead to the doom of your company*), that it is ethical, and compliant with the law.
Let me wrap up this article with a heartfelt plea to all CEOs: Please, take a moment to think before you let AI run amok. After all, the last thing we need is an AI that suggests pineapple on pizza
Hey CEOs,
AI is moving fast, and we need to keep up. Ok, we know that. And the board is screeming for it too. We are with you. But it's not just about jumping on the bandwagon; it's about doing it right.
So, here are four key things you need to keep in mind:
1. Get your data in order. A solid data strategy, governance, and platform are the foundation of any successful AI initiative. Without it, even the fanciest AI models will fall flat.
2. Know the risks. AI is exciting, but it also comes with some serious risks. Don't make any big moves until you've thoroughly assessed and understood these risks.
3. Be ethical, compliant, transparent, and accountable. With great power comes great responsibility. Make sure your AI practices are above board and transparent to everyone involved. Accountability is key.
4. Put people first. AI should be about enhancing the human experience, not replacing it. Keep people at the heart of your AI initiatives to create technology that truly serves society.
The bottom line is that I is powering more core applications than ever before, but it's not just about intelligence anymore. It's about making sure AI behaves within acceptable boundaries.
Signing off - Marco
Read more:
Well, that's a wrap for today. Tomorrow, I'll have a fresh episode of TechTonic Shifts for you. If you enjoy my writing and want to support my work, feel free to buy me a coffee ??
Think a friend would enjoy this too? Share the newsletter and let them join the conversation. LinkedIn appreciates your likes by making my articles available to more readers.
Top-rated articles:
Artistic Founder+CEO| Artisan-e ????????????
1 个月Great advice for CEOs implementing AI solutions. I was reading somewhere that model collapse is more likely than annihilation in the long term but I am wildly optimistic.??
Revolutionising Education with AI-Powered Personalised Learning Founder of Growth Engineering and Iridescent Technoloy
1 个月Great Article!
AI & ML Engineer || Digital Learning Facilitator || Full Stack Web Developer || Digital Marketing Consultant
1 个月Thanks for sharing such an important piece of information. As an AI Engineer, these will I keep at heart and also share with CEOs so as to secure the future with whatever we're building at the moment! I guess we'll need an AI Security Analyst whose role will be to review potential threats before we deploy any models out there, and everyone will need to be periodically educated.
Consultora en Estrategias Digitales para Abogados | Marketing Jurídico | Legaltech | Gestión de Proyectos Legales | Estrategia Digital B2B-B2C
1 个月Muchas gracias a por compartir este valioso análisis y por su contribución al entendimiento de los riesgos y desafíos que enfrenta la IA.