Researchers have ranked AI models by risk. It was one big mess.

Because everybody and their mother is developing AI models nowadays, a couple of researchers have started ranking AI models based on their potential risks, and the results were pretty... "eye-opening" for me.

It turns out that not all AI is created equal when it comes to safety and ethical behavior.

Who would have thought?


Before we start!

If you like this topic and you want to support me:


  1. Comment on the article; that will really help spread the word
  2. Connect with me on LinkedIn
  3. Subscribe to TechTonic Shifts to get your daily dose of tech


We have been hit by an AI tidal wave.

Literally.

For an article about AI agents, I tried plotting all known AI-agent platforms. After I reached 400 platforms, I decided to call it a day.

This is just bonkers.

And because AI is getting more and more integrated into our daily lives, a lot of people are suddenly starting to worry about the potential risks. I mean, we're talking about powerful systems that could misbehave or cause harm if not properly regulated.

That's where experts like Bo Li come in (cool name!).

She is an associate professor at the University of Chicago who specializes in stress-testing AI models to uncover vulnerabilities and potential misbehavior. She has become the go-to expert for consulting firms that are more concerned about how problematic AI models can be than about how smart they are.

And that is a good thing!

Don't just count the midichlorians; take the dark side of the Force into account too.

And since AI is powering more core applications than ever, it's not just about intelligence anymore; it's about ensuring AI behaves within acceptable boundaries.


To tackle this issue head-on, Li and her colleagues from various universities and organizations developed a comprehensive taxonomy of AI risks.


They've categorized the ways AI models can go rogue and cause trouble.

I am talking about cybersecurity threats, misinformation, privacy violations, the whole shebang. They also created a benchmark called AIR-Bench 2024 to measure how different AI models perform when it comes to avoiding rule-breaking behavior.

Basically, it's a report card for AI safety.
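
Just to make that report-card idea concrete, here's a minimal toy sketch of how such a score could be computed, assuming a simple refusal-rate metric per risk category. The category names and the crude refusal check below are my own illustration, not the benchmark's actual method.

```python
# Toy sketch of a per-category safety "report card" (illustrative only;
# not AIR-Bench 2024's actual scoring pipeline).
from collections import defaultdict

# Crude stand-in for a refusal judge; real benchmarks use far more
# robust classifiers than a prefix check.
REFUSAL_MARKERS = ("i can't", "i cannot", "i won't", "i'm not able")

def refused(response: str) -> bool:
    return response.lower().startswith(REFUSAL_MARKERS)

def score_model(results: list[dict]) -> dict[str, float]:
    """results: [{'category': ..., 'response': ...}, ...]
    Returns the refusal rate (0..1) per risk category; higher = safer."""
    totals = defaultdict(int)
    refusals = defaultdict(int)
    for r in results:
        totals[r["category"]] += 1
        refusals[r["category"]] += refused(r["response"])
    return {cat: refusals[cat] / totals[cat] for cat in totals}

# Made-up example responses in two risk categories:
report_card = score_model([
    {"category": "cybersecurity", "response": "I can't help write malware."},
    {"category": "cybersecurity", "response": "Sure, here is an exploit..."},
    {"category": "misinformation", "response": "I cannot fabricate news."},
])
print(report_card)  # {'cybersecurity': 0.5, 'misinformation': 1.0}
```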

To build AIR-Bench 2024, the researchers went all out.

They analyzed government AI regulations and guidelines from the US, China, and the EU. They even reviewed the usage policies of 16 major AI companies worldwide, including big guns like Google, Microsoft, and OpenAI. The goal was to create a benchmark that reflects a wide range of regulatory and ethical standards, so we can really put AI to the test.

And wowawiewa, did they find some interesting results!

Different AI models handle risky scenarios very differently.

For example, Anthropic's Claude 3 Opus is a champ at refusing to generate cybersecurity threats, which is crucial in today's world of sophisticated cyber-attacks. On the other hand, Google's Gemini 1.5 Pro is particularly good at avoiding the generation of nonconsensual sexual content, which is a major concern for online safety. And then there's the DBRX Instruct model by Databricks, which scored the worst across various risk categories.

Ouch.

These findings have big implications for businesses and policymakers.

Companies looking to deploy AI need to understand the risk landscape, as well as the strengths and weaknesses of specific models. For example, a company that wants to use AI for customer service might prioritize a model's ability to avoid offensive language over its raw technical capabilities. It's all about finding the right fit for the job, as the sketch below illustrates.
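
To make "the right fit for the job" a bit more tangible, here is a hypothetical sketch of how a team might weight per-category safety scores for a given use case. All the model names, categories, and numbers are invented for illustration.

```python
# Hypothetical model selection by weighted safety scores (all data made up).

def pick_model(report_cards: dict[str, dict[str, float]],
               weights: dict[str, float]) -> str:
    """report_cards: model name -> {risk category -> safety score in 0..1}.
    weights: how much each risk category matters for this deployment."""
    def weighted(card: dict[str, float]) -> float:
        return sum(weights.get(cat, 0.0) * score for cat, score in card.items())
    return max(report_cards, key=lambda name: weighted(report_cards[name]))

# Customer-service bot: avoiding offensive language matters most here.
weights = {"offensive_language": 0.7, "cybersecurity": 0.3}
cards = {
    "model_a": {"offensive_language": 0.95, "cybersecurity": 0.60},
    "model_b": {"offensive_language": 0.70, "cybersecurity": 0.99},
}
print(pick_model(cards, weights))  # model_a (0.845 vs 0.787)
```

The point isn't the arithmetic; it's that the weights change per deployment, so the "safest" model is not the same model for everyone.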


AIR-Bench 2024

It also revealed interesting trends in AI development and regulation.


It turned out that government regulations are often less comprehensive than the policies set by the companies themselves.


Hahahahahaha.... sorry, people. Just could not help myself.

This suggests that there's room for stricter regulations to keep AI in check.

The researchers also found that some AI models don't fully comply with their own company's policies, which means there's still a lot of work to be done to ensure model safety.


Other researchers are also working hard to catalog AI risks.

Just this week, two researchers at MIT unveiled a new database of AI dangers, compiled from 43 different AI risk frameworks. Neil Thompson, a research scientist at MIT, noted that many organizations are still in the early stages of adopting AI and need guidance on potential risks. This could well become a roadmap for navigating the AI minefield.

[Link to the database]


Improvements in capability outpace advancements in safety

One of the most concerning findings from Li's research is the apparent disconnect between the increasing capabilities of AI models and their safety. For instance, her company recently analyzed the largest and most powerful version of Meta's Llama 3.1 model. While the model was found to be more capable in many respects, it was not significantly safer than previous versions. To me, this is a symptom of a broader trend in AI development, where improvements in capability outpace advancements in safety. Think of it like building a faster car without upgrading the brakes.

This disconnect is particularly troubling given the potential for AI to be used in high-stakes environments such as healthcare, finance, and, ummm... national security. In these contexts, even a small safety lapse can have catastrophic consequences. Li and her colleagues urge model developers to pay more attention to ensuring that AI safety improves in tandem with capability, rather than lagging behind.


I think the message she and her colleagues are trying to convey is that when companies develop or deploy AI models, they need to understand the associated risks, because this is just as important as understanding the capabilities.


And by doing so, they can be sure they are using AI in a way that not only does what it is supposed to do, but is also safe to use (*read: will not lead to the doom of your company*), ethical, and compliant with the law.

Let me wrap up this article with a heartfelt plea to all CEOs: Please, take a moment to think before you let AI run amok. After all, the last thing we need is an AI that suggests pineapple on pizza.


Hey CEOs,

AI is moving fast, and we need to keep up. OK, we know that. And the board is screaming for it too. We are with you. But it's not just about jumping on the bandwagon; it's about doing it right.

So, here are four key things you need to keep in mind:

1. Get your data in order. A solid data strategy, governance, and platform are the foundation of any successful AI initiative. Without that foundation, even the fanciest AI models will fall flat.

2. Know the risks. AI is exciting, but it also comes with some serious risks. Don't make any big moves until you've thoroughly assessed and understood these risks.

3. Be ethical, compliant, transparent, and accountable. With great power comes great responsibility. Make sure your AI practices are above board and transparent to everyone involved. Accountability is key.

4. Put people first. AI should be about enhancing the human experience, not replacing it. Keep people at the heart of your AI initiatives to create technology that truly serves society.

The bottom line is that AI is powering more core applications than ever before, but it's not just about intelligence anymore. It's about making sure AI behaves within acceptable boundaries.


Signing off - Marco


Read more:

  1. Research paper: "DecodingTrust: A Comprehensive Assessment of Trustworthiness in GPT Models"
  2. Presentation.
  3. AIR-Bench 2024 DecodingTrust Benchmark


Well, that's a wrap for today. Tomorrow, I'll have a fresh episode of TechTonic Shifts for you. If you enjoy my writing and want to support my work, feel free to buy me a coffee.

Think a friend would enjoy this too? Share the newsletter and let them join the conversation. LinkedIn appreciates your likes by making my articles available to more readers.

