#54: From WEIRD to Worldwide: Revolutionizing AI

Equitable AI: Raising the Bar for Non-English Language Models

LLM developers, beware! Overstating the multilingual capabilities of AI models can lead to significant risks in non-English contexts. From inaccurate information to failures in moderating harmful content, the consequences are real. To address this, follow these crucial steps:

  - Avoid assuming that training in one language transfers to others
  - Include unique benchmarks for specific languages
  - Use benchmarks that are not machine-translated
  - Disclose the volume and sources of training data per language
  - Test for vulnerabilities in non-English languages

Foundation model developers claim impressive performance across multiple languages, but these claims often fall short, especially for "low-resource" languages with limited training data. Models are predominantly tested in English, with fewer and less robust non-English benchmarks. This disparity risks inappropriate deployment in non-English contexts, potentially causing issues like misleading information or inadequate content moderation.

CDT has previously highlighted the limitations of multilingual LLMs in non-English languages and suggested improvements. It now offers the following recommendations to foundation model developers to enhance non-English benchmarking and transparency:

  1. Question Cross-Lingual Transfer Assumptions: Training a model in one language doesn't ensure competence in others. Models trained mainly on English data achieve at best modest performance in low-resource languages, a result often attributed to "cross-lingual transfer," a theory still under debate. Developers shouldn't assume this transfer ensures model safety across languages.
  2. Develop Unique Language Benchmarks: Most non-English benchmarks are translated from English, missing cultural nuances. Foundation model developers should create and use monolingual benchmarks for various languages to better assess performance in real-world contexts.
  3. Avoid Sole Reliance on Machine-Translated Benchmarks: Machine-translated benchmarks can misrepresent real language use. Models should be tested with a mix of human-written, human-translated, and machine-translated texts to ensure accuracy across languages.
  4. Disclose Training Data Details: Sharing information about the volume and sources of training data for each language helps developers fine-tune models for specific languages. Open weight models have been more transparent in this regard, setting an example for others.
  5. Test Multilingual Vulnerabilities: Models can be compromised with translated adversarial prompts. Developers should engage in multilingual red-teaming to identify and address safety issues in all languages, not just English.
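The per-language evaluation these recommendations call for can be sketched as a minimal harness. This is illustrative only: `model_answer` is a hypothetical stand-in for whatever model is under test, and the benchmark items below are made-up examples, not real datasets.

```python
def model_answer(question: str, lang: str) -> str:
    # Placeholder: a real harness would call the model under test here.
    return "yes"

def evaluate_per_language(benchmarks: dict) -> dict:
    """Score the model separately on each language's benchmark,
    rather than reporting a single aggregate number that can hide
    weak performance in low-resource languages."""
    scores = {}
    for lang, items in benchmarks.items():
        correct = sum(
            1 for question, expected in items
            if model_answer(question, lang).strip().lower() == expected
        )
        scores[lang] = correct / len(items)
    return scores

# Illustrative items: one human-written question per language.
benchmarks = {
    "sw": [("Je, maji huchemka kwa 100C?", "yes")],  # Swahili
    "en": [("Does water boil at 100C?", "yes")],
}
print(evaluate_per_language(benchmarks))
```

Reporting scores per language, rather than one blended number, is what makes gaps in low-resource languages visible.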

By adopting these practices, foundation model developers can ensure their models are reliable and effective across different languages, allowing for safer and more accurate applications worldwide.


From WEIRD to Worldly: Making AI Truly Global

Large language models (LLMs) have advanced significantly in generating and analyzing text. However, when comparing their performance to humans, it's crucial to ask, "Which humans?" Current literature often overlooks the cultural and psychological diversity of humans worldwide, which is not fully represented in the data LLMs are trained on.

This fascinating paper just out demonstrates that when AI researchers describe LLM performance by comparing with that of 'humans', they actually mean humans from WEIRD countries (Western, Educated, Industrialized, Rich and Democratic).

"We show that LLMs’ responses to psychological measures are an outlier compared with large-scale cross-cultural data, and that their performance on cognitive psychological tasks most resembles that of people from Western, Educated, Industrialized, Rich, and Democratic (WEIRD) societies but declines rapidly as we move away from these populations (r = -.70). Ignoring cross-cultural diversity in both human and machine psychology raises numerous scientific and ethical issues."

Research shows that LLMs' responses to psychological tests are outliers when compared to diverse global data. Their performance closely mirrors that of people from Western, Educated, Industrialized, Rich, and Democratic (WEIRD) societies but declines significantly with populations outside these groups (correlation of -0.70). This oversight of cross-cultural diversity in both human and machine psychology poses scientific and ethical concerns. The paper concludes by suggesting methods to reduce WEIRD bias in future LLMs.
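For readers unfamiliar with the statistic, the r = -.70 reported above is a Pearson correlation coefficient. A minimal sketch of how it is computed, using made-up illustrative numbers rather than the paper's data:

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient: covariance of the two
    series divided by the product of their standard deviations."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Illustrative only: as cultural distance from WEIRD populations
# grows, similarity to LLM responses drops, giving a strongly
# negative correlation.
distance   = [0.0, 0.1, 0.3, 0.5, 0.8]
similarity = [0.95, 0.90, 0.75, 0.60, 0.40]
print(pearson_r(distance, similarity))
```

A value near -1 means the two series move in strict opposition; the paper's -0.70 indicates a strong, though not perfect, negative relationship.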

No Language Left Behind: Revolutionizing Global Translation

Newly published in Nature: No Language Left Behind is an AI model created by researchers at Meta capable of translating between 200 languages, including low-resource languages. NLLB-200 contains three times as many low-resource languages as high-resource languages and performs 44% better than prior systems. This work aims to give people the opportunity to access and share web content in their native language, and to communicate with anyone, anywhere, regardless of their language preferences.

Neural machine translation (NMT) has made significant strides, enabling translation between multiple languages and even zero-shot translation (translating between language pairs without direct examples). However, high-quality NMT typically requires large amounts of parallel bilingual data, which are not available for the world's 7,000+ languages. This focus on high-resource languages creates digital inequities by neglecting low-resource languages.

To address this, the No Language Left Behind project introduces a massively multilingual model leveraging transfer learning across languages. Using the Sparsely Gated Mixture of Experts architecture and new mining techniques for low-resource languages, the model was trained on vast data sets. Various improvements were implemented to prevent overfitting while training on thousands of tasks.
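The routing idea behind a Sparsely Gated Mixture of Experts layer can be sketched briefly. This is a simplified illustration of top-2 gating, not NLLB's actual implementation, which operates on learned per-token logits inside the transformer:

```python
import math

def softmax(xs):
    m = max(xs)  # subtract the max for numerical stability
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]

def top2_gate(gate_logits):
    """Sparsely gated routing: each token is dispatched to only its
    top-2 experts, so model capacity grows with the number of experts
    while per-token compute stays roughly constant."""
    probs = softmax(gate_logits)
    top2 = sorted(range(len(probs)), key=lambda i: -probs[i])[:2]
    norm = sum(probs[i] for i in top2)
    # Return (expert index, renormalized weight) pairs.
    return [(i, probs[i] / norm) for i in top2]

logits = [0.1, 2.0, -1.0, 1.5]  # gate logits for 4 experts
print(top2_gate(logits))
```

Sparse routing is what lets a model of this scale cover 200 languages: experts can specialize (for example, toward related language families) without every token paying for every expert.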

The model's performance was evaluated over 40,000 translation directions using specialized tools: the FLORES-200 automatic benchmark, the XSTS human evaluation metric, and a comprehensive toxicity detector. The model showed a 44% improvement in translation quality compared to previous state-of-the-art models, measured by the BLEU score.
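As a rough illustration of what the BLEU score measures, here is a simplified sentence-level version: the geometric mean of modified n-gram precisions, scaled by a brevity penalty. Production evaluations such as NLLB's use corpus-level BLEU via tools like sacreBLEU; this sketch is for intuition only.

```python
from collections import Counter
from math import exp, log

def bleu(candidate, reference, max_n=4):
    """Simplified sentence-level BLEU against a single reference."""
    cand, ref = candidate.split(), reference.split()
    log_precisions = []
    for n in range(1, max_n + 1):
        cand_ngrams = Counter(tuple(cand[i:i+n]) for i in range(len(cand) - n + 1))
        ref_ngrams = Counter(tuple(ref[i:i+n]) for i in range(len(ref) - n + 1))
        # Modified precision: clip candidate n-gram counts by the reference.
        overlap = sum(min(c, ref_ngrams[g]) for g, c in cand_ngrams.items())
        total = max(sum(cand_ngrams.values()), 1)
        # Floor at a tiny value to avoid log(0) when an order has no matches.
        log_precisions.append(log(max(overlap, 1e-9) / total))
    # Brevity penalty discourages translations shorter than the reference.
    bp = 1.0 if len(cand) >= len(ref) else exp(1 - len(ref) / len(cand))
    return bp * exp(sum(log_precisions) / max_n)

print(bleu("the cat sits on the mat", "the cat sits on the mat"))
```

A perfect match scores 1.0; any missing or spurious n-grams pull the score down, which is why a 44% relative BLEU improvement over prior systems is a substantial gain.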

By demonstrating how to scale NMT to 200 languages and making these resources freely available for non-commercial use, this work sets the stage for developing a universal translation system.


GenAI Use Case Comparison

Generative AI is an enabler of specific use cases for the IT function, and CIOs are tasked with weighing the specifics of their own IT organization before moving forward. Inform strategic conversations and guide investment decisions with Gartner's AI use-case comparison.


Signing Off

Why did the AI go to language school?
Because it couldn't find the right algorithm to "speak" human!

Keep an eye on our upcoming editions for in-depth discussions on specific AI trends, expert insights, and answers to your most pressing AI questions!

Stay connected for more updates and insights in the dynamic world of AI.

For any feedback or topics you'd like us to cover, feel free to contact me via LinkedIn.

DEEPakAI: Demystifying AI, one newsletter at a time!

p.s. - The newsletter includes smart-prompt-based, LLM-generated content. The views and opinions expressed in the newsletter are my personal views and opinions.

