#54: From WEIRD to Worldwide: Revolutionizing AI
Deepak Seth
Actionable and Objective Insights - Data, Analytics and Artificial Intelligence
Equitable AI: Raising the Bar for Non-English Language Models
LLM developers, beware! Overstating the multilingual capabilities of AI models can lead to significant risks in non-English contexts. From inaccurate information to failing to moderate harmful content, the consequences are real. To address this, follow these crucial steps: avoid assuming that training in one language transfers to others; include benchmarks unique to specific languages; use benchmarks that are not machine-translated; disclose the volume and sources of training data per language; and test for vulnerabilities in non-English languages.
Foundation model developers claim impressive performance across multiple languages, but these claims often fall short, especially for "low-resource" languages with limited training data. Models are predominantly tested in English, with fewer and less robust non-English benchmarks. This disparity risks inappropriate deployment in non-English contexts, potentially causing issues like misleading information or inadequate content moderation.
CDT (the Center for Democracy & Technology) has previously highlighted the limitations of multilingual LLMs in non-English languages and suggested improvements. It now recommends that foundation model developers strengthen non-English benchmarking and transparency, along the lines of the steps listed above.
By adopting these practices, foundation model developers can ensure their models are reliable and effective across different languages, allowing for safer and more accurate applications worldwide.
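To make the benchmarking advice concrete, here is a minimal sketch of reporting accuracy per language instead of a single blended score, and flagging languages that trail English. The function names, the language codes, the 10-point gap threshold and the sample results are all hypothetical, made up purely for illustration.

```python
from collections import defaultdict

def per_language_accuracy(results):
    """results: iterable of (language_code, is_correct) pairs."""
    totals = defaultdict(int)
    correct = defaultdict(int)
    for lang, is_correct in results:
        totals[lang] += 1
        correct[lang] += int(is_correct)
    return {lang: correct[lang] / totals[lang] for lang in totals}

def flag_gaps(accuracy, reference="en", max_gap=0.10):
    """Return languages whose accuracy trails the reference by more than max_gap."""
    ref = accuracy[reference]
    return sorted(
        lang for lang, acc in accuracy.items()
        if lang != reference and ref - acc > max_gap
    )

# Made-up benchmark outcomes (assumption, not real data).
results = [
    ("en", True), ("en", True), ("en", True), ("en", False),
    ("sw", True), ("sw", False), ("sw", False), ("sw", False),
    ("hi", True), ("hi", True), ("hi", False), ("hi", True),
]

accuracy = per_language_accuracy(results)
lagging = flag_gaps(accuracy)
print(accuracy)
print("Languages trailing English:", lagging)
```

The point of the design is simply that an average over all languages can hide a weak one; a per-language table makes the gap visible before deployment.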
From WEIRD to Worldly: Making AI Truly Global
Large language models (LLMs) have advanced significantly in generating and analyzing text. However, when comparing their performance to humans, it's crucial to ask, "Which humans?" Current literature often overlooks the cultural and psychological diversity of humans worldwide, which is not fully represented in the data LLMs are trained on.
A fascinating, recently published paper demonstrates that when AI researchers describe LLM performance by comparing it with that of 'humans', they actually mean humans from WEIRD countries (Western, Educated, Industrialized, Rich and Democratic).
"We show that LLMs’ responses to psychological measures are an outlier compared with large-scale cross-cultural data, and that their performance on cognitive psychological tasks most resembles that of people from Western, Educated, Industrialized, Rich, and Democratic (WEIRD) societies but declines rapidly as we move away from these populations (r = -.70). Ignoring cross-cultural diversity in both human and machine psychology raises numerous scientific and ethical issues."
Research shows that LLMs' responses to psychological tests are outliers when compared to diverse global data. Their performance closely mirrors that of people from Western, Educated, Industrialized, Rich, and Democratic (WEIRD) societies but declines significantly with populations outside these groups (correlation of -0.70). This oversight of cross-cultural diversity in both human and machine psychology poses scientific and ethical concerns. The paper concludes by suggesting methods to reduce WEIRD bias in future LLMs.
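For readers wondering what a correlation of -0.70 means in practice, the sketch below computes a Pearson correlation by hand. It assumes the two quantities being compared are a population's cultural distance from WEIRD societies and how closely the LLM's responses resemble that population's. The variable names and all numbers are invented for illustration and are not taken from the paper.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical data: one entry per population (assumption, not real data).
cultural_distance = [0.00, 0.05, 0.10, 0.15, 0.20]
llm_similarity = [0.90, 0.80, 0.85, 0.60, 0.65]

r = pearson_r(cultural_distance, llm_similarity)
print(f"r = {r:.2f}")
```

A negative r means that similarity to the LLM tends to fall as cultural distance grows; the closer r is to -1, the tighter that relationship.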
No Language Left Behind: Revolutionizing Global Translation
Newly published in Nature: No Language Left Behind is an AI model created by researchers at Meta capable of translation between 200 languages — including low-resource languages. NLLB-200 includes 200 languages, contains three times as many low-resource languages as high-resource languages and performs 44% better than prior systems. This work aims to give people the opportunity to access and share web content in their native language, and communicate with anyone, anywhere, regardless of their language preferences.
Neural machine translation (NMT) has made significant strides, enabling translation between multiple languages and even zero-shot translation (translating between language pairs without direct examples). However, high-quality NMT typically requires large amounts of parallel bilingual data, which are not available for the world's 7,000+ languages. This focus on high-resource languages creates digital inequities by neglecting low-resource languages.
To address this, the No Language Left Behind project introduces a massively multilingual model leveraging transfer learning across languages. Using the Sparsely Gated Mixture of Experts architecture and new mining techniques for low-resource languages, the model was trained on vast data sets. Various improvements were implemented to prevent overfitting while training on thousands of tasks.
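The core idea of a sparsely gated mixture of experts can be sketched in a few lines: a gate scores every expert, only the top few are actually run, and their outputs are blended. This is a toy illustration of top-k routing in general, not the NLLB implementation. The expert functions and gate scores are hypothetical, and in a real model the gate scores would come from a learned layer rather than being passed in by hand.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def top_k_gate(gate_logits, k=2):
    """Keep the k highest-scoring experts and renormalize their weights."""
    ranked = sorted(range(len(gate_logits)),
                    key=lambda i: gate_logits[i], reverse=True)
    top = ranked[:k]
    weights = softmax([gate_logits[i] for i in top])
    return dict(zip(top, weights))

def moe_layer(x, experts, gate_logits, k=2):
    """Run only the selected experts and blend their outputs."""
    gates = top_k_gate(gate_logits, k)
    return sum(weight * experts[i](x) for i, weight in gates.items())

# Toy experts and gate scores (assumptions for illustration only).
experts = [lambda x: x + 1, lambda x: 2 * x, lambda x: -x, lambda x: x * x]
gate_logits = [0.1, 2.0, -1.0, 2.0]

gates = top_k_gate(gate_logits)
y = moe_layer(3.0, experts, gate_logits)
print(gates, y)
```

Because only k experts run per input, the model can hold many experts (and so a lot of capacity) without paying the full compute cost on every token.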
The model's performance was evaluated over 40,000 translation directions using specialized tools: the FLORES-200 automatic benchmark, the XSTS human evaluation metric, and a comprehensive toxicity detector. The model showed a 44% improvement in translation quality compared to previous state-of-the-art models, measured by the BLEU score.
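To give a feel for what a BLEU-based comparison involves, here is a simplified, sentence-level BLEU sketch (clipped n-gram precision plus a brevity penalty, no smoothing), followed by the arithmetic for a relative improvement. It assumes the 44% figure is a relative gain. The baseline and new scores below are hypothetical, and real evaluations use standardized corpus-level tooling rather than a toy function like this.

```python
import math
from collections import Counter

def ngrams(tokens, n):
    """Count the n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def simple_bleu(candidate, reference, max_n=2):
    """Simplified BLEU for one candidate against one reference sentence."""
    cand = candidate.split()
    ref = reference.split()
    log_precisions = []
    for n in range(1, max_n + 1):
        cand_counts = ngrams(cand, n)
        ref_counts = ngrams(ref, n)
        # Clip each candidate n-gram count by its count in the reference.
        overlap = sum(min(c, ref_counts[g]) for g, c in cand_counts.items())
        total = sum(cand_counts.values())
        if overlap == 0 or total == 0:
            return 0.0  # no smoothing in this sketch
        log_precisions.append(math.log(overlap / total))
    # Penalize candidates that are shorter than the reference.
    brevity = 1.0 if len(cand) >= len(ref) else math.exp(1 - len(ref) / len(cand))
    return brevity * math.exp(sum(log_precisions) / max_n)

def relative_improvement(new_score, old_score):
    """Fractional gain of new_score over old_score."""
    return (new_score - old_score) / old_score

score = simple_bleu("the cat sat on the mat", "the cat sat on a mat")
gain = relative_improvement(28.8, 20.0)  # hypothetical BLEU scores
print(f"BLEU = {score:.3f}, relative gain = {gain:.0%}")
```

Note that a relative gain depends heavily on the baseline: moving from a low starting score can produce a large percentage even when absolute quality is still modest.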
By demonstrating how to scale NMT to 200 languages and making these resources freely available for non-commercial use, this work sets the stage for developing a universal translation system.
GenAI Use Case Comparison
Generative AI is an enabler of specific use cases for the IT function, and CIOs are tasked with weighing the specifics of their own IT organization before moving forward. Inform strategic conversations and guide investment decisions with Gartner's AI use-case comparison.
Signing Off
Why did the AI go to language school?
Because it couldn't find the right algorithm to "speak" human!
Keep an eye on our upcoming editions for in-depth discussions on specific AI trends, expert insights, and answers to your most pressing AI questions!
Stay connected for more updates and insights in the dynamic world of AI.
For any feedback or topics you'd like us to cover, feel free to contact me via LinkedIn.
DEEPakAI: AI Demystified. Demystifying AI, one newsletter at a time!
P.S. The newsletter includes smart, prompt-based, LLM-generated content. The views and opinions expressed in the newsletter are my personal views and opinions.