Why Does AI Use Old-Fashioned Words?!
AI is ever-evolving. Have you noticed that it sometimes comes out with odd things, like old-fashioned words such as "delve" or "spearheaded"? This had me puzzled for a long time. I wondered why AI insisted on using these archaic words, so I started to look into it.
The Outsourcing Influence
AI models, especially those trained for natural language processing (NLP), rely on vast amounts of text data gathered from a range of sources. This data isn’t curated only by native speakers from English-speaking countries. In fact, the quality assurance (QA) work is often outsourced to countries such as India, the Philippines, and Kenya. English is widely used in these countries, although often in a more formal tone, a legacy of Western colonial education.
The “Jim the AI Whisperer” Insight
Jim the AI Whisperer is a figure in the AI community. He has identified the word "delve" as a dead giveaway that text is AI-generated. His thinking is that such old-fashioned, literary words are common in AI output because of the feedback from QA processes in the countries above. When QA teams use words like "delve" or "spearheaded" in their feedback, these terms find their way into the training data and so become part of the AI's vocabulary.
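If you want to check a piece of text for these words yourself, a rough count is easy to do. Here is a minimal Python sketch. The word list and the function names `marker_counts` and `marker_rate` are my own illustrative choices, not an established AI detector, and a high count on its own proves nothing about where a text came from.

```python
import re
from collections import Counter

# Hypothetical watch list, built from the two words mentioned in this post.
MARKER_WORDS = {"delve", "delves", "delved", "delving", "spearheaded"}


def _words(text):
    """Split text into lowercase words (letters and apostrophes only)."""
    return re.findall(r"[a-z']+", text.lower())


def marker_counts(text):
    """Count how often each marker word appears in the text."""
    return Counter(w for w in _words(text) if w in MARKER_WORDS)


def marker_rate(text):
    """Return marker words per 1,000 words, or 0.0 for empty text."""
    words = _words(text)
    if not words:
        return 0.0
    hits = sum(1 for w in words if w in MARKER_WORDS)
    return 1000 * hits / len(words)


sample = "Let us delve into the project she spearheaded last year."
print(marker_counts(sample))
print(marker_rate(sample))
```

Comparing the rate for an AI-generated passage with the rate for something you wrote yourself gives a feel for the difference, although a short sample like the one above will exaggerate it.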
Formality in Feedback
Feedback provided during the QA phase of training AI models often uses a formal tone. This isn’t just a cultural throwback; it's a deliberate approach to ensure clarity and precision. This formality inadvertently leads AI systems to adopt a style that includes old-fashioned terms. It’s like learning English from classic literature instead of modern sources.
The Historical Linguistic Landscape
English is the main global language. It has many variants, each shaped by the history and educational system of the country where it is spoken. In some countries, English education emphasises classical literature, which features old-fashioned language. When AI models learn from datasets influenced by such educational backgrounds, they pick up these linguistic nuances.
The Impact of Training Data
Training data for AI comes from a range of sources, including books, articles, and websites. Historical texts and formal documents are part of these data sources, so words that we find out of place in normal conversation find their way into AI outputs. This diverse mix of sources ensures a comprehensive understanding of language, but also introduces a blend of old and new terminology.
Balancing Modernity and Tradition
While it’s certainly intriguing to find AI using words from the past, developers are constantly refining these models to strike a balance. The goal is to make AI responses as natural and contemporary as possible, without losing the richness of language. As AI technology advances, we can expect even more nuanced and context-aware language use.
Final Thoughts
So, next time you see an AI-generated output containing "delve" or "spearheaded," you’ll know there’s a fascinating blend of history, global influence and linguistic diversity at play. It's a reminder of how interconnected and rich our global language is.
For more insights into AI and language, please reach out to me.
#AI #LanguageModel #NaturalLanguageProcessing #AIQuirks #TechInsights #GaryHalstead #LifeCoaching