Why AI should speak Spanish
Artificial Intelligence is already here and it is going to be part of our lives. Its potential relies on understanding human language. The future of Artificial Intelligence will depend on its relationship with each and every one of us, and its ability to make our lives better.
Artificial Intelligence may already do some things just as well as humans, but it’s still learning to speak like us.
Language itself is changing. New technologies, the Internet and Artificial Intelligence are already affecting the Spanish language. We could be facing the emergence of a new language of the digital world and, when machines speak, a synthetic language. There are risks but also enormous opportunities. I had the opportunity to speak about this exciting topic at the VIII International Congress of the Spanish Language held in Córdoba, Argentina.
There are examples of language revolutions in the past. The arrival of the printing press in the mid-15th century led to the democratisation of culture and language. A world of minorities, characterised by the dichotomy between cultured language and common language, was left behind. Now we find ourselves at the dawn of another revolution that makes the others pale in comparison. The digital revolution, driven by an unprecedented explosion of technology, is changing everything, including language. In the 21st century culture is becoming globalised, its evolution is speeding up, and digital language is becoming the central axis of communication. Artificial Intelligence draws on these sources to acquire language.
But what is going on? How is this new language taking shape? What does this phenomenon mean for the Spanish language? What should we do?
After Chinese, Spanish is the second most spoken language in the world by native speakers. Spanish is the first language of 480 million people, 20% more than English. Yet in digital content Spanish drops to fourth place while English rises to first, with ten times the weight of our language. Other languages have simply ceased to exist in digital content, or have a very limited impact. The prevalence of English in technology is also evident: the major companies and applications working in speech recognition, like Cloud Speech, Cortana, Siri, Alexa, or Echo, are American. We should ask ourselves: for which language are voice recognition and artificial intelligence optimised?
Our language is also exposed to other risks related to the data that Artificial Intelligence uses to learn how to speak. The less precise, more erroneous, or less relevant the data, the more errors or even biases will be incorporated into synthetic language. This poses a threat to the future of our language that we cannot ignore if we want to ensure Spanish is used by future generations.
In the digital world, errors spread at high speed, and Artificial Intelligence learns from them. For example, cocreta (croquette, spelled croqueta) comes up in almost eighty thousand search results; the misuse of the imperative of the verb decir (to say), decirlo, yields more than two million results while its correct form, decidlo, shows a little more than three thousand. Words of less common use disappear or yield residual results. Abbreviations and inaccuracies used in messaging applications popularise spelling and grammatical mistakes. Cultural gender biases are also present in searches. For example, if a user writes surgeon in English, the online translator shows cirujano (male surgeon); if the user writes nurse, the result will be enfermera (female nurse); if the user writes engineer, ingeniero (male engineer) comes up; and if the user writes nanny, the result will be niñera (female nanny).
The most common word processors fail to recognize correct words, and when this happens they propose to correct and replace them, which impoverishes the lexicon. This is common with compound words or unusual words. You are easily forced to look for synonyms, or simply to eliminate perfectly valid words from a text, such as abanda (on the side), abrefácil (easy open), audioguía (audio guide), autodefenderse (to defend oneself), or avainillado (vanilla flavoured), just to name a few examples.
Finally, understanding human feelings relies on capturing emotions or the tone of expressions. But how will these be interpreted by Artificial Intelligence if it uses erroneous or biased texts?
The Spanish language has a tremendous challenge ahead. We must all put Spanish where it deserves to be, including in the digital world.
At Telefónica we are very aware that it is people who give purpose to technology, not the other way around. That is why we are promoting values for this new digital world we have to live in, like our Principles of Artificial Intelligence or the Manifesto For a New Digital Pact. But this challenge is the responsibility of all of us who use and benefit from our common language.
The Spanish language should occupy a prominent position in global conversation and in digital content. We urgently need to understand how the synthetic language we interact with is made when, for example, we ask voice assistants about the news of the day, use messaging applications, or translate texts using digital translators. We urgently need to develop Artificial Intelligence based on our language, like Telefónica’s Aura, applying the criteria of the Royal Spanish Academy to the newly born synthetic language of Artificial Intelligence.
It is time to open the doors of the Academy and add one more seat. A seat for Artificial Intelligence. A change of era requires new rules, and new rules bring new ways to address problems. The synthetic language is learning, and we have the tools, knowledge, and experts to teach it properly.
Civil engineer, designer/musician, mediator, curious, social, PMP
A very interesting article. Some legitimate but unproductive trends are complicating the language, which has both beneficial and harmful effects. Everything should be thought through when language policies are implemented. Of course, the fact that AI comes from English-speaking countries is the cornerstone of English dominance, but we still have time. In the digital world, everything remains to be done. We only need to choose an "untouched" plot, develop it in Spanish, and make it available to the world. The rest will follow.
Very apt observations. We need to pay close attention to this issue and encourage active research into possible courses of action to a) participate actively in the process of integrating Spanish into this current and b) defend, indeed, respect for correct language.
Director
Keep up the great work advising Chile's SMEs!
Senior Partner Sales Manager @AWS | Driving Cloud Growth in Iberia
We are forgetting something relevant in the formula: CHINA. From 2017! "China’s government put out its plan to lead the world in AI by 2030. As Eric Schmidt has explained, “it’s pretty simple. By 2020, they will have caught up. By 2025, they will be better than us. By 2030, they will dominate the industries of AI.” And the figures don’t lie. With a $14 trillion GDP, China is predicted to account for over 35 percent of global economic growth from 2017 to 2019 — nearly double the U.S. GDP’s predicted 18 percent. And AI is responsible for a big chunk of that. PricewaterhouseCoopers recently projected AI’s deployment will add $15.7 trillion to the global GDP by 2030, with China taking home $7 trillion of that total, dwarfing North America’s $3.7 trillion in gains. In 2017, China accounted for 48 percent of the world’s total AI startup funding, compared to America’s 38 percent. Already, Chinese investments in AI, chips and electric vehicles have reached an estimated $300 billion. Meanwhile, AI giant Alibaba has unveiled plans to invest $15 billion in international research labs from the U.S. to Israel, with others following suit. Beijing has now mobilized local government officials around AI entrepreneurship and research, led by billions in guiding funds and VC investments. And behind the scenes, a growing force of driven AI entrepreneurs trains cutting-edge algorithms on some of the largest datasets available to date." https://www.diamandis.com/blog/rise-of-ai-in-china And it has happened one year earlier, so it is accelerating: https://www.theverge.com/2019/3/14/18265230/china-is-about-to-overtake-america-in-ai-research