How AI can help Indigenous language revitalisation, and why data sovereignty is important
Dr Karaitiana Taiuru JP, MInstD
AI/Data and Emerging Tech Governance; IP & Critical Indigenous Researcher. Only personal opinions are expressed on this profile.
Using the interview with Michael Running Wolf at https://www-cbc-ca.cdn.ampproject.org/c/s/www.cbc.ca/amp/1.7290740, I offer some commentary and key warnings for the Māori language, based on what I am already seeing occur here in Aotearoa New Zealand.
"Indigenous language experts working in computer science say Artificial Intelligence is a useful tool in language revitalization but communities must prioritize the ownership of their data."
In New Zealand we are unique in that the Māori language is New Zealand's first official language. I also argue that the Māori language is perhaps one of the most published and digitised Indigenous languages in the world. We cannot claim ownership of much of what has been digitised. But we can create data that is by Māori for Māori and then ensure some sovereignty and ownership over those data sets. We also need to be careful about what we digitise going forward and ensure there are copyright protections and restrictions against unauthorised usage.
Already we have seen several international companies using digitised Māori language resources to train their own AI and then offer those systems back to companies as an AI translation program. There are several non-Māori companies in New Zealand that are also using this data to train their own AI systems.
While Māori do have at least one Māori-owned company that is a significant player in AI and the Māori language, the incentives to share and licence its data in the current ecosystem are likely to work against it.
"It's just going to be like a pencil. It's useful but it's not going to save our language."
Yes, we know from international research by linguists such as Joshua Fishman that language revitalisation takes an intergenerational approach. An AI is not able to speak on the marae or nurture children in te reo Māori, but it can assist other in-person initiatives and support te reo as a living language. I think we are likely to see fewer and fewer Māori language learning resources in the future as they are replaced by Generative AI.
"Also, languages such as Cheyenne and Blackfeet are polysynthetic and fusional, meaning prefixes and suffixes blend into words so the roots are not apparent."
Many Māori language speakers are telling me their concern with AI is that it can never understand the true intent of some words and phrases, nor have any emotional feel for the words. While this is true now, it may change. But these issues, and many others yet to be discovered through testing and development, will need to be considered. This cannot and should not be done in isolation by academics and engineers, but within Māori language speaking communities.
"We have to have our own engineers. We need to have our own computer scientists using the software … We need to have sovereignty over our own data, set the terms and that's the only way to build this AI."
This advice is the same for Māori. We need to ensure we create pathways for education in IT and in particular computer science, machine learning and all areas of AI. There are many reasons why, statistically, Māori make up about 5% of the tech industry and 0.16% of the AI workforce. We need to work on removing those barriers and making clear pathways.
Sovereignty over our own data is much easier in New Zealand for Māori than for other Indigenous Peoples, due to Te Tiriti o Waitangi and the Waitangi Tribunal findings that Māori Data is a Taonga and subject to Māori Data Governance. In addition, there is an ever-increasing recognition and appreciation of Māori Data Sovereignty within government and the international cloud providers. It really is up to Māori to use these opportunities and partnerships to seek Data Sovereignty and to create the opportunities in the tech industry to welcome and grow our next generation of digital kaitiaki.
As Māori we need to take note of Running Wolf's warnings and experience and not leave this to the language experts or academia alone. We need a multi-pronged approach: sovereignty over language data sets, upskilling more Māori into AI jobs, and protecting future data sets of the Māori language.
AI in my analysis is already an important tool in the preservation of all Indigenous languages, and up until now has progressed haphazardly through early innovators. I have pondered at length on your "data sovereignty" misgivings. However, the following excerpt from this interview you shared has given me the clarity to align my thinking with yours: "Lakota elders helped a white man preserve their language. Then he tried to sell it back to them. 'No matter how it was collected, where it was collected, when it was collected, our language belongs to us,' said Ray Taken Alive, a Lakota teacher." Thank you for your ongoing efforts to highlight the many pitfalls in keeping sovereignty over our precious reo.