Cynthia Jayne Amol: My vision for African NLP

What is your vision for African natural language processing (NLP)?

  1. Equity in NLP. Despite recent efforts to advance NLP, Africa has yet to catch up with the rest of the world. This is evident in the fact that most African languages, estimated at more than 2,000, are considered low-resource. Moreover, recent foundation models and Large Language Models (LLMs) are trained on few, if any, indigenous African languages. Achieving equitable and inclusive NLP requires joint efforts by both local community members (the creators of NLP resources) and organizations.
  2. Community-centered NLP. I envision researchers working on more projects focused on solving both current and future community problems. This can be achieved through:

  • Building high-quality, diverse datasets in African languages. Local communities are custodians of valuable data. Collaborative efforts to engage the community in sustainable and inclusive data collection will increase the availability of these resources for researchers building community-centric solutions.

  • Development of usable NLP models in the African context. The challenges faced by most African countries may differ from those in most other parts of the world. Despite the development of ‘one-size-fits-all’ models for automation in various fields, these models often do not fit local problems well. Recently, I tested solutions such as UlizaMama by Jacaranda Health, which offers Kenyan mothers quality healthcare-related information, and KenCorpus, which has resources such as KenTrans for machine translation from high-resource languages to low-resource Kenyan languages. Solutions such as these are the future of African NLP and provide a great foundation for current and future researchers.
  • Increased efforts to preserve African languages. Efforts by communities like Masakhane, which foster research in indigenous African languages, are a model for Africa’s future, especially in NLP. Interest in African languages has been growing, with big tech incorporating more African language data in recent versions of their models. Creating large repositories of resources in African languages will help expose the world to authentic African culture through language, and so help preserve these languages. This will also boost research efforts in creating Machine Translation (MT), Speech-to-text (STT) and Automatic Speech Recognition (ASR) models for African languages.


Podcast recommendation


Methali (Proverb)

17. Dalili ya mvua mawingu. Clouds are the sign of rain.


Quote

“Trees are living symbols of peace and hope. A tree has roots in the soil yet reaches to the sky. It tells us that in order to aspire we need to be grounded...” ― Wangari Maathai, Unbowed
