What is your vision for African natural language processing (NLP)?
- Equity in NLP. Despite recent efforts to advance NLP, Africa has yet to catch up with the rest of the world. This is evident in the fact that most African languages, estimated at more than 2,000, are considered low-resource. Moreover, recent foundation models and Large Language Models (LLMs) are trained on few, if any, indigenous African languages. Achieving equitable and inclusive NLP requires joint efforts by both local community members (creators of NLP resources) and organizations.
- Community-centered NLP. I envision researchers working on more projects that focus on solving both current and future community problems. This can be achieved through:
- Development of usable NLP models in the African context. The challenges faced by most African countries may differ from those in most other parts of the world. Despite the development of ‘one-size-fits-all’ models for automation in various fields, these models often fit local problems poorly. Recently, I tested solutions such as UlizaMama by Jacaranda Health, which provides Kenyan mothers with quality healthcare-related information, and KenCorpus, which has resources such as KenTrans for machine translation from high-resource languages to low-resource Kenyan languages. Solutions such as these are the future of African NLP and provide a great foundation for current and future researchers.
- Increased efforts to preserve African languages. Efforts by communities like Masakhane, which foster research in indigenous African languages, are a model for Africa’s future, especially in NLP. There has been interest in African languages, with big tech companies incorporating more African language data in recent versions of their models. Creating large repositories of resources in African languages will help expose the world to authentic African culture through language, which in turn helps preserve these languages. This will also boost research efforts in creating Machine Translation (MT), Speech-to-text (STT) and Automatic Speech Recognition (ASR) models for African languages.