Low-resource languages

Towards high-quality dataset acquisition and benchmark development for low-resource African Languages: A Nigerian Language Case Study

The goal of this project is to provide high-quality monolingual and parallel sentence-pair datasets for selected major Nigerian languages (Igbo, Yoruba, and Pidgin). In addition, we develop benchmarks for studying machine translation of low-resource Nigerian languages.