My lab works across security, language, and health, building systems that are trustworthy and equitable.
Nigeria is one of the most populous and linguistically diverse countries in Africa, with over 500 languages spoken. Yet the vast majority of NLP research focuses on high-resource languages like English and Mandarin, leaving Nigerian language speakers severely underserved by modern AI tools.
Our work in this area focuses on creating high-quality, machine-readable datasets for low-resource Nigerian languages, particularly Igbo and Nigerian Pidgin, and developing neural machine translation (NMT) models that can bridge these languages with English and other widely spoken languages.
We survey the landscape of machine translation research on Nigerian languages, identify gaps in existing datasets and methodologies, and propose future directions for growing the community of researchers working on African language technologies.
Health informatics sits at the intersection of computing and public health, leveraging data-driven methods to improve outcomes, inform policy, and understand population-level trends. My research in this area applies modern NLP techniques to real-world health datasets.
One focus area is understanding public sentiment around vaccines — particularly COVID-19 vaccines — by analyzing social media data across geographic regions. This work helps public health communicators understand hesitancy patterns and tailor their messaging accordingly.
We are also exploring predictive models for health forum data, working to identify early indicators of disease exacerbation in patient communities discussing conditions like asthma, and developing privacy-preserving techniques for sensitive medical data.
Modern vehicles and IoT deployments are increasingly interconnected, making them attractive targets for cyberattacks. Controller Area Network (CAN) buses — the communication backbone of most vehicles — were designed without security in mind, leaving them vulnerable to injection and replay attacks.
Our work applies large language models (CANBERT) to detect intrusions on in-vehicle networks by treating CAN bus traffic as a language and learning normal communication patterns. Anomalies in this 'language' signal potential attacks.
We extend this to broader IoT settings using graph-based representation learning to model device communication patterns, and federated learning to enable collaborative anomaly detection across devices without sharing raw data — preserving privacy while improving detection accuracy.
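The idea of treating CAN traffic as a language can be illustrated with a much simpler stand-in than CANBERT. The sketch below is only an assumption-laden toy: it treats CAN arbitration IDs as tokens, learns bigram (ID-to-ID) transition counts from attack-free traffic, and scores a window by its average surprise. The IDs, the traffic pattern, and the function names are all hypothetical, and a bigram model is not the model used in our work.

```python
from collections import Counter
import math

def train_bigram_model(ids):
    """Count ID-to-ID transitions in attack-free traffic."""
    pair_counts = Counter(zip(ids, ids[1:]))
    prev_counts = Counter(ids[:-1])
    vocab = set(ids)
    return pair_counts, prev_counts, vocab

def avg_surprise(model, window):
    """Mean negative log-probability of the transitions in a window.

    Uses add-one smoothing so unseen transitions get a small, nonzero probability.
    """
    pair_counts, prev_counts, vocab = model
    v = len(vocab) + 1  # one extra slot for IDs never seen in training
    total = 0.0
    for prev, cur in zip(window, window[1:]):
        p = (pair_counts[(prev, cur)] + 1) / (prev_counts[prev] + v)
        total -= math.log(p)
    return total / max(len(window) - 1, 1)

# Hypothetical periodic traffic standing in for normal bus behavior.
normal = ["0x100", "0x200", "0x300"] * 200
model = train_bigram_model(normal)

benign_score = avg_surprise(model, ["0x100", "0x200", "0x300", "0x100"])
injected_score = avg_surprise(model, ["0x100", "0x7FF", "0x7FF", "0x200"])
```

A window containing injected frames produces transitions the model has rarely or never seen, so its score is higher than that of a benign window. A deployed detector would pick a threshold from held-out normal traffic rather than comparing two hand-picked windows.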
Data provenance — the ability to trace the origin and history of data as it moves through a system — is a powerful tool for security and accountability. In cyber-physical systems (CPS) and IoT environments, where devices are often resource-constrained and interconnected, provenance tracking introduces unique challenges.
My doctoral work established trace-based provenance collection frameworks for IoT devices, enabling lightweight capture of data flow information even on embedded systems with limited memory and processing power.
By modeling provenance as graphs, we can apply anomaly detection algorithms to identify unusual data flows that may indicate compromise, misconfiguration, or attack — providing a fundamentally new lens for CPS security that complements traditional signature-based approaches.
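As a rough illustration of graph-based provenance analysis, the sketch below builds a graph from (source, operation, destination) records and flags any observed data flow that never appeared in a known-good baseline. The record format, entity names, and the "unseen edge" rule are assumptions made for this example; they are far simpler than a real anomaly detection algorithm.

```python
from collections import defaultdict

def build_graph(records):
    """Turn (source, operation, destination) records into an adjacency map."""
    graph = defaultdict(set)
    for src, op, dst in records:
        graph[src].add((op, dst))
    return graph

def unusual_flows(baseline, observed):
    """Return observed edges that never appeared in the baseline graph."""
    flagged = []
    for src, edges in observed.items():
        for op, dst in edges - baseline.get(src, set()):
            flagged.append((src, op, dst))
    return sorted(flagged)

# Hypothetical traces from a small sensing device.
baseline_records = [
    ("temp_sensor", "read", "sample_buffer"),
    ("sample_buffer", "send", "gateway"),
]
observed_records = baseline_records + [
    ("sample_buffer", "send", "unknown_host"),
]

flagged = unusual_flows(build_graph(baseline_records), build_graph(observed_records))
```

Here the only flagged flow is the buffer being sent to a destination absent from the baseline, the kind of edge that could point to compromise or misconfiguration.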
Federated learning enables multiple devices to collaboratively train machine learning models without sharing their raw data — a crucial property in settings where privacy matters and data is sensitive. At the mobile edge, however, participating devices face severe constraints: limited battery life, unstable connectivity, and heterogeneous hardware.
FedCime, one of our contributions in this space, addresses the challenge of efficient federated learning for mobile edge clients by reducing communication overhead and adapting to varying client capabilities without sacrificing model quality.
We also apply federated learning to vehicular edge networks, where vehicles use deep reinforcement learning to cooperate on decisions such as task offloading, enabling energy-efficient collaborative intelligence across a highly dynamic network topology.
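The core loop of federated learning can be sketched with plain federated averaging: each client trains locally on its own data, and a server combines the resulting models weighted by sample count. This is a generic toy, not FedCime. The one-dimensional linear model, the learning rate, and the client data are assumptions chosen to keep the example short.

```python
def local_update(weights, data, lr=0.05, epochs=5):
    """Run a few gradient-descent steps for y = w*x + b on one client's data."""
    w, b = weights
    for _ in range(epochs):
        grad_w = sum((w * x + b - y) * x for x, y in data) / len(data)
        grad_b = sum((w * x + b - y) for x, y in data) / len(data)
        w, b = w - lr * grad_w, b - lr * grad_b
    return (w, b)

def federated_average(client_weights, client_sizes):
    """Average client models, weighting each by its number of samples."""
    total = sum(client_sizes)
    w = sum(cw[0] * n for cw, n in zip(client_weights, client_sizes)) / total
    b = sum(cw[1] * n for cw, n in zip(client_weights, client_sizes)) / total
    return (w, b)

# Two hypothetical clients whose private data follow y = 2x + 1.
clients = [
    [(x, 2.0 * x + 1.0) for x in (0.0, 1.0, 2.0)],
    [(x, 2.0 * x + 1.0) for x in (3.0, 4.0)],
]

global_weights = (0.0, 0.0)
for _ in range(200):
    # Only model parameters leave each client; raw (x, y) pairs stay local.
    updates = [local_update(global_weights, data) for data in clients]
    global_weights = federated_average(updates, [len(d) for d in clients])
```

After enough rounds the global model approaches the slope and intercept shared by both clients, even though neither client's data is ever sent to the server. Methods aimed at mobile edge clients build on this loop by reducing how much is communicated and by handling uneven client capabilities.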
Many IoT deployments rely on microcontrollers and embedded systems with kilobytes of RAM and flash storage — far too limited for the complex workloads modern applications demand. Dynamic load sharing offers a path forward: intelligently distributing computation across devices in a network to collectively handle tasks that no single device could manage alone.
Our survey work maps the landscape of existing load-sharing approaches for memory-constrained devices, identifying gaps in current techniques and opportunities for improvement. We examine strategies ranging from task migration to cooperative caching.
Building on this, we design and simulate load-sharing protocols that account for the real-world constraints of IoT environments: intermittent connectivity, heterogeneous hardware, and strict energy budgets — enabling richer functionality without requiring hardware upgrades.
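To make the load-sharing idea concrete, the sketch below uses a simple greedy rule: place the largest task first on whichever device has the most free memory that still fits it. The task names, memory figures, and the greedy rule itself are assumptions for illustration; the sketch ignores connectivity and energy, which a realistic protocol would have to account for.

```python
def share_load(tasks, free_memory):
    """Assign tasks (name -> KB needed) to devices (name -> KB free), largest task first."""
    remaining = dict(free_memory)
    placement, unplaced = {}, []
    for task, need in sorted(tasks.items(), key=lambda kv: -kv[1]):
        # Pick the device with the most free memory that can still fit the task.
        candidates = [d for d, free in remaining.items() if free >= need]
        if not candidates:
            unplaced.append(task)
            continue
        target = max(candidates, key=lambda d: remaining[d])
        remaining[target] -= need
        placement[task] = target
    return placement, unplaced, remaining

# Hypothetical workload: 64 KB in total, more than either 32 KB node can hold alone.
tasks = {"fft": 24, "filter": 12, "logging": 8, "crypto": 20}
devices = {"node_a": 32, "node_b": 32}

placement, unplaced, remaining = share_load(tasks, devices)
```

In this example every task is placed and both nodes end up full, even though neither could have run the whole workload by itself.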