Projects Mentored
Language Surgery in Multilingual Large Language Models
A technique for controlling language use in multilingual LLMs without retraining.
Mentors Samuel Cahyawijaya, Genta Indra Winata, Fajri Koto, Peerat Limkonchotiwat, Alham Fikri Aji
Mentees (3) Joanito Agili Lopo, Muhammad Ravi Shulthan Habibi, Tack Hwa Wong
Entropy2Vec: Crosslingual Language Modeling Entropy as End-to-End Learnable Language Representations
Learning language embeddings from model entropy to recover typological structure.
Mentors Alham Fikri Aji, Genta Indra Winata, Fajri Koto, Samuel Cahyawijaya
Mentees (4) Patrick Amadeus Irawan, Ryandito Diandaru, Belati Jagad Bintang Syuhada, Randy Zakya Suchrady
SEADialogues: A Multilingual Culturally Grounded Multi-turn Dialogue Dataset on Southeast Asian Languages
A culturally grounded multi-turn dialogue dataset and benchmark for Southeast Asian (SEA) languages.
Mentors Genta Indra Winata, Ekapol Chuangsuwanich, Alham Fikri Aji, Fajri Koto, Peerat Limkonchotiwat
Mentees (4) Muhammad Dehan Al Kautsar, Aswin Candra, Muhammad Alif Al Hakim, Maxalmina Satria Kahfi
CoRaL: Contextual Relevance and Linguistic Enrichment
A multi-dimensional data curation framework to balance quality, relevance, and cultural coverage in low-resource corpora.
Mentors Fajri Koto, M. Dehan Al-Kautsar
Mentees (3) Thanh-Nhi Nguyen, Feliks Victor Parningotan Samosir, Michael Christlambert Sinanta