SEADialogues: A Multilingual Culturally Grounded Multi-turn Dialogue Dataset on Southeast Asian Languages

A culturally grounded dialogue dataset and benchmark for SEA languages.

Mentors:

Genta Indra Winata, Ekapol Chuangsuwanich, Alham Fikri Aji, Fajri Koto, Peerat Limkonchotiwat

Mentees (4):

Muhammad Dehan Al Kautsar, Aswin Candra, Muhammad Alif Al Hakim, Maxalmina Satria Kahfi

Resources:

Project Proposal

The problem

Conversational models often perform worse in underrepresented languages and fail to align with local norms and expectations in multi-turn conversations.

The approach

The team built a culturally grounded, multi-turn dialogue dataset for SEA languages. They collected, curated, and annotated conversations that reflect local cultural practices.

The dataset supports both training and evaluation of value-aware conversational models.

Why this matters

SEADialogues provides a targeted resource for improving conversational AI in SEA languages, a benchmark for evaluating value alignment, and a starting point for future work on dialogue safety and social norms in SEA contexts.