SEACrowd: A Multilingual Multimodal Data Hub and Benchmark
NLP Multimodal
Comprehensive collection of Southeast Asian language datasets and benchmarks for NLP tasks.
We open-source datasets, dataloaders, models, and other resources from our projects and flagship apprenticeship program. Follow us on HuggingFace or GitHub for updates.
For resources associated with our papers and preprints, please visit our publications page and look for the “Resources” link. We also curate papers by our affiliated members that advance our mission of building AI to represent Southeast Asia.