Multilingual Large Language Models (MLLMs) have achieved remarkable success in advancing multilingual natural language processing, enabling effective knowledge transfer from high-resource to low-resource languages. Despite these achievements, MLLMs still face significant challenges, which can be grouped into three main aspects: corpora, alignment, and bias.
To address these challenges, a research team led by Professor Xu Yue-Mei from Beijing Foreign Studies University published a survey titled "A Survey on Multilingual Large Language Models: Corpora, Alignment, and Bias" on 15 November 2025 in Frontiers of Computer Science, a journal co-published by Higher Education Press and Springer Nature.
The study reviews the challenges faced by MLLMs through three key aspects: corpora, alignment, and bias. It begins with an overview of MLLMs, covering their evolution, underlying techniques, and multilingual capabilities. It then explores the role of multilingual corpora and downstream datasets in enhancing model performance. The paper also examines the difficulty MLLMs face in learning universal language representations and reviews current approaches to multilingual alignment. Finally, it discusses bias in MLLMs, including its categories, evaluation metrics, and debiasing techniques for mitigating harmful outputs.
Through an in-depth analysis of these dimensions, the researchers aim to shed light on practical strategies for optimizing MLLMs, offering valuable insights for the future development of fairer and more robust multilingual models.
DOI: 10.1007/s11704-024-40579-4