MLA-RAGAS
MLA-RAGAS (Multilingual Augmented RAGAS) is a framework for evaluating Retrieval-Augmented Generation (RAG) systems across multiple languages with greater rigor and transparency than standard monolingual evaluation.
It supports evaluation in diverse linguistic settings by introducing language-aware scoring, translation robustness checks, and cross-lingual consistency testing. These features help identify critical failure modes, including language drift, asymmetric evidence use, and fidelity loss when moving from high-resource to low-resource languages.
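
As a rough illustration of what a cross-lingual consistency test can look like, the sketch below runs translations of the same question through a RAG pipeline and scores each language's answer against a reference language. Everything here is hypothetical: the `pipeline` callable, the question format, and the `token_overlap` stand-in similarity are assumptions for illustration, not MLA-RAGAS's actual interface.

```python
from typing import Callable


def token_overlap(a: str, b: str) -> float:
    """Crude stand-in similarity; a real check would translate the answers
    into a common language first or compare multilingual sentence embeddings."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    return len(ta & tb) / max(len(ta | tb), 1)


def cross_lingual_consistency(
    pipeline: Callable[[str, str], str],    # hypothetical (question, language) -> answer
    questions: dict[str, dict[str, str]],   # question id -> {language code: translation}
    reference_lang: str = "en",
    similarity: Callable[[str, str], float] = token_overlap,
) -> dict[str, dict[str, float]]:
    """Score each language's answer against the reference-language answer."""
    scores: dict[str, dict[str, float]] = {}
    for qid, variants in questions.items():
        reference = pipeline(variants[reference_lang], reference_lang)
        scores[qid] = {
            lang: similarity(reference, pipeline(question, lang))
            for lang, question in variants.items()
            if lang != reference_lang
        }
    return scores
```

Consistently low scores for one language surface exactly the failure modes listed above: the pipeline drifts, leans on different evidence, or loses fidelity once the query leaves the reference language.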
By quantifying how reliably a RAG pipeline performs in each language, MLA-RAGAS enables researchers and organizations to make informed decisions about data curation, retrieval strategies, and model selection before deployment.
Its main output is a comparative evaluation of multilingual RAG pipelines.

About the project
MLA-RAGAS extends the classic RAGAS framework by augmenting its core metrics (faithfulness, answer relevancy, and context precision/recall) with multilingual capabilities.
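
For orientation, the snippet below shows how these core metrics are typically invoked through the classic RAGAS Python API (ragas 0.1.x), with a simple per-language grouping added on top. The `language` column and the grouping loop are illustrative assumptions, not MLA-RAGAS's published interface.

```python
from datasets import Dataset
from ragas import evaluate
from ragas.metrics import (
    answer_relevancy,
    context_precision,
    context_recall,
    faithfulness,
)

# Toy two-sample dataset; the "language" column is an assumption used
# only for the grouping below, it is not a RAGAS-required field.
samples = {
    "question": [
        "What is the capital of France?",
        "Quelle est la capitale de la France ?",
    ],
    "answer": [
        "Paris is the capital of France.",
        "La capitale de la France est Paris.",
    ],
    "contexts": [
        ["Paris is the capital and largest city of France."],
        ["Paris est la capitale de la France."],
    ],
    "ground_truth": ["Paris", "Paris"],
    "language": ["en", "fr"],
}

dataset = Dataset.from_dict(samples)

# Evaluate each language slice separately; RAGAS metrics call an LLM
# judge, so API credentials (e.g. OPENAI_API_KEY) must be configured.
for lang in sorted(set(dataset["language"])):
    subset = dataset.filter(lambda row: row["language"] == lang)
    scores = evaluate(
        subset.remove_columns(["language"]),
        metrics=[faithfulness, answer_relevancy, context_precision, context_recall],
    )
    print(lang, scores)
```

Keeping one set of scores per language slice is what makes asymmetries visible: a pipeline that looks solid on aggregate can still score markedly lower on, say, context recall for its low-resource languages.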
The project is developed within the AI Readiness and Assessment research group (AIRA).
Relevant scientific literature on this project will be made available soon.
For further information, feel free to contact us.

