Arabic Text Formality Transfer
Published in Computational Sciences and Arts & Humanities
Large language models (LLMs) have achieved remarkable success in a wide array of natural language processing (NLP) tasks, including text style transfer and machine translation. One particularly important application is text formality transfer, where informal or dialectal language is converted into a formal register, typically Modern Standard Arabic (MSA) in Arabic-language contexts. While considerable research has been devoted to English and other high-resource languages, Arabic remains underexplored, primarily due to its rich morphological structure, dialectal variation, and the scarcity of annotated parallel corpora.
This video blog provides a detailed overview of our research evaluating Arabic-centric LLMs (Jais, AceGPT, and ArabianGPT) alongside the English-centric LLaMA on their ability to translate Arabic dialects (ADs) into MSA. The study is motivated by the fact that most prior evaluations of LLMs have focused either on English or on English-to-MSA translation, leaving a gap in understanding how well these models handle intra-Arabic language variation.
To address this, we conducted a series of experiments using four publicly available datasets with rich dialectal content: MADAR, MDC, PADIC, and BIBLE. These datasets span a variety of dialects, regions, and domains, offering a robust testing ground for evaluating model performance. Our methodology covered zero-shot prompting, few-shot in-context learning, and fine-tuning, simulating practical usage scenarios from low-data setups to more guided translation tasks; a prompting sketch follows below.
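To make the prompting setup concrete, here is a minimal sketch of how a few-shot dialect-to-MSA prompt can be assembled and sent to a HuggingFace causal LM. The model identifier, prompt template, and Arabic example pairs are our own illustrative assumptions, not the exact configuration used in the study.

```python
# A minimal few-shot prompting sketch, assuming a HuggingFace causal LM.
# The checkpoint name, template, and examples are illustrative assumptions.
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "inceptionai/jais-13b-chat"  # assumed checkpoint; substitute any causal LM

def build_prompt(source: str, demonstrations: list[tuple[str, str]]) -> str:
    """Assemble k dialect->MSA demonstrations followed by the query.

    An empty demonstration list yields the zero-shot variant.
    """
    parts = ["Translate the following Arabic dialect sentence into Modern Standard Arabic (MSA)."]
    for dialect, msa in demonstrations:
        parts.append(f"Dialect: {dialect}\nMSA: {msa}")
    parts.append(f"Dialect: {source}\nMSA:")
    return "\n\n".join(parts)

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(MODEL_ID, device_map="auto")

# In practice, demonstrations would be drawn from a training split of, e.g., MADAR.
demos = [("شلونك اليوم؟", "كيف حالك اليوم؟")]  # illustrative Gulf-dialect pair
prompt = build_prompt("وين رايح؟", demos)      # illustrative query: "Where are you going?"

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64, do_sample=False)
# Decode only the newly generated tokens, i.e. the model's MSA translation.
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```

Passing an empty demonstration list reduces this to the zero-shot setting, while the fine-tuning paradigm instead updates model weights on dialect-MSA pairs rather than supplying them in the prompt.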
Translation quality was measured with standard metrics: BLEU, COMET, chrF, and BERTScore (a scoring sketch follows below). Our findings reveal that Jais and AceGPT consistently outperform the other models, including the widely used LLaMA, across all metrics and evaluation settings. This performance gap underscores the value of pretraining on Arabic text, from which both Jais and AceGPT benefit. In contrast, LLaMA, which is predominantly trained on English data, struggles to capture the nuanced structures of Arabic dialects.
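For readers who want to reproduce this kind of scoring, the sketch below computes BLEU and chrF with sacrebleu and semantic similarity with BERTScore on a toy hypothesis/reference pair; COMET appears only as a commented outline because it downloads a separate trained model. The sentences are placeholders, not data or results from the paper.

```python
# A minimal scoring sketch, assuming the sacrebleu and bert-score packages.
import sacrebleu
from bert_score import score as bert_score

hyps = ["كيف حالك اليوم؟"]   # system outputs in MSA (toy placeholder)
refs = ["كيف حالك اليوم؟"]   # gold MSA references, parallel to hyps

bleu = sacrebleu.corpus_bleu(hyps, [refs])   # sacrebleu takes a list of reference streams
chrf = sacrebleu.corpus_chrf(hyps, [refs])
P, R, F1 = bert_score(hyps, refs, lang="ar")  # per-sentence precision/recall/F1 tensors

print(f"BLEU {bleu.score:.2f} | chrF {chrf.score:.2f} | BERTScore-F1 {F1.mean().item():.4f}")

# COMET additionally conditions on the dialect source sentence:
# from comet import download_model, load_from_checkpoint
# comet = load_from_checkpoint(download_model("Unbabel/wmt22-comet-da"))
# result = comet.predict([{"src": src, "mt": hyp, "ref": ref}
#                         for src, hyp, ref in zip(srcs, hyps, refs)])
```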
These results not only emphasize the need for LLMs tailored to low-resource languages, but also highlight the linguistic and cultural richness of Arabic as a testbed for NLP research. By focusing on dialect-to-MSA translation — a task with direct implications for social media processing, customer service, digital archiving, and educational tools — this study contributes meaningful insights to both academic and applied research communities.
References & Further Reading
- Abdu, F., Mughaus, R., Abudalfa, S., Ahmed, M., & Abdelali, A. (2025). An empirical evaluation of Arabic text formality transfer: A comparative study. Language Resources and Evaluation. Springer Nature.
- Abudalfa, S., Abdu, F., & Alowaifeer, M. (2024). Arabic text formality modification: A review and future research directions. IEEE Access.
- Kadaoui, K., Magdy, S. M., Waheed, A., Khondaker, M. T. I., El-Shangiti, A. O., Nagoudi, E. M. B., & Abdul-Mageed, M. (2023). Tarjamat: Evaluation of Bard and ChatGPT on machine translation of ten Arabic varieties. arXiv preprint arXiv:2308.03051.
- Zhang, X., Rajabi, N., Duh, K., & Koehn, P. (2023). Machine translation with large language models: Prompting, few-shot learning, and fine-tuning with QLoRA. In Proceedings of the Eighth Conference on Machine Translation (pp. 468–481).
- Derouich, W., Kchaou, S., & Boujelbane, R. (2023). ANLP-RG at NADI 2023 shared task: Machine translation of Arabic dialects—A comparative study of transformer models. In Proceedings of ArabicNLP 2023 (pp. 683–689).
- Slim, A., & Melouah, A. (2024). Low-resource Arabic dialects transformer neural machine translation improvement through incremental transfer of shared linguistic features. Arabian Journal for Science and Engineering, 1–17.