Enhancing Arabic Dialect Analysis: Introducing DiMorph for Darija

Khlif N.; Nahli O.
2025

Abstract

While Modern Standard Arabic (MSA) is well studied, dialectal Arabic texts, such as those written in the Moroccan dialect (Darija), pose unique challenges because of their informal structure and the lack of a standardized grammar. In this paper, we provide an in-depth study of Darija, detailing its morphological and syntactic features, and we introduce DiMorph (Dialectal Morphological Analyzer), a specialized morphological engine designed to address these complexities automatically. Specifically, we focus on DiMorph’s multi-phase approach, which involves both pre- and post-processing phases. This approach effectively manages dialectal variability and achieves high accuracy in token recognition and analysis, particularly in social media contexts. Finally, we underscore the importance of developing tools that respect the linguistic diversity of Arabic dialects, laying a strong foundation for advanced computational research in Arabic dialectology. © 2025 Società Editrice Il Mulino. All rights reserved.
Istituto di linguistica computazionale "Antonio Zampolli" - ILC
DiMorph; Moroccan dialect; morphological analyzer
Files in this record:
VersionAjournée_Enhancing_Arabic_Dialect_Analysis__Introducing_DiMorph_for_Darija.pdf
Embargo until 30/11/2026
License: NOT PUBLIC - Private/restricted access
Size: 499.59 kB
Format: Adobe PDF

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14243/563230