ENHANCING ARABIC DIALECT ANALYSIS: INTRODUCING DIMORPH FOR DARIJA

Khlif N.; Mazroui A.; Nahli O.
2025

Abstract

While Modern Standard Arabic (MSA) is well studied, dialectal Arabic texts, such as those written in the Moroccan dialect (Darija), pose unique challenges because of their informal structure and lack of a standardized grammar. In this paper, we provide an in-depth study of Darija, detailing its morphological and syntactic features, and we introduce DiMorph (Dialectal Morphological Analyzer), a specialized morphological engine designed to address these complexities automatically. In particular, we focus on DiMorph’s multi-phase approach, which involves both pre- and post-processing phases. This approach effectively manages dialectal variability and achieves high accuracy in token recognition and analysis, particularly in social media contexts. Finally, we underscore the importance of developing tools that respect the linguistic diversity of Arabic dialects, laying a strong foundation for advanced computational research in Arabic dialectology. © 2025 Societa Editrice Il Mulino. All rights reserved.
DC field Value Language
dc.authority.ancejournal LINGUE E LINGUAGGIO en
dc.authority.orgunit Istituto di linguistica computazionale "Antonio Zampolli" - ILC en
dc.authority.people Khlif N. en
dc.authority.people Mazroui A. en
dc.authority.people Nahli O. en
dc.collection.id.s b3f88f24-048a-4e43-8ab1-6697b90e068e *
dc.collection.name 01.01 Articolo in rivista *
dc.contributor.appartenenza Istituto di linguistica computazionale "Antonio Zampolli" - ILC *
dc.contributor.appartenenza.mi 918 *
dc.contributor.area Non assegn *
dc.contributor.area Non assegn *
dc.date.accessioned 2026/03/03 15:20:41 -
dc.date.available 2026/03/03 15:20:41 -
dc.date.firstsubmission 2026/01/16 11:10:22 *
dc.date.issued 2025 -
dc.date.submission 2026/02/03 11:57:20 *
dc.description.allpeople Khlif, N.; Mazroui, A.; Nahli, O. -
dc.description.allpeopleoriginal Khlif N.; Mazroui A.; Nahli O. en
dc.description.fulltext embargoed_20261130 en
dc.description.numberofauthors 3 -
dc.identifier.doi 10.1418/116951 en
dc.identifier.isi 105023642942 en
dc.identifier.scopus 2-s2.0-105023642942 en
dc.identifier.source scopus *
dc.identifier.uri https://hdl.handle.net/20.500.14243/563230 -
dc.language.iso eng en
dc.relation.firstpage 363 en
dc.relation.issue 2 en
dc.relation.lastpage 390 en
dc.relation.numberofpages 28 en
dc.relation.volume 24 en
dc.subject.keywordseng DiMorph; Moroccan dialect; morphological analyzer -
dc.subject.singlekeyword DiMorph *
dc.subject.singlekeyword Moroccan dialect *
dc.subject.singlekeyword morphological analyzer *
dc.title ENHANCING ARABIC DIALECT ANALYSIS: INTRODUCING DIMORPH FOR DARIJA en
dc.type.driver info:eu-repo/semantics/article -
dc.type.full 01 Contributo su Rivista::01.01 Articolo in rivista it
dc.type.miur 262 -
scopus.authority.ancejournal LINGUE E LINGUAGGIO###1720-9331 *
scopus.category 1203 *
scopus.category 3310 *
scopus.contributor.affiliation National Research Council -
scopus.contributor.affiliation Computer Science Research Laboratory BV Mohammed VI -
scopus.contributor.affiliation National Research Council -
scopus.contributor.afid 60021199 -
scopus.contributor.afid 60032804 -
scopus.contributor.afid 60021199 -
scopus.contributor.auid 57731783300 -
scopus.contributor.auid 56014310300 -
scopus.contributor.auid 56741333300 -
scopus.contributor.country Italy -
scopus.contributor.country Morocco -
scopus.contributor.country Italy -
scopus.contributor.dptid 104078586 -
scopus.contributor.dptid -
scopus.contributor.dptid 104078586 -
scopus.contributor.name Nadia -
scopus.contributor.name Azzedine -
scopus.contributor.name Ouafae -
scopus.contributor.subaffiliation Institute for Computational Linguistics; -
scopus.contributor.subaffiliation Faculty of Sciences;Mohammed First University; -
scopus.contributor.subaffiliation Institute for Computational Linguistics; -
scopus.contributor.surname Khlif -
scopus.contributor.surname Mazroui -
scopus.contributor.surname Nahli -
scopus.date.issued 2025 *
scopus.description.allpeopleoriginal Khlif N.; Mazroui A.; Nahli O. *
scopus.differences scopus.subject.keywords *
scopus.differences scopus.description.abstracteng *
scopus.document.type ar *
scopus.document.types ar *
scopus.identifier.doi 10.1418/116951 *
scopus.identifier.eissn 2612-0488 *
scopus.identifier.pui 2042055001 *
scopus.identifier.scopus 2-s2.0-105023642942 *
scopus.journal.sourceid 19700200936 *
scopus.language.iso eng *
scopus.publisher.name Societa Editrice Il Mulino *
scopus.relation.firstpage 363 *
scopus.relation.issue 2 *
scopus.relation.lastpage 390 *
scopus.relation.volume 24 *
scopus.subject.keywords DiMorph; Moroccan dialect; morphological analyzer; *
scopus.title ENHANCING ARABIC DIALECT ANALYSIS: INTRODUCING DIMORPH FOR DARIJA *
scopus.titleeng ENHANCING ARABIC DIALECT ANALYSIS: INTRODUCING DIMORPH FOR DARIJA *
Appears in types: 01.01 Articolo in rivista (journal article)
Files in this item:
File: VersionAjournée_Enhancing_Arabic_Dialect_Analysis__Introducing_DiMorph_for_Darija.pdf
Embargoed until 30/11/2026
License: NOT PUBLIC - Private/restricted access
Size: 499.59 kB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14243/563230