Creating Arabic Lexical Resources in TEI: A Schema for Discontinuous Morphology Encoding

Ouafae Nahli; Angelo Mario Del Grosso
2021

Abstract

An Arabic word can be described in terms of its lexical and morphological information. Lexical analysis gathers both semantic information (meaning and translation) and syntactic properties (parts of speech). Morphological analysis, by contrast, identifies word patterns, each of which groups words that share the same syntactic, inflectional and semantic behaviour. These descriptions constitute two different but complementary levels of study. This paper presents our work on creating an exhaustive resource organized on two levels: lexical and morphological. The lexical level collects information extracted from the dictionary al=qāmūs al=muḥīṭ. The morphological level describes the word patterns. The two levels are autonomous, and each word described at the lexical level is linked to its corresponding pattern. Formalizing the word pattern makes it possible to enrich word descriptions with additional morphosyntactic and inflectional information. To obtain a systematic digital resource, we followed the guidelines of the Text Encoding Initiative (TEI). We adopted the TEI module for encoding digital dictionaries and lexicons to formally represent the medieval primary source al=qāmūs al=muḥīṭ. We also used the TEI interpretation approach to encode the morphological word patterns, keeping the two levels separate while allowing them to be linked.
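The two-level design described in the abstract can be illustrated with a minimal sketch. It is not the schema used in the paper, which this record does not reproduce. It assumes standard TEI element names (`entry`, `form`, `orth`, `gramGrp`, `pos`, `sense`, `def`, `interpGrp`, `interp`) and the TEI `ana` attribute as the link between levels. The identifier `pattern.faail`, the `type` values, the helper function names and the sample word kātib ("writer", pattern fāʿil) are hypothetical choices made for illustration only.

```python
import xml.etree.ElementTree as ET

# Qualified name of the xml:id attribute (ElementTree serializes it as xml:id).
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"


def make_pattern(pattern_id, template):
    """Morphological level: one word pattern as a TEI <interp> (hypothetical layout)."""
    interp = ET.Element("interp", {XML_ID: pattern_id, "type": "wordPattern"})
    interp.text = template
    return interp


def make_entry(lemma, pos, gloss, pattern_id):
    """Lexical level: a TEI dictionary <entry> pointing to its pattern via @ana."""
    entry = ET.Element("entry", {"ana": f"#{pattern_id}"})
    form = ET.SubElement(entry, "form", {"type": "lemma"})
    ET.SubElement(form, "orth").text = lemma
    gram = ET.SubElement(entry, "gramGrp")
    ET.SubElement(gram, "pos").text = pos
    sense = ET.SubElement(entry, "sense")
    ET.SubElement(sense, "def").text = gloss
    return entry


def resolve_pattern(entry, interp_grp):
    """Follow the entry's @ana pointer into the pattern inventory; None if unresolved."""
    target = entry.get("ana", "").lstrip("#")
    for interp in interp_grp.iter("interp"):
        if interp.get(XML_ID) == target:
            return interp
    return None


# The pattern inventory is kept separate from the lexical entries.
interp_grp = ET.Element("interpGrp", {"type": "wordPatterns"})
interp_grp.append(make_pattern("pattern.faail", "fāʿil"))

# Illustrative entry (not taken from the dictionary text).
entry = make_entry("kātib", "noun", "writer", "pattern.faail")
linked = resolve_pattern(entry, interp_grp)

print(ET.tostring(entry, encoding="unicode"))
print(ET.tostring(interp_grp, encoding="unicode"))
```

Because the entry stores only a pointer, the pattern description can be enriched with further morphosyntactic or inflectional features without touching the lexical entries that refer to it.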
Istituto di linguistica computazionale "Antonio Zampolli" - ILC
English
2020 6TH IEEE CONGRESS ON INFORMATION SCIENCE AND TECHNOLOGY (IEEE CIST'20)
IEEE-CIST2020 DPWH
178
187
10
978-1-7281-6646-9
https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=9357273
IEEE
345 E 47TH ST, NEW YORK, NY 10017 USA
UNITED STATES OF AMERICA
Scientific committee
5/06/2021 - 12/06/2021
Agadir - Essaouira, Morocco
International
classical Arabic dictionary
digital lexicography
al=qamus al=muHiyT
word patterns
TEI
6th IEEE International Congress on Information Science and Technology (IEEE CiSt), Innov.org, Agadir, MOROCCO, JUN 05-12, 2021
Electronic
No
2
restricted
Nahli, Ouafae; Del Grosso, Angelo Mario
273
info:eu-repo/semantics/conferenceObject
04 Conference contribution::04.01 Contribution in conference proceedings
Files in this item:
File Size Format
Creating_Arabic_Lexical_Resources_in_TEI_A_Schema_for_Discontinuous_Morphology_Encoding.pdf

authorized users only

Type: Publisher's version (PDF)
Licence: NOT PUBLIC - Private/restricted access
Size: 1.26 MB
Format: Adobe PDF


Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14243/394192