Creating Arabic Lexical Resources in TEI: A Schema for Discontinuous Morphology Encoding
Ouafae Nahli; Angelo Mario Del Grosso
2021
Abstract
An Arabic word can be described according to its lexical and morphological information. Lexical analysis consists in gathering both semantic information (meaning and translation) and syntactic properties (parts of speech). Morphological analysis, instead, identifies word patterns that group the words having the same syntactic, inflectional and semantic behaviour. Such descriptions constitute two different but complementary levels of study. This paper illustrates our work, aimed at creating an exhaustive resource consisting of two levels: lexical and morphological. The lexical level collects information extracted from the dictionary al=qāmūs al=muḥīṭ. The morphological level describes the word patterns. The two levels are autonomous but complementary. Each word described at the lexical level is linked to its corresponding pattern. The formalization of the word pattern makes it possible to enrich word descriptions with additional morphosyntactic and inflectional information. To obtain a digital systematic resource, we followed the guidelines provided by the Text Encoding Initiative (TEI). We adopted the TEI module devoted to encoding digital dictionaries and lexicons in order to formally represent the medieval primary source al=qāmūs al=muḥīṭ. We also used the TEI interpretation approach to encode the morphological word patterns, keeping the two levels separate but at the same time allowing them to be linked.

| DC field | Value | Language |
|---|---|---|
| dc.authority.orgunit | Istituto di linguistica computazionale "Antonio Zampolli" - ILC | en |
| dc.authority.people | Ouafae Nahli | en |
| dc.authority.people | Angelo Mario Del Grosso | en |
| dc.collection.id.s | 71c7200a-7c5f-4e83-8d57-d3d2ba88f40d | * |
| dc.collection.name | 04.01 Contributo in Atti di convegno | * |
| dc.contributor.appartenenza | Istituto di linguistica computazionale "Antonio Zampolli" - ILC | * |
| dc.contributor.appartenenza.mi | 918 | * |
| dc.contributor.area | Non assegn | * |
| dc.contributor.area | Non assegn | * |
| dc.date.accessioned | 2024/02/21 08:55:30 | - |
| dc.date.available | 2024/02/21 08:55:30 | - |
| dc.date.firstsubmission | 2024/07/02 14:52:50 | * |
| dc.date.issued | 2021 | - |
| dc.date.submission | 2025/02/18 12:00:51 | * |
| dc.description.abstracteng | An Arabic word can be described according to its lexical and morphological information. Lexical analysis consists in gathering both semantic information (meaning and translation) and syntactic properties (parts of speech). Morphological analysis, instead, identifies word patterns that group the words having the same syntactic, inflectional and semantic behaviour. Such descriptions constitute two different but complementary levels of study. This paper illustrates our work, aimed at creating an exhaustive resource consisting of two levels: lexical and morphological. The lexical level collects information extracted from the dictionary al=qāmūs al=muḥīṭ. The morphological level describes the word patterns. The two levels are autonomous but complementary. Each word described at the lexical level is linked to its corresponding pattern. The formalization of the word pattern makes it possible to enrich word descriptions with additional morphosyntactic and inflectional information. To obtain a digital systematic resource, we followed the guidelines provided by the Text Encoding Initiative (TEI). We adopted the TEI module devoted to encoding digital dictionaries and lexicons in order to formally represent the medieval primary source al=qāmūs al=muḥīṭ. We also used the TEI interpretation approach to encode the morphological word patterns keeping the two levels separate but at the same time allowing them to be linked | - |
| dc.description.affiliations | Institute for Computational Linguistics "A. Zampolli"; ILC-CNR, Pisa, Italy | - |
| dc.description.allpeople | Nahli, Ouafae; DEL GROSSO, ANGELO MARIO | - |
| dc.description.allpeopleoriginal | Ouafae Nahli ; Angelo Mario Del Grosso | en |
| dc.description.fulltext | restricted | en |
| dc.description.international | no | en |
| dc.description.note | 6th IEEE International Congress on Information Science and Technology (IEEE CiSt), Innov.org, Agadir, MOROCCO, JUN 05-12, 2021 | en |
| dc.description.numberofauthors | 2 | - |
| dc.identifier.doi | 10.1109/CiSt49399.2021.9357273 | en |
| dc.identifier.isbn | 978-1-7281-6646-9 | en |
| dc.identifier.isi | WOS:000657322100031 | - |
| dc.identifier.scopus | 2-s2.0-85103857819 | en |
| dc.identifier.uri | https://hdl.handle.net/20.500.14243/394192 | - |
| dc.identifier.url | https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=9357273 | en |
| dc.language.iso | eng | en |
| dc.miur.last.status.update | 2024-07-02T13:19:46Z | * |
| dc.publisher.country | USA | en |
| dc.publisher.name | IEEE | en |
| dc.publisher.place | 345 E 47TH ST, NEW YORK, NY 10017 USA | en |
| dc.relation.conferencedate | 5/06/2021 - 12/06/2021 | en |
| dc.relation.conferencename | IEEE-CIST2020 DPWH | en |
| dc.relation.conferenceplace | Agadir - Essaouira, Morocco | en |
| dc.relation.firstpage | 178 | en |
| dc.relation.ispartofbook | 2020 6TH IEEE CONGRESS ON INFORMATION SCIENCE AND TECHNOLOGY (IEEE CIST'20) | en |
| dc.relation.lastpage | 187 | en |
| dc.relation.medium | Electronic | en |
| dc.relation.numberofpages | 10 | en |
| dc.subject.keywordseng | classical Arabic dictionary | - |
| dc.subject.keywordseng | digital lexicography | - |
| dc.subject.keywordseng | al=qamus al=muHiyT | - |
| dc.subject.keywordseng | word patterns | - |
| dc.subject.keywordseng | TEI | - |
| dc.subject.singlekeyword | classical Arabic dictionary | * |
| dc.subject.singlekeyword | digital lexicography | * |
| dc.subject.singlekeyword | al=qamus al=muHiyT | * |
| dc.subject.singlekeyword | word patterns | * |
| dc.subject.singlekeyword | TEI | * |
| dc.title | Creating Arabic Lexical Resources in TEI; A Schema for Discontinuous Morphology Encoding | en |
| dc.type.circulation | International | en |
| dc.type.driver | info:eu-repo/semantics/conferenceObject | - |
| dc.type.full | 04 Contributo in convegno::04.01 Contributo in Atti di convegno | it |
| dc.type.impactfactor | yes | en |
| dc.type.miur | 273 | - |
| dc.type.referee | Scientific committee | en |
| dc.ugov.descaux1 | 439789 | - |
| iris.isi.extIssued | 2020 | - |
| iris.isi.extTitle | Creating Arabic Lexical Resources in TEI | - |
| iris.isi.metadataErrorDescription | 0 | - |
| iris.isi.metadataErrorType | ERROR_NO_MATCH | - |
| iris.isi.metadataStatus | ERROR | - |
| iris.mediafilter.data | 2025/04/05 13:30:58 | * |
| iris.orcid.lastModifiedDate | 2025/03/02 08:06:18 | * |
| iris.orcid.lastModifiedMillisecond | 1740899178895 | * |
| iris.scopus.extIssued | 2020 | - |
| iris.scopus.extTitle | Creating arabic lexical resources in TEI: A schema for discontinuous morphology encoding | - |
| iris.scopus.ideLinkStatusDate | 2025/01/07 14:10:51 | * |
| iris.scopus.ideLinkStatusMillisecond | 1736255451280 | * |
| iris.sitodocente.maxattempts | 1 | - |
| iris.unpaywall.doi | 10.1109/cist49399.2021.9357273 | * |
| iris.unpaywall.isoa | false | * |
| iris.unpaywall.journalisindoaj | false | * |
| iris.unpaywall.metadataCallLastModified | 26/04/2025 05:31:06 | - |
| iris.unpaywall.metadataCallLastModifiedMillisecond | 1745638266207 | - |
| iris.unpaywall.oastatus | closed | * |
| isi.category | NU | * |
| isi.category | EP | * |
| isi.category | ET | * |
| isi.contributor.affiliation | Consiglio Nazionale delle Ricerche (CNR) | - |
| isi.contributor.affiliation | Consiglio Nazionale delle Ricerche (CNR) | - |
| isi.contributor.country | Italy | - |
| isi.contributor.country | Italy | - |
| isi.contributor.name | Ouafae | - |
| isi.contributor.name | Angelo Mario | - |
| isi.contributor.researcherId | DXR-9191-2022 | - |
| isi.contributor.researcherId | P-7993-2018 | - |
| isi.contributor.subaffiliation | Inst Computat Linguist A Zampolli | - |
| isi.contributor.subaffiliation | Inst Computat Linguist A Zampolli | - |
| isi.contributor.surname | Nahli | - |
| isi.contributor.surname | Del Grosso | - |
| isi.date.issued | 2020 | * |
| isi.description.abstracteng | An Arabic word can be described according to its lexical and morphological information. Lexical analysis consists in gathering both semantic information (meaning and translation) and syntactic properties (parts of speech). Morphological analysis, instead, identifies word patterns that group the words having the same syntactic, inflectional and semantic behaviour. Such descriptions constitute two different but complementary levels of study. This paper illustrates our work, aimed at creating an exhaustive resource consisting of two levels: lexical and morphological. The lexical level collects information extracted from the dictionary al=qāmūs al=muḥīṭ. The morphological level describes the word patterns. The two levels are autonomous but complementary. Each word described at the lexical level is linked to its corresponding pattern. The formalization of the word pattern makes it possible to enrich word descriptions with additional morphosyntactic and inflectional information. To obtain a digital systematic resource, we followed the guidelines provided by the Text Encoding Initiative (TEI). We adopted the TEI module devoted to encoding digital dictionaries and lexicons in order to formally represent the medieval primary source al=qāmūs al=muḥīṭ. We also used the TEI interpretation approach to encode the morphological word patterns keeping the two levels separate but at the same time allowing them to be linked. | * |
| isi.description.allpeopleoriginal | Nahli, O; Del Grosso, AM; | * |
| isi.document.sourcetype | WOS.ISTP | * |
| isi.document.type | Proceedings Paper | * |
| isi.document.types | Proceedings Paper | * |
| isi.identifier.doi | 10.1109/CIST49399.2021.9357273 | * |
| isi.identifier.isi | WOS:000657322100031 | * |
| isi.journal.journaltitle | 2020 6TH IEEE CONGRESS ON INFORMATION SCIENCE AND TECHNOLOGY (IEEE CIST'20) | * |
| isi.journal.journaltitleabbrev | COLLOQ INF SCI TECH | * |
| isi.language.original | English | * |
| isi.publisher.place | 345 E 47TH ST, NEW YORK, NY 10017 USA | * |
| isi.relation.firstpage | 178 | * |
| isi.relation.lastpage | 187 | * |
| isi.title | Creating Arabic Lexical Resources in TEI | * |
| scopus.category | 1711 | * |
| scopus.category | 1706 | * |
| scopus.category | 1803 | * |
| scopus.category | 1802 | * |
| scopus.contributor.affiliation | ILC-CNR | - |
| scopus.contributor.affiliation | ILC-CNR | - |
| scopus.contributor.afid | 60021199 | - |
| scopus.contributor.afid | 60021199 | - |
| scopus.contributor.auid | 56741333300 | - |
| scopus.contributor.auid | 56319538600 | - |
| scopus.contributor.country | Italy | - |
| scopus.contributor.country | Italy | - |
| scopus.contributor.dptid | 122433597 | - |
| scopus.contributor.dptid | 122433597 | - |
| scopus.contributor.name | Ouafae | - |
| scopus.contributor.name | Angelo Mario | - |
| scopus.contributor.subaffiliation | Institute for Computational Linguistics A. Zampolli; | - |
| scopus.contributor.subaffiliation | Institute for Computational Linguistics A. Zampolli; | - |
| scopus.contributor.surname | Nahli | - |
| scopus.contributor.surname | Del Grosso | - |
| scopus.date.issued | 2020 | * |
| scopus.description.abstracteng | An Arabic word can be described according to its lexical and morphological information. Lexical analysis consists in gathering both semantic information (meaning and translation) and syntactic properties (parts of speech). Morphological analysis, instead, identifies word patterns that group the words having the same syntactic, inflectional and semantic behaviour. Such descriptions constitute two different but complementary levels of study. This paper illustrates our work, aimed at creating an exhaustive resource consisting of two levels: lexical and morphological. The lexical level collects information extracted from the dictionary al=qamus al=muhit. The morphological level describes the word patterns. The two levels are autonomous but complementary. Each word described at the lexical level is linked to its corresponding pattern. The formalization of the word pattern makes it possible to enrich word descriptions with additional morphosyntactic and inflectional information. To obtain a digital systematic resource, we followed the guidelines provided by the Text Encoding Initiative (TEI). We adopted the TEI module devoted to encoding digital dictionaries and lexicons in order to formally represent the medieval primary source al=qamus al=muhit. We also used the TEI interpretation approach to encode the morphological word patterns keeping the two levels separate but at the same time allowing them to be linked. | * |
| scopus.description.allpeopleoriginal | Nahli O.; Del Grosso A.M. | * |
| scopus.differences | scopus.publisher.name | * |
| scopus.differences | scopus.subject.keywords | * |
| scopus.differences | scopus.relation.conferencedate | * |
| scopus.differences | scopus.description.allpeopleoriginal | * |
| scopus.differences | scopus.description.abstracteng | * |
| scopus.differences | scopus.relation.conferencename | * |
| scopus.differences | scopus.identifier.isbn | * |
| scopus.differences | scopus.date.issued | * |
| scopus.differences | scopus.relation.conferenceplace | * |
| scopus.differences | scopus.title | * |
| scopus.differences | scopus.relation.volume | * |
| scopus.document.type | cp | * |
| scopus.document.types | cp | * |
| scopus.identifier.doi | 10.1109/CiSt49399.2021.9357273 | * |
| scopus.identifier.eissn | 2327-1884 | * |
| scopus.identifier.isbn | 9781728166469 | * |
| scopus.identifier.pui | 634718962 | * |
| scopus.identifier.scopus | 2-s2.0-85103857819 | * |
| scopus.journal.sourceid | 21100400809 | * |
| scopus.language.iso | eng | * |
| scopus.publisher.name | Institute of Electrical and Electronics Engineers Inc. | * |
| scopus.relation.article | 9357273 | * |
| scopus.relation.conferencedate | 2020 | * |
| scopus.relation.conferencename | 6th International IEEE Congress on Information Science and Technology, CiSt 2020 | * |
| scopus.relation.conferenceplace | mar | * |
| scopus.relation.firstpage | 178 | * |
| scopus.relation.lastpage | 187 | * |
| scopus.relation.volume | 2020- | * |
| scopus.subject.keywords | Al=qāmūs al=muḥīṭ; Classical Arabic dictionary; Digital lexicography; TEI; Word patterns; | * |
| scopus.title | Creating arabic lexical resources in TEI: A schema for discontinuous morphology encoding | * |
| scopus.titleeng | Creating arabic lexical resources in TEI: A schema for discontinuous morphology encoding | * |

Appears in types: 04.01 Contributo in Atti di convegno (contribution in conference proceedings)

| File | Access | Type | License | Size | Format |
|---|---|---|---|---|---|
| Creating_Arabic_Lexical_Resources_in_TEI_A_Schema_for_Discontinuous_Morphology_Encoding.pdf | Authorized users only | Publisher's version (PDF) | Not public (private/restricted access) | 1.26 MB | Adobe PDF |

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.


