Arabic orthographic conventions allow diacritics to be omitted, which introduces numerous cases of homography between inflected forms and a consequent proliferation of contextually spurious morphological analyses. A morphological analyzer that uses the orthographic, morpho-syntactic and semantic constraints operating at the lexical level can nevertheless drastically reduce the morphological ambiguity of written text, producing more efficient and accurate analyses.
The script-based and morphological characteristics of Arabic considerably increase the number of alternative analyses output by any morphological parser that does not use orthographic, syntactic and semantic constraints. To reduce the time-wasting and error-prone proliferation of multiple outputs that must be filtered in a post-processing phase, we have tried to optimize word processing by providing the morphological parser with multiple levels of information. We have operated at three such levels: orthography, morpho-syntax and semantics.
Improved Written Arabic Word Parsing through Orthographic, Syntactic and Semantic constraints
Ouafae Nahli; Simone Marchi
2015

| DC field | Value | Language |
|---|---|---|
| dc.authority.orgunit | Istituto di linguistica computazionale "Antonio Zampolli" - ILC | - |
| dc.authority.people | Ouafae Nahli | it |
| dc.authority.people | Simone Marchi | it |
| dc.collection.id.s | 71c7200a-7c5f-4e83-8d57-d3d2ba88f40d | * |
| dc.collection.name | 04.01 Contributo in Atti di convegno | * |
| dc.contributor.appartenenza | Istituto di linguistica computazionale "Antonio Zampolli" - ILC | * |
| dc.contributor.appartenenza.mi | 918 | * |
| dc.date.accessioned | 2024/02/19 23:41:17 | - |
| dc.date.available | 2024/02/19 23:41:17 | - |
| dc.date.issued | 2015 | - |
| dc.description.abstract | Arabic orthographic conventions allow diacritics to be omitted, which introduces numerous cases of homography between inflected forms and a consequent proliferation of contextually spurious morphological analyses. A morphological analyzer that uses the orthographic, morpho-syntactic and semantic constraints operating at the lexical level can nevertheless drastically reduce the morphological ambiguity of written text, producing more efficient and accurate analyses. | - |
| dc.description.abstractita | The script-based and morphological characteristics of the Arabic language increase considerably the number of alternative analyses output by any morphological parser that does not use orthographic, syntactic and semantic constraints. In order to reduce time-wasting and error-prone proliferation of multiple outputs to be filtered in a post-processing phase, we have tried to optimize word processing by providing the morphological parser with multiple levels of information. We have operated at three such levels: orthography, morpho-syntax and semantics. | - |
| dc.description.affiliations | Istituto di Linguistica Computazionale | - |
| dc.description.allpeople | Nahli, Ouafae; Marchi, Simone | - |
| dc.description.allpeopleoriginal | Ouafae Nahli, Simone Marchi | - |
| dc.description.fulltext | none | en |
| dc.description.numberofauthors | 2 | - |
| dc.identifier.isbn | 9788899200626 | - |
| dc.identifier.uri | https://hdl.handle.net/20.500.14243/300996 | - |
| dc.identifier.url | http://www.aaccademia.it/elenco-libri?aaref=CLIC_2015 | - |
| dc.language.iso | eng | - |
| dc.publisher.country | ITA | - |
| dc.publisher.name | Accademia University Press | - |
| dc.publisher.place | Torino | - |
| dc.relation.conferencedate | 3-4 December 2015 | - |
| dc.relation.conferencename | Second Italian Conference on Computational Linguistics CLiC-it 2015 | - |
| dc.relation.conferenceplace | Trento | - |
| dc.relation.firstpage | 210 | - |
| dc.relation.lastpage | 214 | - |
| dc.relation.numberofpages | 5 | - |
| dc.subject.keywords | Arabic Language | - |
| dc.subject.keywords | Arabic NLP | - |
| dc.subject.keywords | Orthography | - |
| dc.subject.keywords | Morpho-syntax | - |
| dc.subject.keywords | Semantics | - |
| dc.subject.singlekeyword | Arabic Language | * |
| dc.subject.singlekeyword | Arabic NLP | * |
| dc.subject.singlekeyword | Orthography | * |
| dc.subject.singlekeyword | Morpho-syntax | * |
| dc.subject.singlekeyword | Semantics | * |
| dc.title | Improved Written Arabic Word Parsing through Orthographic, Syntactic and Semantic constraints | en |
| dc.type.driver | info:eu-repo/semantics/conferenceObject | - |
| dc.type.full | 04 Contributo in convegno::04.01 Contributo in Atti di convegno | it |
| dc.type.miur | 273 | - |
| dc.type.referee | Yes, but type not specified | - |
| dc.ugov.descaux1 | 342436 | - |
| iris.orcid.lastModifiedDate | 2024/04/04 15:34:09 | * |
| iris.orcid.lastModifiedMillisecond | 1712237649500 | * |
| iris.sitodocente.maxattempts | 1 | - |
| Appears in types: | 04.01 Contributo in Atti di convegno | |


