Risorse monolingui e multilingui. Corpus bilingue italiano-arabo [Monolingual and multilingual resources. Italian-Arabic bilingual corpus]
Picchi E; Sassolini E; Nahli O
2003
Abstract
The project has two objectives: to develop software procedures for processing the Arabic language, and to create linguistic resources for managing large Arabic corpora. The main linguistic resources are: a) a morphological engine for Arabic, made up of several modules: the algorithms for generation and analysis, an encoding system suited to representing lexical data and the morphological characteristics of Arabic, and the so-called lemmario, i.e. the archive of lemmas; b) the automatic alignment of parallel Italian and Arabic texts; c) the automatic tagging of Arabic texts, performed with the morphological engine described above; d) systems for accessing and querying (raw and/or tagged) Arabic texts and parallel Italian-Arabic corpora.