CNR Institutional Research Information System

Risorse monolingui e multilingui. Corpus bilingue italiano-arabo [Monolingual and multilingual resources. Italian-Arabic bilingual corpus]

Picchi E; Sassolini E; Nahli O; Cucurullo S

2003

Abstract

The project has two objectives: developing software procedures for the Arabic language, and creating linguistic resources for managing large Arabic corpora. The resources are essentially the following: a) a morphological engine for Arabic, made up of several modules: the algorithms and modules for generation and analysis, an encoding system suited to representing lexical data and the morphological characteristics of Arabic, and the so-called lemmario, i.e. the archive of lemmas; b) automatic alignment of parallel Italian and Arabic texts; c) automatic tagging of Arabic texts, performed with the morphological engine described in a); d) systems for accessing and querying (raw and/or tagged) Arabic texts and parallel Italian-Arabic corpora.

Full record (DC)

DC field	Value	Language
dc.authority.orgunit	Istituto di linguistica computazionale "Antonio Zampolli" - ILC	-
dc.authority.people	Picchi E	it
dc.authority.people	Sassolini E	it
dc.authority.people	Nahli O	it
dc.authority.people	Cucurullo S	it
dc.collection.id.s	b3f88f24-048a-4e43-8ab1-6697b90e068e	*
dc.collection.name	01.01 Articolo in rivista	*
dc.contributor.appartenenza	Istituto di linguistica computazionale "Antonio Zampolli" - ILC	*
dc.contributor.appartenenza.mi	918	*
dc.date.accessioned	2024/02/20 18:19:47	-
dc.date.available	2024/02/20 18:19:47	-
dc.date.issued	2003	-
dc.description.affiliations	Istituto di Linguistica Computazionale "A. Zampolli", Consiglio Nazionale delle Ricerche	-
dc.description.allpeople	Picchi, E; Sassolini, E; Nahli, O; Cucurullo, S	-
dc.description.allpeopleoriginal	Picchi E. , Sassolini E. , Nahli O. , Cucurullo S.	-
dc.description.fulltext	none	en
dc.description.note	The line of research described in the article, which led to the development of a whole set of tools and resources for processing the Arabic language, has opened new lines of collaboration in several directions. Internationally, it has allowed ILC to collaborate with the NEMLAR project (Network for Euro-Mediterranean Language Resource and human language technology development and support), sponsored by the European Community. In Italy, contacts are under way with companies in the technology sector to collaborate on building tools for the automatic analysis of large quantities of Arabic-language documents (text mining procedures). In collaboration with the Università della Calabria, a FIRB project was prepared and submitted to MIUR for the automatic extraction of bilingual Italian-Arabic glossaries from bilingual textual material.	-
dc.description.numberofauthors	4	-
dc.identifier.uri	https://hdl.handle.net/20.500.14243/433717	-
dc.relation.firstpage	629	-
dc.relation.lastpage	678	-
dc.relation.volume	18-19	-
dc.subject.keywords	Arabic morphology	-
dc.subject.keywords	Bilingual corpora	-
dc.subject.keywords	Text analysis	-
dc.subject.keywords	Aligner	-
dc.subject.keywords	Tagger	-
dc.subject.singlekeyword	Arabic morphology	*
dc.subject.singlekeyword	Bilingual corpora	*
dc.subject.singlekeyword	Text analysis	*
dc.subject.singlekeyword	Aligner	*
dc.subject.singlekeyword	Tagger	*
dc.title	Risorse monolingui e multilingui. Corpus bilingue italiano-arabo	en
dc.type.driver	info:eu-repo/semantics/article	-
dc.type.full	01 Contributo su Rivista::01.01 Articolo in rivista	it
dc.type.miur	262	-
dc.type.referee	Yes, but type not specified	-
dc.ugov.descaux1	64493	-
iris.orcid.lastModifiedDate	2024/04/04 15:52:45	*
iris.orcid.lastModifiedMillisecond	1712238765447	*
iris.sitodocente.maxattempts	1	-
Appears in types:	01.01 Articolo in rivista

Files in this item:

There are no files associated with this item.

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14243/433717
